Capgemini
Metrics Platform Site Reliability Engineer
Job Location – Atlanta, GA
Responsibilities
Manage and mentor a team of Site Reliability Engineers
Define and implement SRE strategies and best practices in alignment with organizational objectives
Monitor clients' service level agreements (SLAs), service level objectives (SLOs) and service level indicators (SLIs)
Lead initiatives to improve system reliability, availability, scalability and performance
Collaborate with development and operations teams to ensure reliability and resiliency goals are met
Implement and improve incident management processes to minimize downtime and ensure timely resolutions
Review and contribute to the architecture of critical systems ensuring they meet reliability and performance goals
Drive observability practices by implementing robust monitoring, logging and alerting systems
Implement and maintain monitoring systems to proactively identify potential issues and alert engineers to problems before they impact users
Respond to incidents and outages, diagnose problems and implement solutions to minimize downtime and restore service
Automate repetitive tasks and processes to improve efficiency and reduce manual effort
Identify and address performance bottlenecks to ensure systems run efficiently and effectively
Manage and maintain the underlying infrastructure including servers, networks and cloud resources
Plan for future capacity needs to ensure systems can handle anticipated workloads
Develop and maintain processes for deploying software updates and releases
Work closely with developers, operations teams and other stakeholders to ensure system reliability and availability
Maintain clear and concise documentation of systems, processes and procedures
Identify areas for improvement and implement changes to enhance system reliability and performance
Skills Required
Proficiency in writing Splunk queries and alerts is a must
Hands‑on experience with at least one APM tool (New Relic, AppDynamics, Honeycomb or Datadog) is a must
Expertise in automation tools and proficiency in a scripting language (Python or JavaScript/Node.js) is a must
Proficiency in at least one cloud platform (AWS, GCP or Azure) is a must
Strong understanding of distributed systems, microservices architecture and container orchestration tools (e.g., Kubernetes)
Experience with monitoring tools such as Prometheus and Grafana is a must
Benefits
Flexible work
Healthcare including dental, vision, mental health, and well‑being programs
Financial well‑being programs such as 401(k) and Employee Share Ownership Plan
Paid time off and paid holidays
Paid parental leave
Family building benefits like adoption assistance, surrogacy, and cryopreservation
Social well‑being benefits like subsidized back‑up child/elder care and tutoring
Mentoring, coaching and learning programs
Employee Resource Groups
Disaster Relief
Disclaimer: Capgemini is an Equal Opportunity Employer encouraging diversity in the workplace. All qualified applicants will receive consideration for employment without regard to race, national origin, gender identity/expression, age, religion, disability, sexual orientation, genetics, veteran status, marital status or any other characteristic protected by law.
This is a general description of the Duties, Responsibilities and Qualifications required for this position. Physical, mental, sensory or environmental demands may be referenced in an attempt to communicate the manner in which this position traditionally is performed. Whenever necessary to provide individuals with disabilities an equal employment opportunity, Capgemini will consider reasonable accommodations that might involve varying job requirements and/or changing the way this job is performed, provided that such accommodations do not pose an undue hardship.
Capgemini is committed to providing reasonable accommodations during our recruitment process. If you need assistance or accommodation, please reach out to your recruiting contact.
For more information on your rights as an applicant, see: http://www.capgemini.com/resources/equal-employment-opportunity-is-the-law
Salary Transparency: Capgemini discloses salary range information in compliance with state and local pay transparency obligations. The disclosed range represents the lowest to highest salary we, in good faith, believe we would pay for this role at the time of this posting, although we may ultimately pay more or less than the disclosed range. The base salary range for the tagged location is 100000 - 130000 / year. This role may be eligible for other compensation including variable compensation, bonus, or commission. Full‑time regular employees are eligible for paid time off, medical/dental/vision insurance, 401(k), and any other benefits offered to eligible employees.
Seniority Level: Mid‑Senior level
Employment Type: Full‑time
Job Function: Engineering and Information Technology
Industries: IT Services and IT Consulting