Compunnel, Inc.
We are seeking a highly skilled Senior SRE/Observability Engineer with a deep understanding of SRE practices, observability, and extensive experience in creating and managing Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
This role will drive the adoption of SRE and Observability best practices across the enterprise, utilizing observability services to deliver tangible business value.
Key Responsibilities
Design, develop, and manage observability solutions, including metric identification, validation, and centralization in GEM and Prometheus, and visualization in Grafana dashboards
Write and manage complex queries and alert definitions
Act as a bridge between Operations Support teams and SRE operations
Configure and manage monitoring, alerts, and observability using tools like GEM, Splunk, Netcool, ELK, and AIM
Maintain technical expertise and operational experience with tools such as AppDynamics, GEM, AIM/ELK, Splunk, Prometheus, and Grafana
Write code (Java, Python, Ansible, etc.), author configuration files, and develop complex queries
Establish design patterns for monitoring and benchmarking SLOs
Provide thought leadership and strategy in implementing and maintaining observability solutions
Create and maintain operational process documentation for observability solutions
Optimize the Observability Suite for monitoring applications
Required Qualifications
5+ years of experience with Grafana or equivalent observability tools
2+ years of experience with Python, Java, or Ansible
3+ years of experience with AppDynamics, GEM, AIM/ELK, Splunk, and Prometheus
Preferred Qualifications
Experience in creating and managing SLOs and SLIs
Strong background in implementing and optimizing observability solutions across large environments
Expertise in creating and maintaining operational process documentation
Ability to bridge gaps between development, operations, and SRE teams
Experience with additional observability and monitoring tools (e.g., Datadog, New Relic)