Universal Music Group
We are UMG, the Universal Music Group. We are the world's leading music company. In everything we do, we are committed to artistry, innovation and entrepreneurship. We own and operate a broad array of businesses engaged in recorded music, music publishing, merchandising, and audiovisual content in more than 60 countries. We identify and develop recording artists and songwriters, and we produce, distribute and promote the most critically acclaimed and commercially successful music to delight and entertain fans around the world.
How You'll LEAD:
As a Senior Observability Engineer within UMG's IT Technology Services team, you will drive the reliability, performance, and stability of our global technology ecosystem. You'll own the design and evolution of our observability platform, ensuring visibility across systems, applications, and services. This role is both hands‑on and strategic, ideal for an engineer passionate about building scalable, automated, and data‑driven monitoring solutions that empower teams to deliver high‑performing, resilient systems. You'll partner with DevOps, Infrastructure, and Application teams to lead observability best practices and shape a culture of proactive system insight across UMG.
How you'll CREATE:
Observability Architecture & Implementation
Design, implement, and maintain end‑to‑end observability solutions across infrastructure, applications, and services.
Select, configure, and integrate industry‑leading monitoring and telemetry tools (e.g., Prometheus, Grafana, ELK, Dynatrace, Datadog).
Develop automation and integrations to streamline metrics, logging, and tracing pipelines.
Monitoring & Incident Response
Establish effective alerting frameworks and SLO/SLA‑driven dashboards for real‑time visibility.
Partner with incident response and SRE teams to diagnose, remediate, and prevent production issues.
Conduct root cause analysis and proactively identify performance bottlenecks and capacity needs.
Collaboration & Leadership
Partner with development, security, and operations teams to embed observability into system design.
Lead cross‑functional initiatives to standardize monitoring practices and enhance operational maturity.
Mentor peers and provide training on observability tools and best practices.
Continuous Improvement
Evaluate emerging technologies to evolve UMG's observability strategy.
Drive automation and process improvements to strengthen system performance, resiliency, and insight quality.
Integrate observability with security monitoring and compliance workflows.
Data Analysis & Reporting
Analyze metrics, logs, and traces to surface insights into system behavior and performance trends.
Deliver reports and visualizations tailored for both technical and business stakeholders.
Bring your VIBE:
Required Skills & Experience
7+ years of professional experience in information technology, including 3+ years specializing in observability, monitoring, or site reliability engineering (SRE).
Deep knowledge of monitoring toolsets such as Prometheus, Grafana, ELK, Splunk, Dynatrace, Datadog, or equivalent.
Proficiency in Python, Go, or Java for automation and tool development.
Hands‑on experience with Kubernetes, Docker, and cloud platforms (AWS, GCP, or Azure).
Strong understanding of networking, infrastructure, and performance optimization.
Familiarity with configuration management tools (Ansible, Chef, Puppet) and CI/CD integration.
Proven track record designing and delivering dashboards, alerts, and performance reports for multiple audiences.
Excellent communication skills, with the ability to translate technical insights into actionable recommendations.
Preferred Certifications (Highly Desirable)
Prometheus Certified Admin
Kubernetes Administrator or Application Developer
Grafana Certified Observer
Dynatrace Associate
Splunk Core Certified Power User/Admin
Elastic Certified Engineer
DevOps Engineer Certification (AWS and/or Google)
Perks Playlist:
Be part of an entrepreneurial, global organization that values authenticity, drive, creativity, relationships, and a competitive spirit
100% coverage for out‑patient mental health services
Wellbeing reimbursements for fitness classes, spa treatments, meal services, travel, and so much more (up to $720/year)
A lifetime fertility support allowance of $30,000 to plan participants
Student Loan Repayment Assistance and Tuition Reimbursement
100% immediately vested 401(k) match on the first 5% of your contribution on eligible compensation
Variety of ways to prioritize much‑needed time away from work, including:
Flexible Paid Time Off (PTO) for exempt employees
3 weeks of PTO for non‑exempt employees
2 weeks of paid Winter Break
10 Company Holidays (including Juneteenth and Wellbeing Day)
Summer Fridays (between Memorial Day and Labor Day)
Generous paid parental leave for every type of parent
Check out our full overview of benefits on the Perks Playlist page of the career site.
Disclaimer: This job description only provides an overview of job responsibilities that are subject to change.
Universal Music Group is an Equal Opportunity Employer.
We are an E‑Verify employer in Alabama, Arizona, Georgia, Mississippi, North Carolina, South Carolina, Tennessee, and Utah.
Job Category: Technology
Salary Range: $147,385 - $185,235
The actual base salary offered depends on a variety of factors, which may include, as applicable, the qualifications of the individual applicant for the position, years of relevant experience, specific and unique skills, level of education attained, certifications or other professional licenses held, and the location in which the applicant lives and/or from which they will be performing the job.
All candidates are encouraged to apply.