LPL Financial

Senior Observability Engineer

LPL Financial, San Diego, California, United States, 92189

Job Overview

LPL is seeking a Senior Observability Engineer to enhance system resilience and visibility across our enterprise platforms. This role will focus on designing and implementing scalable observability solutions that support rapid incident response, performance optimization, and continuous improvement. You will collaborate with engineering teams to standardize monitoring practices and drive innovation in observability tooling and strategy.

Responsibilities

Observability Architecture & Implementation

Design, implement, and maintain observability solutions using AWS CloudWatch, Dynatrace, ELK, SolarWinds, and other monitoring tools.

Integrate OpenTelemetry for distributed tracing and improve end-to-end system observability.

Implement Monitoring as Code using infrastructure-as-code tools such as Terraform and CloudFormation.

Collaboration & Best Practices

Partner with SREs, DevOps, and Software Engineers to define and enforce observability standards.

Develop and standardize practices for monitoring, logging, and alerting across platforms.

Performance Optimization & Incident Response

Optimize performance monitoring, anomaly detection, and automated incident response strategies.

Drive observability-related incident investigations, root cause analysis, and post‑mortem processes.

Assist in major incidents and participate in the on‑call rotation for tool support.

Innovation & Continuous Improvement

Continuously evaluate and introduce new observability tools and methodologies.

Create dashboards, alerts, and reports to provide actionable insights into system performance and availability.

What are we looking for?

We want strong collaborators who can deliver a world‑class client experience. We are looking for people who thrive in a fast‑paced environment, are client‑focused and team‑oriented, and are able to execute in a way that encourages creativity and continuous improvement.

Requirements

7+ years of experience in observability, monitoring, or site reliability engineering.

Advanced troubleshooting and monitoring expertise using Dynatrace and related APM tools (Dynatrace certification preferred).

Extensive experience with metrics and logging tools such as SolarWinds, ELK, and Kibana.

Proficiency in scripting and automation using Python, Bash, or PowerShell.

Experience with Monitoring as Code (MaC) using Terraform, CloudFormation, or Ansible.

Strong knowledge of Kubernetes, Docker, and microservices architectures.

Familiarity with CI/CD pipelines and DevOps practices.

Knowledge of AIOps and predictive monitoring techniques.

Cross‑platform experience in Windows Server, Linux/AIX, Networking, Virtualization, Database (MSSQL/Oracle), Cloud Computing (AWS/Azure), and storage platforms (IBM/EMC/INFINIDAT).

Experience with middleware service layers (F5, TIBCO, DataPower, MuleSoft), caching technologies, database technologies (MSSQL, Oracle, MySQL, Aurora RDM, MapR), authentication (PingFederate, ForgeRock), and RPA tools (WorkFusion).

Core Competencies

Thrive in a team‑oriented environment with strong technical peers and leadership.

Demonstrate curiosity, initiative, and a willingness to ask questions and support others.

#LI-Hybrid

Pay Range

$92,288–$153,813 per year

Actual base salary varies based on factors including, but not limited to, relevant skill, prior experience, education, base salary of internal peers, demonstrated performance, and geographic location. The LPL Total Rewards package is highly competitive and designed to support your success at work, at home, and at play, with benefits such as 401(k) matching, health benefits, employee stock options, paid time off, volunteer time off, and more.

Principals only. EOE.
