Ccgmag

Senior Lead Software Engineer - Reliability Engineering & SRE

Ccgmag, New York, New York, US, 10261

Be an integral part of an agile team that's constantly pushing the envelope to enhance, build, and deliver top‑notch technology products.

As a Senior Lead Software Engineer at JPMorgan Chase within Commercial & Investment Banking - Payments Technology, you play a crucial role in an agile team dedicated to enhancing, building, and delivering trusted, market‑leading technology products in a secure, stable, and scalable manner. This hands‑on technical leadership position emphasizes engineering excellence, automation, and innovation. You will lead a global team of engineers, working closely with development, infrastructure, and business partners to deliver resilient, observable, and high‑performing systems.

Job responsibilities

Architect and implement solutions for high availability, fault tolerance, and performance of critical applications across various technologies

Lead technical reviews and drive automation for monitoring, alerting, and operational workflows

Design and advance observability frameworks to improve application health insights and incident response

Mentor and coach engineers on reliability and operational excellence

Collaborate with development, DevOps, and business partners to align reliability initiatives with strategic goals

Engineer robust release validation and production readiness processes for seamless deployments

Own rapid production incident response and root cause analysis, driving permanent engineering solutions

Serve as a function‑wide subject matter expert in one or more areas of focus

Actively contribute to the engineering community as an advocate of firmwide frameworks, tools, and practices of the Software Development Life Cycle

Influence peers and project decision‑makers to consider the use and application of leading‑edge technologies

Required qualifications, capabilities, and skills

Formal training or certification in software engineering concepts and 5+ years of applied experience

Extensive experience in application development, reliability engineering, or SRE, including substantial time in technical leadership roles

Hands‑on practical experience delivering system design, application development, testing, and operational stability

Advanced software engineering expertise (Java, Python, or similar), with a proven track record of building reliable, scalable systems

Ability to tackle design and functionality problems independently with little to no oversight

Practical cloud native experience

Deep knowledge of observability, monitoring, and logging platforms (Grafana, Prometheus, Splunk, Datadog, AppDynamics, ELK, etc.)

Hands‑on experience with automation, scripting, and infrastructure‑as‑code (Python, Java, Shell, Ansible, etc.)

Strong understanding of incident management, root cause analysis, and reliability frameworks

Experience leading distributed engineering teams in a high‑availability environment and managing stakeholders
