BNY
Vice President, Site Reliability Engineer
At BNY, our culture allows us to run our company better and enables employees’ growth and success. As a leading global financial services company at the heart of the global financial system, we influence nearly 20% of the world’s investible assets. Every day, our teams harness cutting‑edge AI and breakthrough technologies to collaborate with clients, driving transformative solutions that redefine industries and uplift communities worldwide.
Job Description
We’re seeking a future team member for the role of SRE / Site Reliability Engineer to join our Technology team. This role is located in Jersey City, NJ.
In this role, you’ll make an impact in the following ways:
Drive reliability and performance by defining SLOs/SLIs, improving observability, and proactively identifying and addressing system bottlenecks across cloud environments.
Automate infrastructure and operations using Terraform, Kubernetes, and CI/CD tools to eliminate toil and enable scalable, fault‑tolerant deployments.
Collaborate cross‑functionally with product, infrastructure, and DevOps teams to reduce incidents, build resilient services, and ensure architectural clarity.
Lead incident management by participating in on‑call rotations, conducting post‑mortems, and implementing automated recovery to minimize downtime.
Build and maintain monitoring systems with tools like Prometheus, Grafana, AppDynamics, and Splunk to support real‑time alerting and root‑cause analysis.
Develop platform tooling and pipelines for container orchestration, third‑party integrations, and cloud‑native operations to improve system efficiency and reliability.
Maintain and improve live services by measuring and monitoring latency and overall system health, working closely with tech support and operations teams.
Define and use KPIs to understand service performance and identify corrective actions.
Create, manage, and use dashboards for continuous monitoring and health checks of applications and underlying infrastructure.
Design and implement solutions to customer friction points and improve the entire lifecycle of services from inception through sustainment.
Assist in creating and maintaining automation to improve reliability and velocity in addressing issues during regular maintenance tasks.
Mentor engineers and champion SRE best practices, embedding a reliability‑first culture and ensuring technical excellence across engineering teams.
Qualifications
Bachelor’s degree in computer science or a related discipline, or equivalent work experience required; advanced degree preferred.
5‑8 years of related experience; experience in the securities or financial services industry is a plus.
Strong expertise in cloud infrastructure (Azure, AWS, or GCP), containerization (Docker, Kubernetes), and Infrastructure as Code (Terraform, Helm).
Proficiency in observability and monitoring tools such as Prometheus, Grafana, AppDynamics, Datadog, and Splunk, plus experience with incident response and on‑call support.
Solid programming and scripting skills in languages like Python, Go, or Java, with a focus on automation, tooling, and system integration.
Deep understanding of SRE principles, including SLAs, SLOs, error budgets, post‑mortems, and reliability‑focused system design.
Familiarity with automated testing, DevSecOps practices, CI/CD methods, performance engineering, and security controls.
Strong collaboration and communication skills, with experience working in Agile environments and partnering with cross‑functional engineering, product, and operations teams.
Demonstrated success in technical engineering, with coding experience beyond simple scripts.
Benefits & Rewards
BNY offers highly competitive compensation, benefits, and wellbeing programs rooted in a strong culture of excellence and our pay‑for‑performance philosophy. We provide access to flexible global resources and tools for your life’s journey, along with generous paid leaves, including paid volunteer time.
BNY is an Equal Employment Opportunity / Aff... Underrepresented racial and ethnic groups / Females / Individuals with Disabilities / Protected Veterans. BNY assesses market data to ensure a competitive compensation package for employees. This position is at‑will and the Company reserves the right to modify base salary or any other compensation at any time.