Fidelity Investments
Director, Site Reliability Engineering
Fidelity Investments, Roanoke, Texas, United States, 76299
Overview
Our Site Reliability Engineering (SRE) group within Enterprise Infrastructure blends operational excellence with developer experience to deliver highly available, scalable, and resilient services through automation and infrastructure as code. We embed reliability into our ecosystem by applying best practices in resiliency engineering, automation, observability, and chaos testing.
Responsibilities
Lead a high-performing team of engineers focused on building foundational platforms and tools that power our reliability strategy.
Bring a systems-thinking mindset and a passion for automation to help scale our infrastructure and improve the developer experience across the enterprise.
Play a key role in people development, performance management, and fostering a culture of collaboration, innovation, and continuous improvement.
Define and execute a comprehensive reliability and observability strategy to ensure systems are always available when customers need them.
Reduce operational toil and increase efficiency through automation and platform engineering.
Drive standardization and process refinement across the SRE organization.
Lead incident response and root cause analysis for complex production issues.
Coach and mentor SREs and development teams on building and operating highly available systems.
Foster a culture of ownership, accountability, and continuous learning within the team.
Collaborate with engineering and product leadership to align team goals with business priorities.
Qualifications
Bachelor’s degree or higher in Computer Science, Engineering, or a related field; Master’s degree is a plus.
10+ years of experience deploying and supporting highly distributed, multi-tiered systems at scale.
3+ years of experience in a technical leadership or people management role, with a proven ability to lead and grow engineering teams.
Deep hands-on experience with public cloud platforms (preferably AWS and Azure); certifications are a plus.
Strong background in container orchestration (Kubernetes) and cloud-native architectures.
Proven experience in leading complex technical initiatives using Agile methodologies.
Proficiency in scripting and automation (Python, Shell, etc.).
Experience with infrastructure as code tools (Terraform, ARM, Chef, etc.).
Strong understanding of cloud infrastructure components (compute, storage, networking, security).
Expertise in CI/CD pipelines and DevOps practices.
Solid programming experience in compiled/OOP languages (Java, C#) and scripting languages (Python, JavaScript/TypeScript).
Deep knowledge of observability tools and practices (Datadog, Prometheus, Splunk, etc.).
Experience with instrumentation, monitoring, logging, and alerting for distributed systems.
Strong analytical and troubleshooting skills, especially under pressure.
Ability to interpret large datasets using query languages and visualization tools.
Excellent communication skills, with the ability to engage both technical and non-technical audiences.
Demonstrated ability to mentor, coach, and develop engineers, fostering a high-trust, high-performance team culture.
Experience with performance reviews, career development planning, and team capacity management.
Category: Information Technology
Most roles at Fidelity are Hybrid, requiring associates to work onsite every other week (all business days, M-F) in a Fidelity office. This does not apply to Remote or fully Onsite roles.
Please be advised that Fidelity’s business is governed by various laws and regulations, which may restrict Fidelity from hiring and/or associating with individuals with certain criminal histories.
Seniority level: Director
Employment type: Full-time
Job function: Engineering and Information Technology