Jewelers Mutual Group

DevOps Engineer Sr

Jewelers Mutual Group, Raleigh, North Carolina, United States, 27601

Summary

As the Senior DevOps Engineer, you will take ownership of optimizing and scaling our serverless platform's infrastructure, CI/CD pipelines, and monitoring capabilities. This is a technical leadership role where you will drive operational excellence, enhance developer productivity, and ensure the resilience, scalability, and security of our microservices platform in a cloud-native environment, particularly AWS. You will collaborate with engineering teams to design, implement, and maintain solutions that support multiple business units, fostering a DevOps-first culture while ensuring seamless integration across our products.

WHY Jewelers Mutual

Jewelers Mutual is the leading specialty insurance company dedicated to serving the jewelry industry. We've been protecting jewelry and jewelry businesses for over 100 years, and today, we're trusted by thousands of jewelers, manufacturers, and wholesalers, as well as millions of customers, to protect their most valuable jewelry assets.

What You'll Do

Design and Scale CI/CD Pipelines

Design, implement, and refine CI/CD pipelines to enable automated testing and seamless continuous delivery across multiple environments, leveraging tools such as GitHub Actions, GitHub Advanced Security, or equivalent solutions that align with a "Dev, Test, Prod" workflow.
Automate and enhance CI/CD pipelines to support microservices deployment with minimal downtime, ensuring seamless integration, rollback capabilities, and rapid iteration cycles.

Infrastructure as Code and Cloud Management

Architect and maintain AWS-based infrastructure using Terraform to ensure security, scalability, and reliability across compute, networking, and data services.
Manage core AWS services such as Lambda, API Gateway, Step Functions, VPC, S3, and Aurora Serverless (Postgres).
Design and implement a robust event-driven communication architecture that enables seamless, scalable, and decoupled interactions across services.
Leverage AWS services such as EventBridge, CloudWatch, SNS, and SQS to orchestrate real-time event processing, ensuring high availability, fault tolerance, and responsiveness across the platform.

Observability and Monitoring

Design and implement a comprehensive observability strategy to ensure real-time visibility into system performance, reliability, and security.
Leverage tools like Datadog, AWS CloudTrail, and AWS X-Ray to monitor latency, trace requests across microservices, and detect anomalies before they impact customers.
Implement proactive monitoring and alerting to track key performance metrics, error rates, and system health, enabling rapid incident detection and response.
Set up real-time dashboards that provide actionable insights into infrastructure, application performance, and event-driven workflows.
Establish telemetry and distributed tracing to improve debugging, root cause analysis, and overall system resilience in a highly dynamic, serverless environment.

Resiliency and Security

Design and implement resiliency testing strategies, including chaos engineering practices, to validate fault tolerance and high availability.
Lead game days using tools like Gremlin, Chaos Monkey, or AWS Fault Injection Simulator to simulate failures and improve incident response preparedness.
Ensure robust security practices across AWS resources, including IAM role-based access control, least privilege enforcement, secrets management, and automated compliance checks.
Implement best practices for API security, encryption, and AWS-native security services.

Developer Experience and Automation

Collaborate with development teams to optimize workflows, automate repetitive tasks, and implement best practices for cloud management, automation, and continuous integration.
Promote a DevOps-first culture across teams, mentoring others on cloud management, observability, and automation practices.

What We're Looking For

5+ years of experience in DevOps, Site Reliability Engineering, or a related role, with a focus on cloud-native environments and microservices.
Deep AWS knowledge, including experience building infrastructure from the ground up using the AWS Well-Architected Framework.
Hands-on security expertise, including IAM management, API security, and cloud-native security practices.
Proven expertise in CI/CD pipeline management, automation tools, and modern deployment strategies in cloud-native environments.
Strong experience with Infrastructure as Code (e.g., Terraform) to automate the provisioning and management of cloud resources.
A deep background in observability tools for monitoring, alerting, and logging. We intend to utilize a full suite of Datadog tooling, but the entire AWS ecosystem will be at your disposal.

What We Offer

Great Place to Work® Certified: Join a team recognized for an environment of innovation and growth.
Collaborative & Inclusive Culture: Work alongside smart, passionate peers who value ownership and continuous learning.
Modern Work Environment: Enjoy a state-of-the-art office in Raleigh's North Hills, combined with a hybrid work model that balances teamwork and flexibility.
Competitive Compensation & Benefits: Comprehensive healthcare, 6% 401k matching, generous PTO (including floating holidays), and One Pass for fitness subscriptions.
Community & Giving: Benefit from 50% charitable gift matching and paid volunteer time to support nonprofit causes.

Equal Opportunity Employer

This employer is required to notify all applicants of their rights pursuant to federal employment laws. For further information, please review the Know Your Rights notice from the Department of Labor.