eTeam

Senior Platform Engineer / Site Reliability Engineer - Onsite

eTeam, Charlotte, North Carolina, United States, 28245

Job Description: We're looking for a Senior Platform Engineer/Site Reliability Engineer to complement the automation and support work of the container orchestration team (COP). This job requires expertise running Kubernetes in a production environment, as well as strong IaC, GitOps, debugging, monitoring, and alerting skills. As a Senior Platform Engineer/Site Reliability Engineer, you will be an integral member of the COP team, whose goal is to provide a platform that facilitates deployment and runtime management of client developer workloads.

A key part of success in this role is outreach, integration, and training with the client SRE team, which is separate from the COP team. This requires good organizational and people skills.

This role works closely with the Sr Manager of Platform Engineering, the Platform Tech Lead, and the Platform Architect to ensure sound technical direction. The Senior Platform Engineer will also work with the Technical Product Owner on their Agile Platform Engineering team to deliver the most critical platform features needed by the Product Development Teams. The Senior Platform Engineer is expected to provide guidance on DevOps practices and tools and on Azure cloud resources. In addition, this position will help mentor teammates on the COP and SRE teams and serve as an escalation point for Platform Engineers I and II.

Job Responsibilities:
• Embrace the client culture and be a positive force for the team
• Apply a proactive approach toward tasks and communication
• Work independently and engage fully with team Agile practices
• Maintain an automation-first mindset (IaC, scripting, etc.)
• Conduct outreach, integration, and cooperation with the SRE team
• Drive forensic troubleshooting and analysis of automation and platform consistency issues
• Proactively ensure the highest levels of system, resource, and infrastructure availability and performance optimization
• Be a go-to person for subject matter expertise on all aspects of our technology stack:
  - Infrastructure automation via Terraform
  - Azure DevOps Pipelines
  - Kubernetes / AKS / EKS
  - CI/CD patterns
  - ArgoCD

This position requires participation in an on-call rotation.

Experience, Qualifications and Skills:
• Minimum 8 years of experience in Platform Engineering, Software Engineering, Site Reliability Engineering, Systems Engineering, Cloud-Native Engineering, or DevOps Engineering
• Expertise with containers and Kubernetes
• Experience as an active participant on an Agile team using Agile development methodologies
• Strong scripting experience
• Expertise in writing Infrastructure as Code and applying Configuration as Code techniques
• Expertise in managing multiple code bases in Git, including trunk-based and environment-based branching patterns
• Expertise in creating Continuous Integration builds and deployment automation (for example, CI/CD pipelines)
• Expertise in implementing observability, application monitoring, and log aggregation solutions
• Expertise working with cross-functional teams to provide next-generation cloud-native solutions