Flanksource Inc.

Site Reliability Engineer

Flanksource Inc., Mission, Kansas, United States

- Design and maintain Kubernetes clusters across multiple environments (development, staging, production)
- Build automation for cluster deployment, configuration, and management
- Monitor and troubleshoot clusters to ensure high availability and optimal performance
- Implement security best practices for Kubernetes and underlying infrastructure
- Participate in incident response and work to reduce Mean Time To Recovery (MTTR)
- Enhance the reliability and scalability of our Kubernetes infrastructure
- Manage CI/CD pipelines and DevOps tooling
- Collaborate with development teams on deployment strategies and best practices

Requirements

Infrastructure as Code

- Experience with 2+ IaC tools (Terraform, Pulumi, etc.)

Monitoring & Observability

- Proficiency with Prometheus, Grafana, and related tools

Cloud Platforms

- Hands-on experience with AWS, Azure, or GCP

CI/CD

- Knowledge of GitHub Actions, GitLab CI, or Azure DevOps

Networking & Security

- Understanding of network fundamentals and security best practices

Problem-solving

- Strong analytical and troubleshooting abilities

Communication

- Fluent English for remote asynchronous work

Self-motivated

- Ability to work independently with an agile approach

Nice-to-haves

- Experience with GitOps tools (Flux, ArgoCD)
- Go programming knowledge or willingness to learn
- Active open-source contributions
- Experience developing Kubernetes operators or controllers

- 100% remote work with flexible hours
- Work with cutting-edge cloud-native technologies
- Contribute to open-source projects
- Collaborative, distributed team environment
- Opportunity to shape the future of Kubernetes tooling
