Fetcherr
Overview
Fetcherr is building AI-powered solutions with its Large Market Model (LMM). We are looking for a DevOps Team Leader to guide our growing DevOps team in building and maintaining highly available, scalable, and automated infrastructure. You’ll combine hands-on technical expertise with strong leadership skills to mentor engineers, define best practices, and ensure smooth collaboration across development, data, and product teams.
Responsibilities
Lead, mentor, and grow a team of DevOps engineers, fostering a culture of ownership, excellence, and continuous improvement
Architect, maintain, and optimize our cloud infrastructure on Google Cloud Platform (GCP) for scalability, performance, and security
Oversee Kubernetes and Terraform environments in production, ensuring high availability and efficient deployments
Drive automation in CI/CD pipelines, release processes, and infrastructure management
Implement and maintain robust monitoring, alerting, and logging systems (Prometheus, EFK, GCP Monitoring) to ensure proactive incident detection and resolution
Collaborate with development, data, and product teams to design and deliver infrastructure that meets evolving product and customer needs
Establish infrastructure as code (IaC) standards and ensure consistent adoption across the team
Manage and improve internal tooling for infrastructure, deployments, and developer productivity
Stay up to date with emerging technologies and evaluate their potential impact on our platform
Requirements
You’ll be a great fit if you have...
6+ years in DevOps, Site Reliability Engineering, or Software Configuration Management roles
2+ years in a team leadership or management position, with proven ability to mentor and guide engineers
Strong experience managing Kubernetes in production and writing/maintaining complex Helm charts
Proven expertise in Terraform and Infrastructure as Code principles
Proficiency in Bash, Python, and at least one additional scripting or programming language
Hands-on experience with GCP services and deployments
Experience with CI/CD tools and pipelines (ArgoCD, Jenkins, GitLab CI, etc.)
Solid monitoring and alerting background with Prometheus, Grafana, and GCP Monitoring
Strong problem-solving skills and the ability to handle production incidents calmly and effectively
Excellent communication skills, with the ability to work cross-functionally and present technical concepts to both technical and non-technical audiences
Nice to Have
Experience with ArgoCD or Kubernetes operator development
Exposure to Big Data or MLOps environments
Familiarity with Airflow, Kubeflow, or MLflow
DBA experience and strong SQL skills
Experience in Agile/Scrum environments
Seniority level: Not Applicable
Employment type: Full-time
Job function: Management
Industries: Software Development