Fetcherr

DevOps Team Leader

Fetcherr, Miami, Florida, US, 33222


Overview

Fetcherr is building AI-powered solutions with its Large Market Model (LMM). We are looking for a DevOps Team Leader to guide our growing DevOps team in building and maintaining highly available, scalable, and automated infrastructure. You’ll combine hands-on technical expertise with strong leadership skills to mentor engineers, define best practices, and ensure smooth collaboration across development, data, and product teams.

Responsibilities

Lead, mentor, and grow a team of DevOps engineers, fostering a culture of ownership, excellence, and continuous improvement
Architect, maintain, and optimize our cloud infrastructure on Google Cloud Platform (GCP) for scalability, performance, and security
Oversee Kubernetes and Terraform environments in production, ensuring high availability and efficient deployments
Drive automation in CI/CD pipelines, release processes, and infrastructure management
Implement and maintain robust monitoring, alerting, and logging systems (Prometheus, EFK, GCP Monitoring) to ensure proactive incident detection and resolution
Collaborate with development, data, and product teams to design and deliver infrastructure that meets evolving product and customer needs
Establish infrastructure as code (IaC) standards and ensure consistent adoption across the team
Manage and improve internal tooling for infrastructure, deployments, and developer productivity
Stay up to date with emerging technologies and evaluate their potential impact on our platform

Requirements

You’ll be a great fit if you have...

6+ years in DevOps, Site Reliability Engineering, or Software Configuration Management roles
2+ years in a team leadership or management position, with proven ability to mentor and guide engineers
Strong experience managing Kubernetes in production and writing/maintaining complex Helm charts
Proven expertise in Terraform and Infrastructure as Code principles
Proficiency in Bash, Python, and at least one additional scripting or programming language
Hands-on experience with GCP services and deployments
Experience with CI/CD tools and pipelines (ArgoCD, Jenkins, GitLab CI, etc.)
Solid monitoring and alerting background with Prometheus, Grafana, and GCP Monitoring
Strong problem-solving skills and the ability to handle production incidents calmly and effectively
Excellent communication skills, with the ability to work cross-functionally and present technical concepts to both technical and non-technical audiences

Nice to Have

Experience with ArgoCD or Kubernetes operator development
Exposure to Big Data or MLOps environments
Familiarity with Airflow, Kubeflow, or MLflow
DBA experience and strong SQL skills
Experience in Agile/Scrum environments

Seniority level

Not Applicable

Employment type

Full-time

Job function

Management

Industries

Software Development
