C the Signs

Lead CloudOps Engineer

C the Signs, New York, New York, US, 10261


We are looking for a hands‑on Lead CloudOps Engineer to oversee the reliability, scalability, automation, and day‑to‑day operations of our GCP‑based cloud platform. You will drive infrastructure automation, improve developer workflows, enhance observability, and ensure secure, stable platform operations.

While GCP is the primary environment, the role includes operational responsibility for an existing AWS enterprise environment, requiring the ability to troubleshoot issues, maintain existing systems, and support partner teams without owning major AWS architectural redesigns.

This position is ideal for someone who thrives in cloud‑native environments, enjoys automation, and balances engineering rigor with operational excellence.

This role is a founding position on the CloudOps team in the US and has the potential to grow into future leadership and management positions.

Responsibilities

GCP Platform Operations & Engineering

Lead day‑to‑day monitoring and management of GCP infrastructure, focusing on reliability, uptime, security, performance, and compliance

Manage GKE clusters, including cluster lifecycle, node pools, workload deployment, and operational best practices

Implement and maintain GCP networking: VPCs, firewall rules, service networking, and private connectivity

Support data and application teams using GCP services such as BigQuery, Cloud SQL, Pub/Sub, Cloud Storage, and Cloud Run

Infrastructure as Code & Automation

Own and maintain Terraform configurations for GCP and AWS using reusable modules, remote state, policy checks, and automation pipelines

Automate environment provisioning, scaling, and configuration with CI/CD tools such as Cloud Build, GitHub Actions, ArgoCD, or Jenkins

Build tooling and workflows that improve developer productivity (automated builds, deployments, secrets management, ephemeral environments)

Monitoring, Observability & Incident Response

Build and enhance observability stacks using Cloud Monitoring, Prometheus/Grafana, ELK/Elastic, or OpenTelemetry

Lead incident response, troubleshooting, RCA generation, and post‑incident improvement efforts

Define and manage SLOs, error budgets, and operational runbooks

Security, Compliance & Cloud Governance

Ensure secure configurations across cloud services, Kubernetes workloads, secrets storage, and network boundaries

Implement guardrails and compliance automation using IAM best practices, GCP Organization Policies, and Terraform checks

Work with security and compliance teams to meet HIPAA, HITRUST, SOC 2, or internal audit requirements

AWS Operational Support (Backup Support for Existing Platform Only)

Maintain stability of a pre‑existing AWS environment by performing tasks such as:

Reviewing IAM roles and security posture

Supporting workloads on EC2, ECS, EKS, RDS, S3

Troubleshooting infrastructure or networking issues

Managing configurations, upgrades, and patching

Assist teams that rely on AWS‑hosted systems and ensure smooth integration with GCP‑centric operations

Make small‑to‑medium improvements or automation updates for AWS infrastructure using Terraform or CI/CD workflows

Leadership & Cross‑Team Collaboration

Mentor DevOps, CloudOps, and Platform Engineers through pair programming, reviews, and best‑practice sharing

Partner with development, data, and security teams to build highly reliable, cloud‑native applications and pipelines

Establish operational standards, documentation, and playbooks for cloud operations

Requirements

8 years of DevOps, CloudOps, or platform engineering experience

Deep hands‑on experience with GCP, including: GKE, workload identity, cluster networking, VPC design, firewalls, load balancers, BigQuery, Pub/Sub, Cloud SQL, Cloud Storage, Cloud Run, Cloud Functions, IAM, KMS, Secret Manager

Strong expertise with Terraform, including modules, workspaces, and governance patterns

Strong CI/CD experience with Git‑based workflows and pipeline automation

Solid understanding of Linux, networking basics, containerization, and distributed systems

Experience supporting production workloads in a regulated environment (HIPAA, HITRUST, SOC 2 or similar)

AWS (Operational Proficiency Required)

Practical experience supporting AWS operations (not architecture‑heavy), including EC2, EKS, ALB/NLB, S3, RDS, CloudWatch, IAM troubleshooting, and network troubleshooting (VPC, Security Groups, Route 53). Comfortable maintaining and improving existing AWS infrastructure.

Preferred

Experience with GitOps tools (ArgoCD, Flux)

Familiarity with service mesh (Istio, Anthos) or advanced networking

Experience with policy‑as‑code (OPA/Gatekeeper or Sentinel)

Background with FinOps/cost optimization

Experience building internal developer platforms or platform engineering teams

Benefits

Why Join Us?

Joining C the Signs is not just about building AI; it’s about shaping the future of healthcare. If you are a technical leader with an unshakable belief in the power of AI to save lives and the ability to make it happen at scale, this is your opportunity to create a tangible, global impact.

Competitive salary and benefits package

Flexible working arrangements (remote or hybrid options available)

The opportunity to work on life‑changing AI technology that directly impacts patient outcomes

Join a team that combines cutting‑edge innovation with a mission to save lives and improve health equity

Continuous learning opportunities with access to the latest tools and advancements in AI and healthcare
