Talentmatchmakers

Senior DevSecOps Engineer

Talentmatchmakers, Romania, Pennsylvania, United States

ABOUT THE ROLE

We are looking for a Senior DevSecOps Engineer to lead the design, automation, and security of our client's AI platform infrastructure. This role combines hands‑on cloud engineering with strategic planning to ensure their AI systems are scalable, observable, and compliant across healthcare‑grade production environments.

You’ll own the reliability, performance, and security of infrastructure supporting AI orchestration, retrieval pipelines, and LLM services—powering next‑generation healthcare applications built on our client's AI platform.

You will be the first dedicated DevSecOps engineer on the Platform Engineering team, which owns infrastructure, AI platform development, and reliability.

SALARY $7,500 - $10,400 gross/month.

Collaboration available through CIM/PFA/SRL.

Fully remote (Romania) or flexible remote (Cluj‑Napoca).

ABOUT THE PRODUCT

Our client's platform enables healthcare organizations to deploy secure, agentic AI systems at scale. It powers applications like The Analyst (LLM orchestration and agent workflow management) and The Librarian (document intelligence and semantic search with vector embeddings).

The infrastructure spans AWS microservices, containerized workloads, and compliance‑ready data systems optimized for safety and reliability in healthcare contexts.

TECH STACK

AWS (ECS, EC2, EKS, Aurora, RDS, VPC, S3), Terraform, Docker, Kubernetes, GitHub Actions, CodePipeline, CloudWatch, Grafana, OpenTelemetry, AWS GuardDuty, AWS IAM, Secrets Manager, SSM Parameter Store, Dependabot/Renovate, Trivy/Grype, and Sentry.

WHY THIS ROLE MATTERS

Healthcare demands trust, reliability, and security, especially for AI systems.

The Senior DevSecOps Engineer ensures our client's AI infrastructure meets these standards at scale. By combining automation, observability, and compliance‑driven engineering, this role underpins their mission to make advanced AI safe, performant, and impactful across the healthcare ecosystem.

DUTIES AND RESPONSIBILITIES

Infrastructure & Automation

Architect and manage AWS infrastructure (ECS, EC2, EKS, Aurora, RDS, VPC, S3) using Terraform and modern DevOps practices.

Build and maintain CI/CD pipelines (GitHub Actions, AWS CodePipeline) for automated, reliable deployments.

Design scalable container orchestration with Docker and Kubernetes (EKS) across multiple production environments.

Continuously optimize system performance, cost efficiency, and deployment velocity.

Security & Compliance

Enforce least‑privilege IAM policies, secrets management, and network security controls.

Integrate automated vulnerability scanning, dependency management, and audit workflows.

Maintain SOC2/HIPAA‑compliant infrastructure through observability, logging, and traceability.

Observability & Reliability

Develop and manage observability stacks (CloudWatch, Grafana, OpenTelemetry) for metrics and distributed tracing.

Define and track SLIs/SLOs to ensure high service reliability and predictable performance.

Lead incident response, root cause analysis, and continuous reliability improvements.

Collaboration & Leadership

Partner with AI Platform Engineers to deploy and scale LLM, RAG, and vector‑based workloads.

Mentor teammates on infrastructure, automation, and DevSecOps best practices.

Maintain architecture diagrams, runbooks, and ADRs for infrastructure components.

QUALIFICATIONS AND EXPERIENCE

Required

7+ years of DevOps, SRE, or Platform Engineering experience in production environments.

Deep expertise with AWS (ECS, EC2, EKS, Aurora/RDS, IAM, VPCs, networking, and security).

Strong proficiency with Terraform, Docker, and Kubernetes (EKS) for orchestration and automation.

Hands‑on experience managing AI/LLM infrastructure, including model orchestration, embedding APIs, and vector databases.

Solid understanding of PostgreSQL operations, performance tuning, and observability.

Proven track record implementing security, compliance, and reliability best practices in healthcare or regulated industries.

Preferred

Experience with asynchronous pipelines, task queues, or workflow orchestration systems.

Familiarity with observability tools (CloudWatch, Grafana, Sentry, OpenTelemetry).

Background in HIPAA‑compliant or SOC2‑audited production environments.

SUCCESS IN THIS ROLE LOOKS LIKE

Secure, scalable infrastructure powering AI systems in production healthcare environments.

Automated and observable deployments that enable rapid iteration and low operational friction.

Measurable improvements in reliability, performance, and compliance posture.

Effective cross‑functional collaboration that accelerates AI delivery across teams.
