Exadel open positions
We’re an AI-first global tech company with 25+ years of engineering leadership, 2,000+ team members, and 500+ active projects powering Fortune 500 clients, including HBO, Microsoft, Google, and Starbucks.
From AI platforms to digital transformation, we partner with enterprise leaders to build what’s next.
What powers it all? Our people are ambitious, collaborative, and constantly evolving.
About the Client
The customer is one of the largest online gambling companies in the world, with over 26 million clients across all markets. The company was founded in 1997 and listed on Nasdaq Stockholm in 2004. They are committed to offering their clients the best possible deal and user experience, while ensuring a safe and fair gambling environment.
What You’ll Do
Platform & Deployment
Manage and evolve ML/LLM infrastructure on Kubernetes/EKS (CPU/GPU) for multi-tenant workloads across AWS/Azure, ensuring region‑aware scheduling, cross‑region access, and artifact management
Provision cloud environments, maintain deployment workflows, and build GitOps‑native pipelines (GitLab CI, Jenkins, ArgoCD, Helm, FluxCD) for fast, safe rollouts
LLM Operations & Optimization
Deploy, scale, and optimize LLMs (GPT, Claude, etc.) with attention to prompt engineering, performance, and cost
Operate Argo Workflows for data prep, model training, and batch compute, and track model performance and drift via AI observability frameworks
CI/CD & Infrastructure as Code
Design robust CI/CD pipelines across dev, staging, and production. Implement IaC with Terraform, CloudFormation, and Helm
Manage container orchestration, secrets, and secure deployments
Observability & Reliability
Set up monitoring with Prometheus/Grafana, Splunk, CloudWatch, and ELK
Implement alerting strategies, troubleshoot production issues, and ensure high availability
Data Platform & Reproducibility
Build and maintain data pipelines and platforms (Apache Iceberg) for reproducible ML experiments, lineage tracking, and automated governance
Collaborate with data engineers for seamless integration with model training workflows
Developer Experience & Enablement
Create APIs, CLIs, and UIs for self‑serve infrastructure. Provide documentation, templates, and best practices
Treat the ML platform as a product, gathering feedback and improving usability
Architecture, Security & Governance
Define scalable, secure, and compliant platform architecture. Implement FinOps practices, cost monitoring, and multi‑tenant optimization
Drive CI/CD culture and continuous improvement across teams
What You Bring
8+ years in DevOps, Platform Engineering, or SRE, including 2+ years in MLOps/LLMOps
Hands‑on experience with AWS (Bedrock, S3, EC2, EKS, RDS/PostgreSQL, ECR, IAM, Lambda, Step Functions, CloudWatch) and Kubernetes workloads, including GPU, autoscaling, and multi‑tenant configurations
Skilled in container orchestration, secrets management, and GitOps deployments (Jenkins, ArgoCD, FluxCD)
Experience deploying and scaling LLMs (GPT, Claude-family), with prompt engineering and performance optimization
Strong Python skills (FastAPI, Django, Pydantic, boto3, Pandas, NumPy) and solid ML framework knowledge (scikit‑learn, TensorFlow, PyTorch)
Proficient in building reproducible data pipelines, IaC (Terraform, CloudFormation, Helm), CI/CD pipelines, and observability (Prometheus/Grafana, Splunk, Datadog, OpenTelemetry)
Strong networking, security, and Linux fundamentals. Excellent communicator, self‑motivated, and focused on improving developer experience
Nice to have
Experience with distributed compute frameworks such as Dask, Spark, or Ray
Familiarity with NVIDIA Triton, TorchServe, or other inference servers
Experience with ML experiment tracking platforms like Weights & Biases, MLflow, or Kubeflow
FinOps best practices and cost attribution strategies for multi‑tenant ML infrastructure
Exposure to multi‑region and multi‑cloud designs, including dataset replication strategies, compute placement, and latency optimization
Experience with LakeFS, Apache Iceberg, or Delta Lake for data versioning and lakehouse architectures
Knowledge of data transformation tools such as dbt
Experience with data pipeline orchestration tools like Airflow or Prefect
Familiarity with Snowflake or other cloud data warehouses
Understanding of responsible AI practices, model governance, and compliance frameworks
Intermediate+
Legal & Hiring Information
Exadel is proud to be an Equal Opportunity Employer committed to inclusion regardless of minority status, gender identity, sexual orientation, disability, age, and more
Reasonable accommodations are available to enable individuals with disabilities to perform essential functions
Please note: this job description is not exhaustive. Duties and responsibilities may evolve based on business needs
Your Benefits at Exadel
Exadel benefits vary by location and contract type. Your recruiter will fill you in on the details.
International projects
In‑office, hybrid, or remote flexibility
Medical healthcare
Recognition program
Ongoing learning & reimbursement
Team events & local benefits
Sports compensation
We lead with trust, respect, and purpose. We believe in open dialogue, creative freedom, and mentorship that helps you grow, lead, and make a real difference. Ours is a culture where ideas are challenged, voices are heard, and your impact matters.