ML Ops Engineer
Global Payments Inc.
Description

At this time, we are unable to offer visa sponsorship for this position. Candidates must be legally authorized to work in the United States (or applicable country) on a full-time basis without the need for current or future immigration sponsorship. Please note, we are not accepting candidates on H1B or OPT status.

Overview

We are looking for an experienced AI Ops Engineer to support our AI and ML initiatives, including GenAI platform development, deployment automation, and infrastructure optimization. You will play a critical role in building and maintaining scalable, secure, and observable systems that power RAG solutions, model training platforms, and agentic AI workflows across the enterprise.

Responsibilities

- Design and implement CI/CD pipelines for AI and ML model training, evaluation, and RAG system deployment (including LLMs, vector databases, embedding and reranking models, governance and observability systems, and guardrails).
- Provision and manage AI infrastructure across cloud hyperscalers (AWS/GCP) using infrastructure-as-code tools (strong preference for Terraform).
- Maintain containerized environments (Docker, Kubernetes) optimized for GPU workloads and distributed compute.
- Support vector database, feature store, and embedding store deployments (e.g., pgVector, Pinecone, Redis, Featureform, MongoDB Atlas).
- Monitor and optimize performance, availability, and cost of AI workloads using observability tools (e.g., Prometheus, Grafana, Datadog, or managed cloud offerings).
- Collaborate with data scientists, AI/ML engineers, and other members of the platform team to ensure smooth transitions from experimentation to production.
- Implement security best practices, including secrets management, model access control, data encryption, and audit logging for AI pipelines.
- Support the deployment and orchestration of agentic AI systems (LangChain, LangGraph, CrewAI, Copilot Studio, AgentSpace, etc.).

Must Haves

- 4+ years of DevOps, AI Ops, or infrastructure engineering experience, preferably with 2+ years in AI/ML environments.
- Hands-on experience with cloud-native services (AWS Bedrock/SageMaker, GCP Vertex AI, or Azure ML) and GPU infrastructure management.
- Strong skills in CI/CD tools (GitHub Actions, ArgoCD, Jenkins) and configuration management (Ansible, Helm, etc.).
- Experience with monitoring, logging, and alerting systems for AI/ML workloads.
- Deep understanding of Kubernetes and container lifecycle management.
- Proficiency in scripting languages such as Python, Bash, or Go is a plus.

Bonus Attributes

- Exposure to AI Ops tooling such as MLflow, Kubeflow, SageMaker Pipelines, or Vertex Pipelines.
- Familiarity with prompt engineering, model fine-tuning, and inference serving.
- Experience with secure AI deployment and compliance frameworks.
- Knowledge of model versioning, drift detection, and scalable rollback strategies.

Abilities

- Ability to work with high initiative, accuracy, and attention to detail.
- Ability to prioritize multiple assignments effectively and meet deadlines.
- Professional interaction with staff and customers.
- Excellent organizational skills.
- Critical thinking for complex problems.
- Flexibility to meet business needs.
- Creative and independent work style with minimal supervision.
- Experience navigating organizational structures.

Travel Required: 2%

Physical Demands:
- Standing/Walking: minimal
- Sitting: moderate to high
- Lifting: up to 15 lbs.
- Visual Concentration: high
- Work Environment: typical office environment.
Position Type and Hours: Full Time

Disclaimer: The above describes the general nature and level of work. It is not exhaustive of the responsibilities, duties, or skills required.

Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Engineering and Information Technology
Industries: Financial Services, IT Services and IT Consulting