Ampstek

SageMaker Platform Administrator || Atlanta, GA

Ampstek, Atlanta, Georgia, United States, 30383

We are seeking an experienced SageMaker Platform Administrator to manage, maintain, and optimize the Amazon SageMaker environment for data science and machine learning teams. The role involves ensuring platform stability, managing user access and governance, optimizing costs, supporting ML workflow deployments, and collaborating with data scientists, ML engineers, and cloud operations teams to drive efficiency and compliance.

Key Responsibilities
Administer and maintain the Amazon SageMaker platform, including setup, configuration, upgrades, and monitoring.
Manage user access controls, roles, and permissions in line with security and compliance policies.
Oversee SageMaker Studio, Notebooks, Endpoints, Pipelines, and Model Registry.
Monitor platform health, resource utilization, and performance, and optimize costs across compute and storage resources.
Implement and maintain automation, monitoring, and alerting for SageMaker workloads.
Support data scientists and ML engineers in deploying and managing ML models at scale.
Troubleshoot platform and environment-related issues to keep downtime to a minimum.
Collaborate with cloud engineering teams to integrate SageMaker with other AWS services (S3, Lambda, API Gateway, EKS, CloudWatch, etc.).
Establish governance, compliance, and auditing practices for ML operations.
Document standard operating procedures, best practices, and guidelines for platform usage.

Required Skills & Qualifications
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
4+ years of experience in AWS Cloud administration, including at least 2 years managing Amazon SageMaker.
Strong understanding of AWS IAM, VPC, CloudWatch, CloudTrail, and CloudFormation/Terraform.
Hands-on experience with SageMaker Studio, Notebooks, Endpoints, Model Registry, and Pipelines.
Experience with MLOps practices, CI/CD for ML models, and automation of ML workflows.
Strong troubleshooting skills with AWS networking, containerization (ECS/EKS/Docker), and integration with external data sources.
Good knowledge of security, compliance, and cost optimization in AWS.
Familiarity with Python, Boto3, or scripting for automation.
Excellent communication, documentation, and collaboration skills.

Nice to Have (Preferred Skills)
AWS Certified Solutions Architect, SysOps Administrator, or Machine Learning Specialty certification.
Knowledge of Databricks, Kubeflow, MLflow, or other ML platforms.
Experience with CI/CD pipelines (CodePipeline, Jenkins, GitHub Actions, etc.).
Exposure to data engineering tools such as Glue, EMR, or Redshift.

Ampstek is an equal opportunities employer and welcomes applications from all qualified candidates.
