Apolis
Cloud Ops Lead
Location: New York, New York
JOB DESCRIPTION:
Architect AWS Solutions - Infrastructure Services
Experience: 12+ Years
Primary Skills: AWS Cloud Solutions and Operations, Architecting and Automation of Cloud Infrastructure, AI/ML Integration
Secondary Skills: Solid experience with automation scripting (PowerShell, AWS CLI, Python, JSON) and familiarity with AI/ML frameworks and tools.
Responsibilities:
* Architect, design, and implement scalable, secure, and cost-optimized cloud solutions leveraging the latest AWS offerings, including generative AI services (e.g., Amazon Bedrock, SageMaker, Amazon Q, CodeWhisperer).
* Deep, hands-on technical expertise in AWS, including new services such as Lambda SnapStart, Graviton-based compute, and advanced analytics (e.g., Amazon QuickSight, Redshift Serverless).
* Support and administration of AWS IaaS and PaaS, including RDS, Athena, DynamoDB, EFS, ElastiCache, Kinesis Firehose, S3, Route53, SNS, Lambda, Data Pipeline, and new AI/ML services.
* Implement and manage monitoring, analytics, and optimization tools (AWS CloudWatch, AWS CloudTrail, AWS Cost Explorer, and AI-driven observability tools).
* Operational understanding of securing cloud instances, including AI-powered security tools (e.g., Amazon GuardDuty, Macie, Inspector).
* Expertise in Linux administration, AWS VPC, subnet management, and troubleshooting.
* Develop and maintain cloud automation scripts/code (Perl, JSON, PowerShell, Terraform, AWS CloudFormation, CDK), including integration with AI/ML pipelines.
* Identify and resolve cloud performance bottlenecks using architectural and AI-driven performance analytics.
* Analyze AWS CloudTrail logs and aggregated log files for advanced troubleshooting, leveraging AI-based log analysis where appropriate.
* Identify and implement cost-saving strategies using AWS's latest cost management and AI-powered optimization tools.
* Understanding of Cloud Platform Engineering, SRE, and AI/ML Ops best practices.
* Good knowledge of application build/release processes, CI/CD pipelines (Jenkins, Chef/Puppet, AWS CodePipeline, CodeBuild, and integration with AI-powered DevOps tools).
* Familiarity with Agile processes and ability to collaborate with cross-functional teams (Development, Infrastructure, Security, Testing, QA, and Data Science/AI teams).
* Analyze customer business and technical requirements, assess environments for cloud and AI enablement, and advise on cloud and AI/ML solutions and risk management.
* Participate in customer cloud and AI/ML pre-sales responses and projects.
* Strong passion for technology exploration, AI/ML development, and continuous learning.
* Excellent written and verbal communication, presentation, and collaboration skills.
* Team leadership skills.
NICE TO HAVE
* Assist with backups and recovery, including AI-driven backup optimization.
* Deploy applications, tune performance, troubleshoot, maintain security, and automate routine procedures through scripting and AI-based automation.
* Integrate third-party tools, APIs, and AI/ML services in AWS environments.
* Knowledge of ServiceNow and its integration with AWS and AI/ML workflows is recommended.