TEKsystems
Job Title AI Site Reliability Engineer (Contractor, IC5 level)
Location Remote – San Francisco, CA
Employment Type Contract
Duration Open until the end of January (possible extension)
Compensation $90.00 – $100.00 per hour
Description We are seeking an experienced Site Reliability Engineer (SRE) to join our IT AI Infrastructure team. The role focuses on deploying, managing, and optimizing AI‑powered productivity tools and in‑house AI solutions that enhance employee efficiency at scale. The ideal candidate brings deep expertise in public cloud infrastructure (AWS/GCP), backend development (Python, Go, or Java), and automation tooling, and is passionate about building scalable, reliable AI infrastructure while maintaining rigorous security and compliance standards.
Responsibilities
Manage and enhance AI‑driven employee productivity tools (e.g., Glean, Google Workspace, Slack AI)
Deploy, configure, and manage AI‑powered employee productivity tools and in‑house AI solutions
Ensure high availability, reliability, and optimal performance of AI platforms and services through monitoring, alerting, and incident response procedures
Design and implement scalable infrastructure to support growing AI tool demands and user base
Develop and maintain automation scripts and tools (Terraform, Ansible, Bash, Python) to streamline deployment, monitoring, and maintenance tasks
Contribute to experimental sandbox environments for testing new AI solutions
Collaborate with cross‑functional teams (Machine Learning, HR, Security, Data Science, Developer Experience) to support integration of AI solutions
Provide technical support and troubleshooting for AI‑related issues
Implement comprehensive monitoring and metrics to track performance and health of AI systems
Participate in incident response and develop incident response plans for AI‑related outages or performance issues
Build scaffolding APIs for unsupported Glean features to extend functionality
Contribute to backend development tasks to support the integration and functionality of AI tools
Assess and mitigate security risks in AI systems, ensuring compliance with regulatory requirements and company security policies
Qualifications
Proven experience as a Site Reliability Engineer or equivalent role
Strong understanding of AI technologies and platforms
Experience deploying and managing applications in a cloud environment (AWS/GCP)
Solid backend development experience (Python, Java, or Go)
Proficiency in managing and configuring public cloud services for scalability and reliability
Experience with automation tools and scripting (Ansible, Terraform, Bash, Python)
Excellent troubleshooting and problem‑solving skills
Strong communication and collaboration skills, with the ability to present technical information to non‑technical audiences, including leadership
Strong security and compliance understanding, with experience working in highly regulated environments
Experience in a fast‑paced, high‑growth company
Skills
Artificial intelligence, Glean, GCP
Python, Go, Java
Terraform, Ansible, Bash
Cloud‑native services (AWS, GCP)
Observability (logging, metrics, dashboards)
Incident response and disaster recovery planning
Security and compliance governance
Benefits
Medical, dental & vision
Critical Illness, Accident, and Hospital coverage
401(k) Retirement Plan – Pre‑tax and Roth options
Life Insurance (Voluntary Life & AD&D for employee and dependents)
Short‑ and long‑term disability
Health Savings Account (HSA)
Transportation benefits
Employee Assistance Program
Paid Time Off/Leave (PTO, vacation, sick leave)
Application Deadline This position is anticipated to close on Nov 12, 2025.
About the Employer TEKsystems is an Allegis Group company that partners with clients to drive digital transformation. We provide full‑stack technology services and talent solutions to clients worldwide. The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information, or any other characteristic protected by law.