GEICO
Overview
The GEICO AI Agent Platform team is seeking an exceptional Senior Staff Software Engineer to build the next-generation enterprise Agent OS and SDKs. You will design, implement, and maintain scalable frontend and backend systems that enable business, product, and engineering teams to build, test, and deploy AI agents and workflows. Excellent communication and a proven track record of delivering business value through technical excellence are essential.

Base pay range
$115,000.00/yr - $300,000.00/yr

Additional compensation
Annual Bonus

Location
Remote

Responsibilities
Architect and implement scalable multi-tenant backend systems for building AI agent workflows, including agent configuration, offline evaluation, synthetic data generation, workflow simulation, and an agent marketplace, using Azure Kubernetes Service (AKS), FastAPI, etc., ensuring economies of scale and control of maintenance costs.
Collaborate with Design to architect and implement frontend experiences and workflows for onboarding technical and non-technical stakeholders, maximizing user adoption and successful AI agent development.
Develop observability frameworks to ensure 99.9%+ uptime for AI agent platforms through robust monitoring, alerting, and incident response.
Evaluate and integrate cutting-edge GenAI frameworks, libraries, and vendors to maintain a state-of-the-art stack, including hybrid cloud solutions with AWS/GCP as backup or for specialized use cases.
Architect and implement scalable, high-performance machine learning platforms and systems capable of processing large data volumes and supporting real-time decision making.

DevOps / Application Lifecycle Management
Oversee the end-to-end lifecycle of AI agent applications, ensuring robust testing, deployment, and ongoing monitoring.
Ensure adherence to production readiness standards, security protocols, and regulatory compliance throughout the development lifecycle.
Optimize platform performance to reduce latency and improve throughput.
Design and implement backup, recovery, and business continuity plans for hosted platform applications and services.
Design and maintain robust CI/CD pipelines for ML model deployment using Azure DevOps, GitHub Actions, and MLOps tools.

Technical Leadership
Act as the tech lead across multiple sub-teams, setting technical direction and ensuring design consistency and best practices.
Provide mentorship during design reviews, code assessments, and performance tuning.
Lead by example in tackling complex technical challenges and driving system-wide architectural improvements.
Establish and champion engineering standards for ML infrastructure, deployment practices, and operational procedures.
Create technical documentation and runbooks, and deliver internal training on platform capabilities.

Cross-Functional Collaboration
Work with data scientists, software engineers, and product teams to deploy ML systems into production.
Translate complex technical concepts into actionable insights for technical and non-technical stakeholders.
Foster collaboration and knowledge sharing across teams; present platform roadmaps to leadership and stakeholders.

Qualifications
Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field; advanced degree (MS/PhD) desirable.
10+ years designing, implementing, and maintaining multi-tenant AI/ML systems and platforms in production.
10+ years of cloud experience (Azure and AWS).
Experience with large-scale data pipelines and real-time inference systems; lifecycle management of AI agent and AI/ML systems, including configuration, evaluation, monitoring, observability, and authentication and authorization (AuthN/AuthZ) considerations.
8+ years working with backend systems/tools (Kubernetes, Temporal, OpenSearch, PostgreSQL, Redis, Neo4j, etc.); Docker and container optimization; experience with Prometheus, Grafana, OpenTelemetry, and distributed tracing.
4+ years building front-end web applications (React and/or Next.js).
Proficiency in Python, Java, Go, etc., with strong coding practices. Bonus for AI coding tool usage.
Proficiency with AI/ML frameworks such as TensorFlow, PyTorch, LangGraph, etc.

Leadership Skills
Mentor engineers and lead technical initiatives; strong communication across seniority levels.

Preferred Skills
Knowledge of AI safety, model governance, and regulatory compliance.
Experience in regulated industries with data privacy and cybersecurity review processes.
Experience with AI agent platforms and capabilities (LangSmith/LangGraph, AutoGen, n8n, CrewAI, Dify.ai, etc.).
Experience building LLM-based AI agent workflows with no-code/low-code and traditional high-code approaches.
Experience using open-source (e.g., Llama, Qwen, Mistral) and proprietary LLMs.

Industries
Insurance

Additional notes: Referrals may increase interview chances. This posting includes standard EEO statements as applicable. Source: GEICO external posting for Senior Staff Software Engineer AI Agent Platform Remote.