Argmax, Inc.
About Argmax
AI applications are scaling in user adoption at unprecedented rates, and the infrastructure behind them is crumbling: spinner wheels are back in fashion, sensitive user data is uploaded to the cloud and occasionally leaked, and spiky demand leads to capacity crunches and waste at the same time. Argmax is building the critical infrastructure required to bring real-time AI workloads to the edge: instantly autoscaling, private and compliant by design, and more reliable than even multi-cloud platforms.
About the Role
We are looking for a Staff Engineer to join our growing Cloud Systems team. In this role, you will design, implement and optimize systems that serve critical functions such as software licensing, large AI asset distribution, inference performance telemetry and more. Although Argmax deploys AI workloads directly on user devices, these cloud systems serve as the backbone of Argmax SDK, our flagship product, and must be built to handle traffic from millions of devices worldwide with 99.9999% uptime. If the scale and reliability challenge excites you, read further!
Responsibilities
Prepare systems for 10x scale: You will proactively identify and implement improvements to harden our existing infrastructure and ensure that it is ready for 10x higher traffic within the next year.
Architect multi-region expansion: As part of our best-in-market reliability ambition, you will lead Argmax's expansion from AWS to GCP and potentially other CSPs as we embrace Kubernetes. The primary objective will be to retain >99.9999% reliability through redundancy while maintaining cost efficiency at scale, preserving our 95%+ gross margin.
Take new systems from 0 to 1: You will lead the design and implementation of new cloud systems to support the evolution of Argmax SDK, our flagship product. For example, Argmax SDK currently relies on third-party AI asset distribution infrastructure, and one of your first projects will be to pull this infrastructure in-house and build an optimized global CDN that ensures fast and robust delivery of large AI assets worldwide.
Qualifications
3+ years of experience in designing, building and operating cloud systems that served a large cohort of users
Experience with container orchestration (Kubernetes) and cloud environments such as AWS or GCP
Fluency in one of Python, Go, or JavaScript
Familiarity with Django, FastAPI or equivalent
Preferred Qualifications
Experience leading production systems serving at least one million monthly active users or handling sustained high QPS
Experience participating in compliance programs such as SOC 2, collecting the necessary evidence and communicating with independent auditors
Proven success scaling systems from 0 to 1 and maintaining performance and reliability at scale
Familiarity with multi-region database replication, CDN design, and cost optimization for large-scale systems
Why Argmax
Direct ownership of mission-critical systems supporting millions of devices
No-nonsense and meritocratic culture where career progression is limited only by how fast you make an impact on the product
Top-of-market equity at a fast-growing early-stage startup with a unique mission
Performance-based equity refreshers twice a year
3 days a week in the office in Palo Alto, CA
Palo Alto office offers comprehensive on-site amenities, including chef‑catered meals
Remote work possible by exception for exceptional, industry-leading candidates
Platinum‑tier healthcare with 90% employer contribution, including dependents
401(k) match
Quarterly in‑person team‑building weeks in Palo Alto, CA
Seniority Level: Mid‑Senior level
Employment Type: Full‑time
Job Function: Information Technology
Industry: Software Development