Perplexity is seeking an experienced Infrastructure Capacity Engineer to own scaling, capacity planning, and resource optimization across our AI/ML infrastructure. The ideal candidate will have deep experience in large-scale distributed systems, capacity modeling, and infrastructure efficiency optimization to support our rapidly growing AI products and user base.
Responsibilities
- Design and implement comprehensive capacity planning models and forecasting systems that predict infrastructure needs across compute, storage, and network resources for our AI/ML workloads
- Build and maintain automated capacity management systems that dynamically scale our infrastructure based on real-time demand patterns and usage forecasts
- Lead cross-functional capacity planning initiatives including hardware procurement, data center expansion, and cloud resource optimization
- Develop sophisticated monitoring and alerting systems that provide early warning indicators for capacity constraints and performance degradation
- Create and maintain detailed infrastructure capacity models that account for seasonal patterns, product launches, and scaling efficiency across different workload types
- Optimize resource utilization and cost efficiency through advanced placement algorithms, load balancing strategies, and infrastructure rightsizing
- Design and implement disaster recovery and business continuity plans that ensure service availability during infrastructure failures or capacity emergencies
- Collaborate with Site Reliability Engineering and Platform teams to establish capacity-aware deployment strategies and infrastructure automation
- Play a leading role in defining the capacity engineering discipline within Perplexity’s engineering organization
Qualifications
- At least 4 years of experience in infrastructure capacity planning, systems engineering, or related technical roles at scale
- Proven experience managing infrastructure capacity for high-growth technology companies, preferably with AI/ML workloads or real-time systems
- Strong background in distributed systems architecture, cloud infrastructure (AWS/GCP/Azure), and container orchestration (Kubernetes)
- Experience with capacity modeling tools, forecasting methodologies, and statistical analysis for infrastructure planning
- Proficiency in programming languages such as Python, Go, or similar for automation and tooling development
- Deep understanding of infrastructure monitoring, observability, and performance optimization techniques
- Experience with infrastructure-as-code tools (Terraform, Ansible) and CI/CD pipelines for infrastructure management
- Strong analytical and problem-solving skills with the ability to make data-driven decisions under uncertainty
- Excellent cross-functional collaboration skills and experience working with engineering, product, and business stakeholders
- Experience with large-scale database systems, caching layers, and content delivery networks preferred
- Background in AI/ML infrastructure, LLM inference, GPU cluster management, or high-performance computing is a plus
Our cash compensation range for this role is $225,000 - $300,000.
Final offer amounts are determined by multiple factors, including experience and expertise, and may vary from the amounts listed above.
Equity: In addition to the base salary, equity may be part of the total compensation package.
Benefits: Comprehensive health, dental, and vision insurance for you and your dependents, plus a 401(k) plan.