Pryon
Senior Engineering Manager - Accelerated Compute Memory Systems
Pryon, Washington, District of Columbia, US, 20022
We’re building an industry‑leading knowledge management and Retrieval‑Augmented Generation (RAG) platform that turns unstructured data into actionable insights for millions of users. Pryon operates a petabyte‑scale ingestion and inference engine that powers mission‑critical government and enterprise deployments. We need an Engineering Manager with deep HPC expertise: someone who can teach, not be taught.
In This Role You Will
Lead a team delivering ingestion, retrieval, and inference layers for commercial and federal customers with millions of public users.
Architect horizontally scalable, fault‑tolerant systems that handle billions of documents and 30,000+ concurrent users.
Guide implementation of multimodal ingestion pipelines (PDF, HTML, DOCX, JSON, XML, PPTX, TIFF).
Oversee design and optimization of LLM‑driven data ingestion and retrieval workflows.
Own optimization and tuning of high‑throughput, low‑latency production environments via async orchestration frameworks.
Establish performance benchmarking, compliance frameworks, and automated testing for scale.
Balance technical leadership with people leadership, mentoring a high‑performing engineering team.
Collaborate cross‑functionally with Product, Executive Leadership, and Customer Success.
What You’ll Need to Be Successful
10+ years in software engineering; 5+ years managing large‑scale AI/ML systems and infrastructure.
Expert proficiency in Python and Golang with production distributed system experience.
Experience with orchestration frameworks (Kubernetes, Ray, Dask).
Proficiency with vector databases (Pinecone, Weaviate, Qdrant).
Experience with message queuing systems (Kafka, Pulsar, RabbitMQ).
Hands‑on experience building scalable distributed architectures and high‑performance compute systems.
Proven experience building multimodal ingestion pipelines within RAG platforms.
Direct experience designing, fine‑tuning, and optimizing LLMs for ingestion and retrieval workloads.
Previous success managing engineering teams delivering production‑grade, HPC‑scale RAG systems.
Deep understanding of infra domains: compute, storage, networking, observability, security, disaster recovery, and cost management.
Familiarity with HPC cluster management software (Slurm) and cloud platforms (AWS, Azure, GCP).
Benefits for Full‑Time Employees
Remote‑first organization.
100% company‑paid health/dental/vision benefits for you and your dependents.
Life insurance, short‑term and long‑term disability.
401(k).
Unlimited PTO.
We are interested in every qualified candidate who is authorized to work in the United States. However, we are not able to sponsor or take over sponsorship of employment visas at this time.
Pryon will not consider race, religion, sex, sexual preference, or national origin in ways that violate the Nation's civil rights laws.
We may use artificial intelligence (AI) tools to support parts of the hiring process. Final hiring decisions are made by humans.