Stelvio Inc.
AI Systems Engineer
Hybrid
Alexandria, VA
Active or Eligible U.S. Security Clearance Required
About the Role
We're looking for an AI Systems Engineer to support Public Sector initiatives by building secure, distributed, and high-performance AI systems. You'll turn research models into production-ready solutions that run across cloud, edge, and constrained environments, optimizing everything from model performance to low-level GPU acceleration.
What You'll Do
Convert prototype models into scalable, robust AI systems.
Optimize models using PyTorch, TensorFlow, or Hugging Face.
Apply quantization, pruning, distillation, and hardware acceleration.
Build LoRA/PEFT workflows and on-device inference pipelines.
Develop RAG systems integrated with vector databases (FAISS, Milvus, Pinecone).
Support multimodal (text/vision/audio) model deployments.
Generate synthetic data using GANs or diffusion models.
Write performance-critical code in C/C++/Rust with GPU optimization.
Engineer for distributed, edge, and offline environments.
Collaborate with Infrastructure, MLOps, and Security teams.
What You Bring
Active or eligible U.S. security clearance.
5+ years in applied AI, ML engineering, or AI systems development.
Strong experience with major ML frameworks and cloud/edge deployment.
Expertise in model compression, GPU computing, and CUDA.
Experience with RAG pipelines, vector databases, multimodal systems, and synthetic data.
Strong C/C++/Rust and systems-level programming skills.
Solid algorithmic and distributed-systems problem-solving ability.
Preferred
Experience with edge AI, federated learning, or offline inference.
Familiarity with distributed training (DeepSpeed, Ray).
Knowledge of public-sector AI governance and compliant architectures.
Strong communication and documentation skills.