Knak Digital
About the Role
We're hiring a Software Engineer to help build a next-generation AI inference platform. You'll design and deliver core components that make deploying, scaling, and optimizing ML models fast, reliable, and cost-efficient across diverse hardware and cloud environments. This is a highly interdisciplinary role: you'll partner with ML, systems, and cloud engineers to ensure the entire stack works in harmony.
What You'll Do
- Own core platform services for model packaging, deployment, and lifecycle management (from dev → staging → production).
- Research, prototype, and implement advanced model sharding and distributed execution strategies to improve throughput and latency across heterogeneous hardware.
- Build and iterate on a performance simulator to predict model behavior under varied workloads, hardware targets, and scheduling policies.
- Collaborate with ML engineers to integrate optimized models (PyTorch/TensorFlow/JAX) into the deployment pipeline with clear APIs and tooling.
- Work with cloud/systems engineers to maximize utilization of compute, memory, and network resources; contribute to autoscaling and scheduling logic.
- Design developer-friendly APIs/CLIs that make operating the inference stack simple and observable (metrics, tracing, logs, model/version metadata).
- Write clean, tested, production-ready code and participate in code reviews, design docs, and release processes.

Required Qualifications
- BS/MS/PhD in Computer Science or a related field (or equivalent experience).
- 3+ years of professional software development, with emphasis on systems programming, distributed systems, or ML infrastructure.
- Strong proficiency in Python, C++, or Go.
- Experience with ML frameworks (e.g., PyTorch, TensorFlow, or JAX) and a practical understanding of inference deployment challenges.
- Solid grounding in data structures, algorithms, concurrency, and software testing best practices.
- Demonstrated ability to decompose complex problems and ship pragmatic solutions in a collaborative, cross-functional environment.

Preferred/Bonus Qualifications
- Experience with containerization and orchestration (Docker, Kubernetes) and service meshes.
- Familiarity with profiling and performance optimization (CPU/GPU utilization, memory management, kernel/graph optimizations).
- Exposure to performance modeling/simulation tools and methodologies.
- Knowledge of network protocols, RPC, and distributed computing concepts (load balancing, scheduling, consistency, backpressure).
- Experience with major cloud providers (AWS/GCP/Azure) and IaC (Terraform, Helm, etc.).

How We Work (Culture Snapshot)
- Pragmatic and fast-moving: iterate quickly, measure impact, and prioritize reliability in production.
- High ownership: small, senior team with clear scope and autonomy.
- Cross-functional by default: close collaboration across ML, systems, and cloud.

Location & Work Style
Flexible/remote-friendly within compatible time zones; occasional team meetups as needed.

Compensation & Benefits
Competitive salary plus meaningful equity and comprehensive benefits. Details provided during the interview process.