Techfellow Limited
Machine Learning Engineer | Global Prop Trading Firm
Techfellow Limited, Chicago, Illinois, United States, 60290
Role Overview
We’re working with a leading algorithmic trading firm seeking a Machine Learning Systems Engineer to design, optimise, and maintain large-scale ML infrastructure powering advanced trading and research initiatives. This position sits at the intersection of high-performance computing, distributed systems, and applied machine learning – ideal for an engineer who thrives in performance‑critical environments and enjoys bridging cutting‑edge research with real‑world trading applications. You’ll collaborate with data scientists, quantitative researchers, and GPU specialists to develop end‑to‑end systems for training, deployment, and optimisation of machine learning models at scale.
Key Responsibilities
Architect and maintain distributed training pipelines for large datasets and complex model architectures, ensuring scalability and fault tolerance
Build and refine real‑time inference systems capable of delivering ultra‑low‑latency predictions to support live trading and analytics workloads
Optimise model training and inference performance through GPU acceleration, hardware tuning, and efficient use of libraries such as cuDNN, TensorRT, and NCCL
Collaborate with research and HPC engineering teams to streamline workflows, boost throughput, and minimise resource bottlenecks
Develop internal libraries and reusable components to extend and enhance the performance of machine learning frameworks such as PyTorch, TensorFlow, and JAX
Integrate automation and monitoring into ML workflows, covering model retraining, data versioning, and hyperparameter optimisation
Evaluate, customise, and deploy emerging open‑source tools to strengthen the firm’s ML infrastructure capabilities
Dig into framework internals to identify bottlenecks and implement performance or scalability improvements
Partner with quantitative teams to translate experimental ideas into robust, production‑ready ML pipelines
What You’ll Bring…
4+ years’ professional experience as a Machine Learning Engineer or Systems Engineer, or in a similar role, working on large‑scale training and inference systems (FAANG background preferred)
Strong software engineering background with proficiency in Python, C++, and/or CUDA
Demonstrated experience building or tuning low‑latency, high‑performance ML pipelines for real‑time environments
Deep knowledge of GPU acceleration techniques and distributed training frameworks (e.g. Horovod, Ray, or similar)
Understanding of end‑to‑end ML lifecycles – from data ingestion and feature processing to model deployment and optimisation
Experience working within high‑performance computing environments and collaborating closely with infrastructure and platform teams
Familiarity with orchestration and scaling tools for ML workloads (e.g. Kubernetes, Slurm, or cloud‑native equivalents)
(Preferred) Exposure to financial markets, algorithmic trading, or other latency‑sensitive domains