Hayden AI
Staff Software Engineer, Edge Applications
Hayden AI, San Francisco, California, United States, 94199
About Us
At Hayden AI, we are on a mission to harness the power of computer vision to transform the way transit systems and other government agencies address real-world challenges.
From bus lane and bus stop enforcement to transportation optimization technologies and beyond, our innovative mobile perception system empowers our clients to accelerate transit, enhance street safety, and drive toward a sustainable future.
What the job involves
As a member of the Hayden Software Engineering org, you will help build our next-generation product for Automated Bus Lane Enforcement. This team is developing, integrating, and deploying edge-based perception models. This is a C++ software engineering generalist role. You will be delivering high-quality, modern C++ (C++17/20) code that runs on embedded Linux edge devices, primarily the NVIDIA Embedded Platform. You will build and optimize real-time CV/ML pipelines on device, integrating CUDA/TensorRT and efficient camera/video I/O. Expect to collaborate with Product Management to translate customer needs into software solutions.
Hayden is a startup. You will be working in an ambiguous, fast-paced environment. As Hayden begins an organizational scale-up phase, we need to build a rock-solid foundation. That means delivering well-designed and well-tested code that can be shared across the organization.
Responsibilities
Deliver bullet-proof, rigorously tested C++ code for embedded Linux edge devices (NVIDIA Platform).
Iterate on our real-time detection, tracking, and license-plate recognition systems under strict latency/memory constraints.
Deep-dive performance optimization on NVIDIA Platform: CUDA streams, TensorRT (FP16/INT8), DLA offload, Nsight/tegrastats profiling, and bottleneck removal.
Enhance developer tooling and CI for cross-compilation, containerized embedded system builds, hardware-in-the-loop tests, logging/telemetry, and OTA-friendly releases.
Qualifications
Master's or PhD degree in Computer Science, Electrical Engineering, or a closely related field.
Background in Machine Learning, Image Processing, Computer Vision, or a similar field.
Minimum 8 years of industry experience with strong hands-on C++.
Proven experience on embedded Linux and SoC platforms (preferably NVIDIA Jetson Orin/Xavier).
Practical experience with CUDA and deploying/optimizing models with TensorRT or similar runtimes.
Familiarity with camera/video pipelines and OpenCV; ability to reason about end-to-end latency and throughput.
Past experience collaborating with other software engineers in code reviews, design discussions, and production support; ability to operate effectively in a scaling organization.
Ways to stand out
In-depth embedded systems development experience, including power/thermal budgeting and real-time constraints on mobile SoCs.
Hands-on CUDA/OpenCL, stream concurrency, and custom kernel or plugin development for image/video processing.
Experience with ML model deployment at the edge (ONNX Runtime, quantization/calibration, DLA offload).
Strong Git/GitHub practices, CI/CD for device fleets, and containerized Jetson builds.