Krea

AI / ML Inference Engineer

Krea, San Francisco, California, United States, 94199

About Krea

At Krea, we're dedicated to making AI intuitive and controllable for creatives. Our mission is to build tools that empower human creativity, not replace it. We believe AI is a new medium that allows us to express ourselves through various formats: text, images, video, sound, and even 3D. We're building better, smarter, and more controllable tools to harness this medium.

We’re backed by Bain Capital Ventures, A16Z, Abstract Ventures, Pebblebed and many others. If you're passionate about pushing the boundaries of AI and empowering human creativity, we'd love to hear from you.

We're looking for a Machine Learning Engineer to help us optimize the inference and training of our AI models. You will collaborate closely with our AI research and infrastructure teams to integrate optimizations seamlessly.

What you’ll do

Write custom CUDA kernels to speed up multi-node inference on image and video models.

Develop caching and dynamic compilation techniques to optimize the loading and unloading of the many AI models we serve at Krea.

Speed up training runs across our GPU clusters and improve their efficiency.

What you’ll need

Proficiency in CUDA or parallel programming.

Python / C++ programming experience.

Experience in optimizing diffusion / transformer models for performance and scalability.

High agency and resourcefulness.

What we offer

Openness to sponsoring international candidates (e.g., STEM OPT, OPT, H-1B, O-1, E-3)

Work alongside a world-class team developing the future of AI tooling

Significant impact on Krea’s market presence and growth

Competitive compensation (75th percentile of market rates) with significant equity upside
