Join Us as a Training Infrastructure Engineer (ML Systems & Foundation Models)
We're building the next wave of general-purpose AI—efficient, scalable, and designed to integrate seamlessly across enterprises. Our foundation model platform enables users to build, deploy, and optimize AI systems with precision and control.
As part of the technical core, you’ll design and optimize the distributed infrastructure powering everything from lean, task-specific models to large-scale multimodal systems. You’ll work on critical systems at the frontier of AI and infrastructure, making large-scale training faster, more reliable, and more cost-efficient.
What You’ll Work On
You’ll build the distributed training infrastructure that makes high-throughput, multi-node, multi-GPU training not just possible but efficient, fault-tolerant, and scalable. From high-performance data pipelines to cutting-edge sharding techniques, your work will directly affect how fast and how far our models can go.
Key Challenges You’ll Take On
You're a Great Fit If You Have:
If you're passionate about distributed systems, large-scale training, and creating infrastructure that drives real AI breakthroughs, this is your chance to build it from the ground up.