About Us
At Union, we are solving one of the hardest challenges in AI infrastructure today: enabling high-velocity iteration while maintaining seamless production-readiness for AI workloads at scale. Flyte, the open-source project we steward, is the emerging standard for modern data and AI orchestration, with numerous leading technology organizations, including LinkedIn, Spotify, and Gojek, running millions of mission-critical workloads on the platform. We have a deep bench of infrastructure veterans from companies in the Big Three and beyond, and a technical founding team that originally created Flyte while at Lyft.
The Opportunity
Reporting to the infrastructure lead, you will design and build distribution solutions for a large AI/ML platform supporting both internal developers and external customers. This position is ideal for engineers with 5+ years of professional full-stack development experience and a broad knowledge of infrastructure, backend APIs, and client-facing applications. This role demands not only a robust background in technologies across the entire stack but also an ability to turn large-scale distribution problems into approachable and scalable solutions.
In this role, you will:
- Design and build backend services (APIs, controllers, etc.) and client components to install, manage, and observe Union services in a Kubernetes-native environment.
- Work across multiple cloud vendors, including AWS, GCP, Azure, and OCI, as well as custom platform providers.
- Maintain package management solutions and infrastructure automation platforms.
- Develop and maintain services and tooling to make our systems more reliable, secure, and performant.
- Contribute to architectural decisions and participate in code and design reviews across various teams, ensuring the highest standards of quality and performance.
- Work closely with broader Engineering teams to improve the customer experience.
About you:
- Have 5+ years of experience in deeply technical roles in engineering functions.
- Have a deep passion for all things Kubernetes and the broader container orchestration ecosystem.
- Are a generalist who can navigate and pick up new technologies quickly.
- Always think about the big picture and can put yourself in the shoes of the developer and the customer.
- Have hands-on experience with backend programming languages (Go, Python) and front-end JavaScript frameworks.
- Have extensive experience with Kubernetes and distribution paradigms within that ecosystem.
- Can own complex projects from planning to completion.
You can expect to work with the following tools at Union, though we’re constantly evolving our stack:
- Languages and frameworks: Go, Python, and Next.js
- Infrastructure as Code: Terraform
- CI/CD: Buildkite, Argo CD
- Cloud Providers: AWS, GCP, Azure, OCI
Benefits & Belonging
At Union.ai, we know that employees who feel their best can build amazing things. We are proud to offer best-in-class benefits that will continue to evolve as our employees' needs do.
Benefits may vary based on country.
- Excellent medical - We pay 100% of your premiums and 90% for your dependents
- Generous dental and vision plans - We pay 90% of the premiums for you and your dependents
- Meaningful equity in the form of options - all employees are owners here
- Unlimited time off + 12 company holidays
- 401(k) match - Union.ai matches 100% of contributions up to the first 3%, and 50% up to 5%
- 16 weeks paid parental leave for primary and secondary caregivers
- Flexible work schedule (some restrictions apply)
We believe that our differences are what bring us together to achieve truly special outcomes. We strive to be inclusive and focus on building teams that embody that quality too. Union.ai is an equal-opportunity employer, and we encourage you to apply even if your experience doesn’t align exactly with our job description.
Compensation Range: $160K - $200K