Dharmakit Networks

GPU Infrastructure Engineer

Vadodara, GJ, IN

2 days ago

Summary

Company Overview

Dharmakit Networks is a premium global IT solutions partner dedicated to innovation and success worldwide. We specialize in website development, SaaS, digital marketing, AI solutions, and more, helping brands turn their ideas into high-impact digital products. We're known for blending global standards with deep Indian insight. And now, we're stepping into our most exciting chapter yet.

Project Ax1 is our next-generation Large Language Model (LLM), a powerful AI initiative designed to make intelligence accessible and impactful for Bharat and the world. Built by a team of AI experts, it reflects Dharmakit Networks' commitment to developing cost-effective, high-performance AI tailored for India and beyond, enabling enterprises to unlock new opportunities and build deeper connections. Join us in reshaping the future of AI, starting from India.

Role Overview

As a GPU Infrastructure Engineer, you'll be at the core of building, optimizing, and scaling the GPU and AI compute infrastructure that powers Project Ax1. From model pretraining to real-time inference, your work will ensure our AI systems run fast, stay stable, and scale globally. You'll manage cloud and on-prem clusters, set up model CI/CD pipelines, and help us get the most out of every GPU.

Key Responsibilities

  • Design, deploy, and optimize GPU infrastructure for large-scale AI workloads.
  • Manage GPU clusters across cloud (AWS, Azure, GCP) and on-prem setups.
  • Set up and maintain model CI/CD pipelines for efficient training and deployment.
  • Optimize LLM inference using TensorRT, ONNX, Nvidia NVCF, etc.
  • Manage offline/edge deployments of AI models (e.g., CUDA, Lambda, containerized AI).
  • Build and tune data pipelines to support real-time and batch processing.
  • Monitor model and infra performance for availability, latency, and cost efficiency.
  • Implement logging, monitoring, and alerting using Prometheus, Grafana, ELK, CloudWatch.
  • Work closely with AI, ML, backend, and full-stack teams to ensure seamless model delivery.

Must-Have Skills And Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or related field.
  • Hands-on with Nvidia GPUs, CUDA, and deep learning model deployment.
  • Strong experience with AWS, Azure, or GCP GPU instance setup and scaling.
  • Proficiency in model CI/CD and automated ML workflows.
  • Experience with Terraform, Kubernetes, and Docker.
  • Familiarity with offline/edge AI, including quantization and optimization.
  • Experience with logging and monitoring tools such as Prometheus, Grafana, and CloudWatch.
  • Experience with backend APIs, data processing workflows, and ML pipelines.
  • Experience with Git, collaboration in agile, cross-functional teams.
  • Strong analytical and debugging skills.
  • Excellent communication, teamwork, and problem-solving abilities.

Good To Have

  • Experience with Nvidia NVCF, DeepSpeed, vLLM, Hugging Face Triton.
  • Knowledge of FP16/INT8 quantization, pruning, and other optimization tricks.
  • Exposure to serverless AI inference (Lambda, SageMaker, Azure ML).

  • Contributions to open-source AI infrastructure projects or a strong GitHub portfolio showcasing ML model deployment expertise.

