EPAM Systems

Cloud AIOps Architect

Pune, MH, IN

10 days ago

Summary

We are seeking an experienced Cloud AIOps Architect to lead the design and implementation of advanced AI-driven operational systems across multi-cloud and hybrid cloud environments. This role demands a blend of technical expertise, innovation, and leadership to develop scalable solutions for complex IT systems with a focus on automation, machine learning, and operational efficiency.

Responsibilities


  • Architect and design the AIOps solution using AWS, Azure, and cloud-agnostic services, ensuring portability and scalability
  • Develop an end-to-end automated machine learning (ML) pipeline spanning data ingestion, DataOps, model training, and inference across multi-cloud environments
  • Design hybrid architectures leveraging cloud-native services like Amazon SageMaker, Azure Machine Learning, and Kubernetes for development, model deployment, and orchestration
  • Design and implement ChatOps integration, allowing users to interface with the platform through Slack, Microsoft Teams, or similar communication platforms
  • Use Jupyter Notebooks in Amazon SageMaker, Azure Machine Learning Studio, or cloud-agnostic environments to create model prototypes and experiment with datasets
  • Lead the design of classification models and other ML models using Amazon SageMaker training jobs, Azure ML training jobs, or open-source tools in a Kubernetes container
  • Implement automated rule management systems using Python in containers deployed to AWS ECS/EKS, Azure Kubernetes Service (AKS), or Kubernetes for cloud-agnostic solutions
  • Architect the integration of ChatOps backend services using Python containers running in AWS ECS/EKS, Azure Kubernetes Service (AKS), or Kubernetes for real-time interactions and updates
  • Oversee the continuous deployment and retraining of models based on updated data and feedback loops, ensuring models remain efficient and adaptive
  • Design platform-agnostic solutions to ensure that the system can be ported across different cloud environments or run in hybrid clouds (on-premises and cloud)


Requirements


  • 13+ years of overall experience, including 7+ years in AIOps, Cloud Architecture, or DevOps roles
  • Hands-on experience with AWS services such as SageMaker, S3, Glue, Kinesis, ECS, EKS
  • Strong experience with Azure services such as Azure Machine Learning, Blob Storage, Azure Event Hubs, and Azure Kubernetes Service (AKS)
  • Hands-on experience working on the design, development, and deployment of contact centre solutions at scale
  • Proficiency in container orchestration (e.g., Kubernetes) and experience with multi-cloud environments
  • Experience with machine learning model training, deployment, and data management across cloud-native and cloud-agnostic environments
  • Expertise in implementing ChatOps solutions on platforms like Microsoft Teams and Slack, and in integrating them with AIOps automation
  • Familiarity with data lake architectures, data pipelines, and inference pipelines using event-driven architectures
  • Strong programming skills in Python for rule management, automation, and integration with cloud services
  • Experience with Kafka, and with Azure DevOps and AWS DevOps tooling for CI/CD pipelines

