We are seeking an experienced Cloud AIOps Architect to lead the design and implementation of advanced AI-driven operational systems across multi-cloud and hybrid cloud environments. This role demands a blend of technical expertise, innovation, and leadership to develop scalable solutions for complex IT systems with a focus on automation, machine learning, and operational efficiency.
Responsibilities
Architect and design the AIOps solution using AWS, Azure, and cloud-agnostic services, ensuring portability and scalability
Develop an end-to-end automated machine learning (ML) pipeline, from data ingestion and DataOps through model training to inference, across multi-cloud environments
Design hybrid architectures leveraging cloud-native services like Amazon SageMaker, Azure Machine Learning, and Kubernetes for development, model deployment, and orchestration
Design and implement ChatOps integration, allowing users to interface with the platform through Slack, Microsoft Teams, or similar communication platforms
Leverage Jupyter Notebooks in AWS SageMaker, Azure Machine Learning Studio, or cloud-agnostic environments to create model prototypes and experiment with datasets
Lead the design of classification models and other ML models using AWS SageMaker training jobs, Azure ML training jobs, or open-source tools in a Kubernetes container
Implement automated rule management systems using Python in containers deployed to AWS ECS/EKS, Azure AKS, or Kubernetes for cloud-agnostic solutions
Architect the integration of ChatOps backend services using Python containers running in AWS ECS/EKS, Azure AKS, or Kubernetes for real-time interactions and updates
Oversee the continuous deployment and retraining of models based on updated data and feedback loops, ensuring models remain efficient and adaptive
Design platform-agnostic solutions to ensure that the system can be ported across different cloud environments or run in hybrid clouds (on-premises and cloud)
Requirements
13+ years of overall experience, including 7+ years in AIOps, Cloud Architecture, or DevOps roles
Hands-on experience with AWS services such as SageMaker, S3, Glue, Kinesis, ECS, EKS
Strong experience with Azure services such as Azure Machine Learning, Blob Storage, Event Hubs, and AKS
Hands-on experience working on the design, development, and deployment of contact centre solutions at scale
Proficiency in container orchestration (e.g., Kubernetes) and experience with multi-cloud environments
Experience with machine learning model training, deployment, and data management across cloud-native and cloud-agnostic environments
Expertise in implementing ChatOps solutions on platforms such as Microsoft Teams and Slack, and in integrating them with AIOps automation
Familiarity with data lake architectures, data pipelines, and inference pipelines using event-driven architectures
Strong programming skills in Python for rule management, automation, and integration with cloud services
Experience with Kafka, Azure DevOps, and AWS DevOps for CI/CD pipelines