Tredence Inc.

Site Reliability Engineer

Pune, MH, IN

about 2 months ago

Summary

Role: Cloud Site Reliability Engineer (SRE)

Experience: 2-4 years

Location: Pune / Chennai


Key Responsibilities:

  • Design, implement, and maintain highly available and scalable infrastructure on Azure and GCP.
  • Develop and deploy comprehensive observability, monitoring, and incident response systems.
  • Automate infrastructure management, scaling, and deployment processes using Infrastructure-as-Code (IaC) tools such as Terraform and Azure Resource Manager (ARM) templates.
  • Collaborate with development teams to design resilient deployment architectures and ensure production readiness.
  • Implement proactive performance monitoring and capacity planning strategies.
  • Develop automated recovery and self-healing mechanisms for cloud infrastructure.
  • Establish and enforce best practices for SRE and cloud infrastructure management.
  • Ensure compliance, security, and governance standards across cloud environments.

Required Skills:

  • Expertise in observability tools like Prometheus, Grafana, and Datadog.
  • Knowledge of cloud services on Azure and GCP.
  • Hands-on experience with CI/CD tools and deployment automation.
  • Solid understanding of cloud networking, security, and resource management.
  • Strong scripting skills in Python, Bash, or PowerShell.
  • Excellent troubleshooting, problem-solving, and communication skills.

Preferred Qualifications:

  • SRE certifications or relevant cloud certifications, particularly for Azure and GCP.
  • Experience with multi-tenant deployments and high-scale environments.
  • Familiarity with hybrid cloud and complex deployment scenarios.
