HighLevel

Site Reliability Engineer

Delhi, IN

28 days ago

Summary

We are looking for a Site Reliability Engineer to join our team and help ensure the availability, performance, and scalability of our critical systems. You will work closely with development and operations teams to automate processes, enhance system reliability, and improve observability.

Requirements

Experience: 4+ years in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles
Cloud Expertise: Hands-on experience with GCP and AWS
Infrastructure as Code (IaC): Terraform, Helm, or equivalent tools
Containerisation & Orchestration: Docker, Kubernetes (GKE)
Observability: Experience with Prometheus, Grafana, ELK, OpenTelemetry, or similar monitoring/logging tools
Programming/Scripting: Proficiency in Python, Bash, or other shell scripting; basic understanding of parsing API responses and manipulating JSON
CI/CD Pipelines: Hands-on experience with Jenkins, GitHub Actions, ArgoCD, or similar tools
Incident Management: Experience with on-call rotations, SLOs, SLIs, SLAs, Escalation Policies, and incident resolution
Databases: Experience monitoring MongoDB, Redis, Elasticsearch (ES), queue-based systems, etc.

Responsibilities

Develop and improve observability using monitoring, logging, tracing, and alerting tools (Prometheus, Grafana, ELK, OpenTelemetry, etc.)
Optimise system performance, troubleshoot incidents, and conduct post-mortems/RCA to prevent recurrence
Collaborate with developers to enhance application reliability, scalability, and performance
Drive cost optimisation efforts in cloud environments
Monitor multiple databases (MongoDB, Redis, Elasticsearch (ES), queue-based systems, etc.)

Skills: Prometheus, Grafana, ELK, Kubernetes, Terraform, Docker, Amazon Web Services (AWS), Google Cloud Platform (GCP), Python, Bash and Shell Scripting
