DATAECONOMY

AWS DevOps Engineer/SRE

Hyderabad, TS, IN

about 1 month ago

Summary

About Us

About DATAECONOMY: We are a fast-growing data & analytics company headquartered in Dublin with offices in Dublin, OH, Providence, RI, and an advanced technology center in Hyderabad, India. We are clearly differentiated in the data & analytics space via our suite of solutions, accelerators, frameworks, and thought leadership.

Job Description

Job Summary:

We are seeking an experienced Observability Engineer with a strong DevOps background to design, implement, and manage observability solutions across cloud and on-prem environments. The ideal candidate will have expertise in monitoring, logging, tracing, and alerting to ensure high system availability, performance, and reliability.

Key Responsibilities

  • Design & Implement Observability Solutions: Develop and maintain monitoring, logging, and tracing solutions using industry-leading tools (Prometheus, Grafana, Datadog, New Relic, Splunk, etc.).
  • Performance Monitoring & Optimization: Ensure proactive identification and resolution of performance bottlenecks in distributed systems.
  • Logging & Tracing: Set up and manage centralized logging solutions (ELK/EFK stack, Fluentd, OpenTelemetry).
  • Alerting & Incident Management: Configure alerting mechanisms using tools like PagerDuty, Opsgenie, or VictorOps for proactive issue detection.
  • SRE Practices: Implement Site Reliability Engineering (SRE) principles to enhance system reliability and reduce MTTR (Mean Time to Resolution).
  • Automation & Infrastructure as Code (IaC): Automate observability setup and configurations using Terraform, Ansible, or similar tools.
  • Cloud & Kubernetes Monitoring: Implement observability best practices for cloud platforms (AWS, Azure, GCP) and containerized environments (Kubernetes, Docker).
  • Collaboration: Work closely with development, SRE, and operations teams to ensure end-to-end observability of applications and services.
  • Compliance & Security: Ensure logging and monitoring solutions adhere to security and compliance requirements.

Requirements

Required Skills & Qualifications:

  • 6-10 years of experience in DevOps, SRE, or Observability engineering.
  • Strong hands-on experience with observability tools like Prometheus, Grafana, New Relic, Datadog, Splunk, ELK/EFK, OpenTelemetry, AppDynamics, etc.
  • Experience in setting up distributed tracing solutions (Jaeger, Zipkin, OpenTelemetry).
  • Expertise in Kubernetes monitoring using Prometheus, Thanos, Loki, or similar tools.
  • Strong proficiency in scripting (Python, Bash, Shell) for automation.
  • Hands-on experience with Terraform, Ansible, Helm, or CloudFormation for infrastructure automation.
  • Proficiency in CI/CD pipelines and GitOps methodologies using Jenkins, GitLab CI, ArgoCD, or Flux.
  • Experience in public cloud environments (AWS, Azure, GCP) and monitoring cloud-native services.
  • Strong troubleshooting and root cause analysis (RCA) skills.
  • Understanding of SLIs, SLOs, and error budgets as part of SRE best practices.
  • Familiarity with log management, anomaly detection, and AI-based observability solutions is a plus.

Benefits

As per company standards.
