Epsilon EMEA

Lead Site Reliability Engineer

Bengaluru, KA, IN

28 days ago
Job Description

About Business Unit

At the core of all that Epsilon does is a team that sets the foundation of our IT infrastructure. The team drives innovation and efficiency through disruptive technology across Epsilon's platforms and business verticals. From being the first point of contact for infrastructure needs to final deployment, the team provides end-to-end solutions for our client-facing platforms. ETS supports all aspects of revenue-generating platforms for Epsilon and sets the architectural direction for our enterprise deployments. By embracing the latest technologies, such as Cloud, Automation, and Artificial Intelligence, the team is at the forefront of transforming our digital business and capturing new opportunities.

We are seeking a skilled **Site Reliability Engineer (SRE)** to support and optimize our 11,000+ on-premise servers along with cloud infrastructure. The ideal candidate will have expertise in **Linux, Windows, AWS, and Kubernetes**, with strong scripting skills in Python and Shell.

Responsibilities

  • Manage, monitor, and troubleshoot **11,000+ on-premise servers** (Linux & Windows).
  • Support and optimize **AWS cloud infrastructure** (EC2, S3, RDS, Lambda, etc.).
  • Administer and scale **Kubernetes clusters** for containerized workloads.
  • Automate deployments, scaling, and recovery processes using **Python and Shell scripting**.
  • Ensure high availability, performance, and security of production systems.
  • Collaborate with DevOps and development teams to improve CI/CD pipelines.
  • Implement **Infrastructure as Code (IaC)** using Terraform, Ansible, or CloudFormation.
  • Troubleshoot OS-level (Linux/Windows) and network-related issues.
  • Develop and maintain monitoring, logging, and alerting solutions (Grafana, PagerDuty, etc.).
  • Participate in **on-call rotations** for incident response and resolution.
  • Conduct root cause analysis (RCA) for critical outages and performance bottlenecks.
  • Optimize system performance through capacity planning and load testing.
  • Document infrastructure, processes, and automation workflows.
  • Stay updated with emerging cloud and SRE best practices.

Qualifications

  • **5+ years** of experience in **SRE, DevOps, or Systems Engineering**.
  • Strong expertise in **Linux & Windows server administration**.
  • Hands-on experience with **AWS services** and **Kubernetes**.
  • Proficiency in **Python and Shell scripting** for automation.
  • Familiarity with **monitoring tools** (Zabbix, PagerDuty, Grafana).

Additional Information

Epsilon is a global data, technology and services company that powers the marketing and advertising ecosystem. For decades, we’ve provided marketers from the world’s leading brands the data, technology and services they need to engage consumers with 1 View, 1 Vision and 1 Voice. 1 View of their universe of potential buyers. 1 Vision for engaging each individual. And 1 Voice to harmonize engagement across paid, owned and earned channels.

Epsilon’s comprehensive portfolio of capabilities across our suite of digital media, messaging and loyalty solutions bridges the divide between marketing and advertising technology. We process 400+ billion consumer actions each day using advanced AI and hold many patents of proprietary technology, including real-time modeling languages and consumer privacy advancements. Thanks to the work of every employee, Epsilon has been consistently recognized as industry-leading by Forrester, Adweek and the MRC. Epsilon is a global company with more than 9,000 employees around the world.

Epsilon has a core set of 5 values that define our culture and guide us to create value for our clients, our people and consumers. We are seeking candidates that align with our company values, demonstrate them and make them meaningful in their day-to-day work:

  • Act with integrity. We are transparent and have the courage to do the right thing.
  • Work together to win together. We believe collaboration is the catalyst that unlocks our full potential.
  • Innovate with purpose. We shape the market with big ideas that drive big outcomes.
  • Respect all voices. We embrace differences and foster a culture of connection and belonging.
  • Empower with accountability. We trust each other to own and deliver on common goals.

Because You Matter

YOUniverse. A work-world with you at the heart of it!

At Epsilon, we believe people make the place. And everything we do is designed with you in mind. That’s why our work-world, aptly named ‘YOUniverse’ is focused on creating a nurturing environment that elevates your growth, wellbeing and work-life harmony. So, come be part of a people-centric workspace where care for you is at the core of all we do.

Epsilon is an Equal Opportunity Employer.

Epsilon is committed to promoting diversity, inclusion, and equal employment opportunities by using reasonable efforts to attract, recruit, engage and retain qualified individuals of all ethnicities and backgrounds, including, but not limited to, women, people of color, LGBTQ individuals, people with disabilities and any other underrepresented groups, traits or characteristics.
