SITA

Lead, Site Reliability Engineer

Delhi, IN

3 days ago

Overview

WELCOME TO SITA

We're the team that keeps airports moving, airlines flying smoothly, and borders open. Our tech and communication innovations are the secret behind the success of the world's air travel industry.

You'll find us at 95% of international hubs. We partner closely with over 2,500 transportation and government clients, each with their own unique needs and challenges. Our goal is to find fresh solutions and cutting-edge tech to make their operations run like clockwork. Want to be a part of something big?

Are you ready to love your job? The adventure begins right here, with you, at SITA.

PURPOSE

Provides proactive product support to sustain high product performance and improve it continuously. Identifies and resolves the root causes of operational incidents, implementing solutions that improve stability and prevent recurrence. Manages the creation and maintenance of the event catalog used to trigger events, and develops both manual remediation approaches and automated workflows to resolve alerts. Oversees the deployment of IT services and solutions, ensuring successful integration with minimal disruption. Focuses on operational automation and integration to improve efficiency and collaboration between development and operations within service operations.

What Will You Do:

  • Define, build, and maintain support systems to ensure high availability and performance.
  • Handle complex cases for the Operations team.
  • Build events to add to the event catalog for the relevant product or application.
  • Implement automation for system provisioning, self-healing, auto recovery, deployment, and monitoring.
  • Perform incident response and root cause analysis for critical system failures.
  • Monitor system performance and establish service-level indicators (SLIs) and objectives (SLOs).
  • Collaborate with development and operations to integrate reliability best practices, including the move to a zero-downtime architecture.
  • Proactively identify and remediate performance issues.
  • Work closely with Product, Software & Infra Engineering, and Service support architects to productize new products.
  • Ensure Operations readiness to support new products.
  • Coordinate with internal and external stakeholders to gather feedback for continual service improvement of in-scope products, and drive the resulting plan to successful closure.
  • Take accountability for the high availability and performance of the in-scope product.


Problem Management

  • Conduct thorough problem investigations and root cause analyses (RCA) to diagnose recurring incidents and service disruptions.
  • Coordinate with incident management teams and operations experts, and collaborate with Service Operations and Engineering teams to develop and implement permanent solutions.
  • Monitor the effectiveness of problem resolution activities, provide regular reports on problem management activities, and ensure continuous improvement.


Event Management

  • Define and maintain an event catalog, specifying active events, thresholds, and relevant remediation, and optimize it for efficiency.
  • Develop event response protocols, provide training to teams, and ensure quick and efficient handling of incidents.
  • Collaborate with stakeholders to define events, ensure coverage across Service Operations, and drive improvements based on post-event reviews and feedback.


Deployment Management

  • Own the quality of new release deployments for Service Operations, ensuring that a clear process is defined and responsibilities are assigned for smooth implementation.
  • Develop and maintain deployment schedules, conduct operational readiness assessments, and manage deployment risk assessments to ensure service stability.
  • Oversee the execution of deployment plans, coordinate resources and processes with delivery and lifecycle engineering, keep stakeholders informed, and work with them continuously to improve deployment processes based on their feedback.


DevOps/NetOps Management

  • Manage continuous integration and deployment (CI/CD) pipelines, ensuring smooth integration between development and operational teams.
  • Automate operational processes, monitor system performance, and resolve issues related to automation scripts to increase efficiency.
  • Implement and manage infrastructure as code, provide ongoing support for automation tools, and continuously improve DevOps practices.


Qualifications

Educational Background

  • Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field.
  • Advanced degree (Master’s or equivalent) is often preferred for senior positions.


Certifications

  • Relevant certifications such as Linux administration or Certified Kubernetes Administrator (CKA).
  • Certifications in cloud platforms (AWS, Azure, Google Cloud) or DevOps methodologies (e.g., Certified DevOps Professional).


Experience

  • 8+ years of experience in IT operations, service management, or infrastructure management, including roles such as Site Reliability Engineer or DevOps lead.
  • Proven experience in managing high-availability systems and ensuring operational reliability.
  • Extensive experience in root cause analysis (RCA), incident management, and developing permanent solutions for recurring service disruptions.
  • Hands-on experience with CI/CD pipelines, automation, system performance monitoring, and the implementation of infrastructure as code.
  • Strong background in collaborating with cross-functional teams (development, operations, engineering, etc.) to improve operational processes and service delivery.
  • Experience in managing deployments, risk assessments, and optimizing event and problem management processes.
  • Familiarity with cloud technologies, containerization, and scalable architecture, including experience with zero-downtime deployment strategies.


WHAT WE OFFER:

SITA’s workplace is all about diversity: many different countries and cultures are represented in our workforce. We collaborate in our impressive offices, embracing a hybrid work format. As part of our global benefits, we offer:

🏡 Flex Week: Work from home up to 2 days/week (depending on your team's needs).

⏰ Flex Day: You may wish to flex your arrival time at the office to beat rush hours or leave earlier for personal commitments. We encourage open communication with your manager about your needs and routine.

🌎 Flex-Location: Work from anywhere in the world for up to 30 workdays!

🌿 Employee Wellbeing: Benefit from the Employee Assistance Program (EAP) provided by SITA, a yearly free service offering practical advice in various aspects of your life.

🚀 Professional Development: Enhance your skills with our training platforms, inclusive of LinkedIn Learning!

🙌🏽 Competitive Benefits: Access competitive benefits tailored to the local market and your employment status.

SITA is an Equal Opportunity Employer and values a diverse workforce. In support of our Employment Equity Program, women, aboriginal people, members of visible minorities, and/or persons with disabilities are encouraged to apply and self-identify in the application process.
