Groww

Site Reliability Engineering Manager

Bengaluru, KA, IN

8 days ago

Summary

About Groww


We are a passionate group of people focused on making financial services accessible to every Indian through a multi-product platform. Each day, we help millions of customers take charge of their financial journey. Customer obsession is in our DNA. Every product, every design, every algorithm down to the tiniest detail is executed keeping the customers’ needs and convenience in mind. Our people are our greatest strength. Everyone at Groww is driven by ownership, customer-centricity, integrity and the passion to constantly challenge the status quo.

Are you as passionate about defying conventions and creating something extraordinary as we are? Let’s chat.


Our Vision


Every individual deserves the knowledge, tools, and confidence to make informed financial decisions. At Groww, we are making sure every Indian feels empowered to do so through a cutting-edge multi-product platform offering a variety of financial services. Our long-term vision is to become the trusted financial partner for millions of Indians.


Our Values


Our culture enables us to be what we are — India’s fastest-growing financial services company. Everyone at Groww enjoys the autonomy and flexibility to bring their best work to the table, as well as craft a promising career for themselves.

The values that form our foundation are:

  • Radical customer-centricity
  • Ownership-driven culture
  • Keeping everything simple
  • Long-term thinking
  • Complete transparency


EXPERTISE AND QUALIFICATIONS


  • Collaborate with development teams to ensure the architecture and applications are designed with scalability, reliability and cost in mind.
  • Develop and maintain monitoring, alerting, and logging solutions to proactively identify and address performance issues and outages.
  • Orchestrate and own on-call rotations, responding to incidents, conducting post-incident reviews, and contributing to incident response improvements.
  • Analyze system performance data, identify bottlenecks, and recommend solutions to optimize performance and resource utilization.
  • Contribute to the design and implementation of disaster recovery strategies and backup solutions.
  • Stay current with industry trends, emerging technologies, and best practices to drive innovation and improvements in system reliability.
  • Plan and execute patching and upgrades for the PaaS and IaaS components.
  • Regularly connect with stakeholders to align on their infrastructure requirements.
  • Manage a team of Site Reliability Engineers who work closely with our other Engineering teams to provide consistency in monitoring, process, and deliverability.
  • Plan, prioritize, track, and deliver on internal and external projects, tasks, and goals.
  • Champion and advocate for your team across the company.
  • Build trust by communicating transparently and honestly with all members of the organization.


Requirements


  • Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
  • Bachelor's degree in Computer Science, Engineering, or related field (or equivalent practical experience).
  • 8+ years of experience in Site Reliability Engineering or a similar role, including at least two years in a leadership position, with a proven track record of managing complex systems in a production environment.
  • Proficiency in programming/scripting languages such as Java, Python, or similar.
  • Expertise in cloud infrastructure solutions such as Microsoft Azure, Google Cloud, or AWS.
  • Experience with multiple data stores (MySQL, MongoDB, Cassandra, Elasticsearch).
  • Experience in designing highly efficient in-house observability platforms using open-source tools like Thanos, Prometheus, Datalog, and Grafana.
  • Solid knowledge of networking concepts, including load balancing, DNS, routing, and security.
  • Strong problem-solving skills and the ability to troubleshoot complex issues under pressure.
  • Excellent communication and collaboration skills to work effectively across teams.
  • Experience with CI/CD pipelines and version control systems (e.g., Jenkins, GitHub Actions).
  • A passion for reliability, demonstrated through participation in architectural design.
