Ascendion

Site Reliability Engineer

McLean, VA, US

$150k–$160k/year
5 days ago

Summary

About Ascendion


Ascendion is a full-service digital engineering solutions company. We make and manage software platforms and products that power growth and deliver captivating experiences to consumers and employees. Our engineering, cloud, data, experience design, and talent solution capabilities accelerate transformation and impact for enterprise clients. Headquartered in New Jersey, our workforce of 6,000+ Ascenders delivers solutions from around the globe. Ascendion is built differently to engineer the next.


Ascendion | Engineering to elevate life


We have a culture built on opportunity, inclusion, and a spirit of partnership. Come, change the world with us:

  • Build the coolest tech for the world’s leading brands
  • Solve complex problems – and learn new skills
  • Experience the power of transforming digital engineering for Fortune 500 clients
  • Master your craft with leading training programs and hands-on experience

Experience a community of change makers!

Join a culture of high-performing innovators with endless ideas and a passion for tech. Our culture is the fabric of our company, and it is what makes us unique and diverse. The way we share ideas, learning, experiences, successes, and joy allows everyone to be their best at Ascendion.


*** About the Role ***


Job Title: Site Reliability Engineer


Key Responsibilities:


  • Design and build comprehensive SRE dashboards to monitor system health and business metrics.
  • Define and enforce SLA/SLO/SLI metrics to measure service reliability.
  • Develop observability and alerting solutions using tools like Prometheus, Grafana, CloudWatch, or similar.
  • Architect and implement active-passive architectures and multi-region failover strategies on AWS.
  • Drive end-to-end automation and CI/CD pipeline improvements with DevOps tools (e.g., Jenkins, GitLab, Terraform).
  • Build and deploy microservices-based applications with resilience and scalability in mind.
  • Collaborate with developers, DevOps, and infrastructure teams to troubleshoot and resolve complex production issues.
  • Lead efforts in incident response, postmortem analysis, and root cause mitigation.
  • Share hands-on experience and lessons learned from challenges in building, not just designing, complex systems.


Required Qualifications:


  • 5+ years of hands-on experience in Site Reliability Engineering.
  • Strong knowledge of SRE principles and practices.
  • Proficiency with AWS services, especially VPC, EC2, ECS/EKS, RDS, S3, Route53, and Lambda.
  • Deep experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation.
  • Solid programming/scripting skills in Python.
  • Experience designing and deploying highly available microservices architectures.
  • Working knowledge of containerization (Docker) and orchestration (Kubernetes preferred).
  • Experience building CI/CD pipelines and integrating with Git-based workflows.
  • Proven ability to architect systems for failure and resilience, especially in multi-region deployments.


Preferred Qualifications:


  • Experience building platforms from the ground up or modernizing legacy infrastructure.
  • Strong understanding of distributed systems and cloud-native patterns.
  • Ability to discuss past challenges in scaling and hardening systems, with specific examples.
  • Familiarity with service meshes, API gateways, and zero-trust architectures.
  • Experience in financial services or regulated environments is a plus.


Location: McLean, VA (hybrid role; requires 3 days per week in the office).


Salary Range: The salary for this position is between $150,000 and $160,000 annually. Factors that may affect pay within this range include geography/market, skills, education, experience, and other qualifications of the successful candidate.


Benefits: The Company offers the following benefits for this position, subject to applicable eligibility requirements:

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • 401(k) retirement plan
  • Long-term disability insurance
  • Short-term disability insurance
  • 5 personal days accrued each calendar year. The paid time off benefits meet the paid sick and safe time laws that pertain to the city/state.
  • 10-15 days of paid vacation time
  • 6 paid holidays and 1 floating holiday per calendar year
  • Ascendion Learning Management System


Want to change the world? Let us know.

Tell us about your experiences, education, and ambitions. Bring your knowledge, unique viewpoint, and creativity to the table. Let’s talk!
