US3 Consulting

Site Reliability Engineer

Netherlands

19 days ago

Summary

Location: Eindhoven, Netherlands (Hybrid work mode)


Role Description:

As a Developer with a focus on Site Reliability Engineering (SRE), you will play a pivotal role in ensuring the availability, performance, and scalability of critical systems and services. You will work closely with developers and operations teams to improve system reliability through automation, observability, and robust infrastructure practices.


Core Responsibilities:

System Reliability & Uptime

  • Design and implement strategies for high availability and system performance.
  • Define and monitor SLOs (Service Level Objectives), SLIs (Service Level Indicators), and Error Budgets.

Incident Management & Troubleshooting

  • Respond to outages and lead incident resolution efforts.
  • Drive blameless post-mortems and implement preventive measures.
  • Develop runbooks and automate recovery processes.
  • Participate in on-call rotation.

Infrastructure as Code (IaC)

  • Build and manage infrastructure using Terraform or similar tools.
  • Ensure infrastructure is reproducible, version-controlled, and auditable.

Monitoring & Observability

  • Implement and maintain monitoring tools (preferably Splunk).
  • Set up alerts and dashboards to monitor service health and performance.

Automation & Tooling

  • Automate deployments, scaling, failovers, and backups.
  • Develop internal tools to support CI/CD pipelines and team workflows.

Collaboration

  • Work closely with development and operations teams to design scalable, supportable systems.
  • Promote CI/CD best practices, testing strategies, and release automation.


Essential Skills:

  • SRE Concepts: Reliability, availability, performance optimization.
  • Infrastructure as Code: Terraform or similar.
  • Monitoring/Logging: Splunk or equivalent observability stacks.
  • Incident Response: On-call support, post-mortems, automation of recovery.


Desirable Skills:

Programming & Scripting

  • Languages: Python, Bash, or Ruby.
  • Ability to build tools, automate tasks, and debug production issues.

Cloud Platforms

  • Proficiency in GCP and/or Azure.
  • Experience with cloud-native services, networking, and security.

Systems & Platforms

  • Strong knowledge of Linux/Unix systems, and preferably Windows.
  • Expertise in system internals, performance tuning, and debugging.

Containers & Orchestration

  • Hands-on experience with Docker, Kubernetes, or equivalent platforms.

CI/CD & Automation

  • Familiarity with Jenkins, GitHub Actions, ArgoCD, or similar.
  • Experience building and managing deployment pipelines.

Security & Compliance

  • Knowledge of access control, secrets management, and audit logging.


Soft Skills:

  • Excellent communication and collaboration skills.
  • Enjoys mentoring junior members.
  • Stays calm under pressure, especially during incidents.
  • Strong analytical and problem-solving mindset.
