Open Assessment Technologies S.A.

Site Reliability Engineer

Spain

13 days ago

Summary

Job purpose


As a DevOps Site Reliability Engineer (SRE), you will be responsible for ensuring the reliability, scalability, and performance of our systems. You will bridge the gap between development and operations by applying software engineering principles to infrastructure and operations problems. Your role will focus on automation, incident response, monitoring, capacity planning, and improving system resilience while supporting production workloads on Google Cloud Platform (GCP).



Duties and responsibilities


  • Design, implement, and maintain highly available, scalable, and resilient cloud-based infrastructure using Google Cloud Platform (GCP).
  • Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Service Level Agreements (SLAs).
  • Conduct capacity planning, performance tuning, and load testing to optimize system performance.
  • Develop chaos engineering practices to identify and mitigate failure scenarios.
  • Develop and maintain Infrastructure as Code (IaC) using Terraform, Ansible, or equivalent tools.
  • Automate system provisioning, configuration management, and deployments using CI/CD pipelines and GitOps workflows (ArgoCD, GitHub Actions).
  • Improve auto-healing and self-recovery capabilities in production environments.
  • Monitor system health and performance using Google Cloud Operations Suite (formerly Stackdriver), Prometheus, Dynatrace, Grafana, and Datadog.
  • Participate in the on-call rotation, and troubleshoot and resolve production incidents using root cause analysis (RCA).
  • Implement postmortem processes and drive corrective actions to prevent recurrence.
  • Implement and enforce security best practices, ensuring compliance with ISO 27001, SOC 2, and GDPR.
  • Apply IAM (Identity & Access Management) best practices for secure cloud operations.
  • Manage network security, including firewalls, VPNs, and service mesh (e.g., Istio).
  • Work closely with development, security, and operations teams to improve deployment strategies.
  • Advocate for blameless postmortems, knowledge sharing, and documentation improvements.
  • Lead the adoption of SRE best practices, including error budgeting and toil reduction.


Qualifications and skills


  • 3+ years of experience in a DevOps, SRE, or Cloud Engineering role.
  • Strong expertise in Google Cloud Platform (GCP) services, including GKE, Cloud Run, Cloud Functions, Cloud SQL, BigQuery, and Pub/Sub.
  • Experience with Kubernetes (GKE) and container orchestration.
  • Proficiency in Terraform, Helm, and Kubernetes operators for infrastructure automation.
  • Strong scripting and automation skills in Python, Bash, or Go.
  • Experience with monitoring, logging, and tracing tools (e.g., Google Cloud Operations Suite, Prometheus, OpenTelemetry).
  • Strong understanding of CI/CD pipelines using tools like ArgoCD, Jenkins, or GitHub Actions.
  • Knowledge of GitOps methodologies and IaC best practices.
  • Strong experience with PostgreSQL, Redis, and NoSQL databases.
  • Strong problem-solving and critical-thinking skills.
  • Ability to work collaboratively in a fast-paced environment.
  • Strong communication and documentation skills.
  • Ability to manage incidents under pressure and work on-call as needed.
  • Experience with multi-cloud (AWS/GCP) and hybrid environments.
  • Knowledge of site reliability engineering principles (Google SRE).
  • Understanding of security best practices for cloud-native applications.
  • Google Cloud Certification (Professional Cloud DevOps Engineer, Professional Cloud Architect) is a plus.


Benefits


  • International environment
  • Growth opportunities and certifications
  • Company-wide events (Summer & Winter)
  • Flextime work scheme
  • Work from home


Preferred locations: Spain and Luxembourg


Note: By applying for this job, you accept our internal hiring policies, which include a background check that takes place at the end of the selection process with OAT. You also acknowledge that you have read our Data Privacy policy, which explains how we treat your personal data. You can read it here: https://www.taotesting.com/about/privacy/
