Role Name: Senior SRE
Location: Ireland (Full Remote)
Type: Contract
Duration: 12 months
About the Role
• We are seeking a highly experienced Senior Site Reliability Engineer with a deep background in managing mission-critical, Unix-based systems in high-availability, low-latency gaming environments.
• The ideal candidate thrives on resilient service design, excels at automation, and has a strong track record of supporting services with millions of active users.
• As a key member of the SRE team, you will lead the design, reliability, and scalability of platforms supporting live gaming systems.
• You will architect and maintain systems across cloud and on-premises environments, contribute to observability and automation, and be a key part of the global on-call rotation to ensure service uptime.
Required Skills & Experience
• 7+ years in SRE, Infrastructure, or Systems Engineering roles supporting production environments.
• 7+ years of extensive experience with Unix/Linux systems, including Red Hat, Debian, Ubuntu, and CentOS.
• Strong debugging and optimization skills, including memory and performance analysis and network tracing.
• 6+ years working with AWS and/or GCP on large-scale distributed services.
• Strong programming skills in Python and shell scripting.
• Deep understanding of configuration management, CI/CD workflows, and GitOps practices.
• Expertise with Terraform, Ansible, or similar IaC tools.
• Experience supporting hybrid infrastructure (cloud/on-prem) with VMware, Kubernetes, or containers.
• Hands-on experience with observability tools (Datadog, Prometheus, Grafana) and instrumentation.
• Demonstrated ability to troubleshoot complex, cross-system reliability issues under pressure.