A startup based in Arlington, VA, is seeking a Site Reliability Engineer to support the deployment, performance, and resilience of a cutting-edge platform within a secure, high-stakes environment. This role is ideal for someone who thrives in operationally sensitive settings, is deeply technical, and wants to make a real-world impact supporting national security missions.
What You’ll Do:
Maintain and scale production infrastructure within a secure on-prem environment
Automate deployments, monitoring, and maintenance to support high availability and performance
Debug complex infrastructure and application issues under tight operational constraints
Collaborate with software engineers to improve reliability, observability, and platform performance
Monitor system health, develop runbooks, and ensure disaster recovery and backup processes are in place
Work hands-on with classified systems, ensuring compliance with all security requirements
Who You Are:
Experienced SRE, DevOps Engineer, or Infrastructure Engineer with hands-on systems and operations experience
Strong background with Linux systems, containers (Docker, Kubernetes), and scripting (Python, Bash, etc.)
Familiarity with on-prem deployments and air-gapped environments
Skilled in monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK)
Active TS/SCI clearance
Comfortable working full-time, on-site at Fort Meade
Preferred Qualifications:
Familiarity with CI/CD pipelines, infrastructure-as-code (Terraform, Ansible), and security hardening
Understanding of network protocols and secure system architecture
Posted By: Patrick Fuller