hackajob is collaborating with Comcast to connect them with exceptional tech professionals for this role.
Comcast’s Observability Program provides critical platform services across the company, ensuring teams can monitor, analyze, and optimize application performance. As an Engineer 3, you will play a key role in designing, developing, and maintaining observability platforms, including MetriX, OCP, ADE, Tracing, Grafana, and Elasticsearch.
This position requires a high level of technical expertise, problem-solving skills, and experience working in large-scale, high-traffic distributed systems. You will collaborate with teams across Comcast to enhance logging, metrics, tracing, visualization, and alerting capabilities while leading efforts to improve automation, scalability, and reliability.
Job Description
This team has members in Reston, VA; West Chester, PA; and our downtown Philadelphia offices, and we will equally consider candidates in these locations. This role is not open to remote/virtual hires.
We are unable to provide sponsorship for this role now or in the future.
Core Responsibilities
Software Development & Platform Engineering
Design, develop, and maintain observability platforms using Python and Golang.
Build and enhance React-based frontends for observability dashboards and self-service tools.
Implement RESTful APIs and microservices.
Develop and manage infrastructure using Kubernetes and Docker.
Automate observability tooling and deployment processes using Helm, Terraform, and Ansible.
Observability & System Monitoring
Architect scalable and resilient observability solutions for logging, metrics, and tracing.
Optimize and scale Elasticsearch, Prometheus/VictoriaMetrics, and Grafana for high-volume data ingestion and querying.
Integrate OpenTelemetry for distributed tracing and RED metric generation.
Improve real-time anomaly detection pipelines with ADE (Anomaly Detection Exporter).
Operational Excellence & Support
Provide technical leadership in incident response, troubleshooting, and resolution.
Collaborate with internal teams to improve observability best practices and operational insights.
Participate in an on-call rotation for platform support, including after-hours and weekends.
Enhance CI/CD automation to streamline deployments and reduce manual intervention.
Mentorship & Leadership
Lead technical discussions and provide guidance to junior engineers.
Advocate for best practices in observability, software development, and automation.
Contribute to technical documentation and internal knowledge-sharing initiatives.
Required Skills & Experience
Python (4+ years) and JavaScript (React) (3+ years)
Experience with Kubernetes, Docker, and container orchestration (3+ years)
Expertise in REST API development (3+ years)
Experience working with Linux environments and shell scripting (Bash, Python, or similar)
Strong understanding of CI/CD pipelines and GitOps workflows
Familiarity with Helm, Terraform, Ansible, and Packer
Experience troubleshooting large-scale distributed systems
Excellent communication, collaboration, and problem-solving skills
Preferred/Nice-to-have
Experience with Golang
Knowledge of Microsoft Graph API
Background in enterprise-scale observability systems
Experience with multi-cloud observability architectures (AWS, Azure, GCP)