hackajob

Python Full stack Engineer

Reston, VA, US

13 days ago

Summary

hackajob is collaborating with Comcast to connect them with exceptional tech professionals for this role.

Comcast’s Observability Program provides critical platform services across the company, ensuring teams can monitor, analyze, and optimize application performance. As an Engineer 3, you will play a key role in designing, developing, and maintaining observability platforms, including MetriX, OCP, ADE, Tracing, Grafana, and Elasticsearch.

This position requires a high level of technical expertise, problem-solving skills, and experience working in large-scale, high-traffic distributed systems. You will collaborate with teams across Comcast to enhance logging, metrics, tracing, visualization, and alerting capabilities while leading efforts to improve automation, scalability, and reliability.

Job Description

This team has members in Reston, VA; West Chester, PA; and our downtown Philadelphia offices, and we will equally consider candidates in these locations. It is not open for remote/virtual hire.

We are unable to provide sponsorship for this role now or in the future.

Core Responsibilities

  • Software Development & Platform Engineering
    • Design, develop, and maintain observability platforms using Python and Golang.
    • Build and enhance React-based frontends for observability dashboards and self-service tools.
    • Implement RESTful APIs and microservices.
    • Develop and manage infrastructure using Kubernetes and Docker.
    • Automate observability tooling and deployment processes using Helm, Terraform, and Ansible.
  • Observability & System Monitoring
    • Architect scalable and resilient observability solutions for logging, metrics, and tracing.
    • Optimize and scale Elasticsearch, Prometheus/VictoriaMetrics, and Grafana for high-volume data ingest and queries.
    • Integrate OpenTelemetry for distributed tracing and RED metric generation.
    • Improve real-time anomaly detection pipelines with ADE (Anomaly Detection Exporter).
  • Operational Excellence & Support
    • Provide technical leadership in incident response, troubleshooting, and resolution.
    • Collaborate with internal teams to improve observability best practices and operational insights.
    • Participate in an on-call rotation for platform support, including after-hours and weekends.
    • Enhance CI/CD automation to streamline deployments and reduce manual intervention.
  • Mentorship & Leadership
    • Lead technical discussions and provide guidance to junior engineers.
    • Advocate for best practices in observability, software development, and automation.
    • Contribute to technical documentation and internal knowledge-sharing initiatives.

Required Skills & Experience

  • Python (4+ years) and JavaScript (React) (3+ years)
  • Experience with Kubernetes, Docker, and container orchestration (3+ years)
  • Expertise in REST API development (3+ years)
  • Experience working with Linux environments and shell scripting (Bash, Python, or similar)
  • Strong understanding of CI/CD pipelines and GitOps workflows
  • Familiarity with Helm, Terraform, Ansible, and Packer
  • Experience troubleshooting large-scale distributed systems
  • Excellent communication, collaboration, and problem-solving skills

Preferred/Nice-to-have

  • Experience with Golang
  • Knowledge of Microsoft Graph API
  • Background in enterprise-scale observability systems
  • Experience with multi-cloud observability architectures (AWS, Azure, GCP)
