EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are looking for an experienced Lead Data Software Engineer with a strong background in full-stack development, a leadership-driven mindset, and an automation-first approach to engineering within a modern cloud data warehouse stack (BigQuery/Databricks).
In this role, you will guide the design and development of scalable, production-grade data infrastructure while mentoring Engineers, Data Analysts, and Data Scientists and coordinating their work to deliver actionable real-time insights that empower senior leadership to make data-driven decisions. The ideal candidate combines technical expertise with leadership qualities, thrives in a code-driven environment, and is deeply committed to automation, system performance, and clean coding practices.
Responsibilities
- Lead the design and development of high-performance, fault-tolerant data pipelines using Python and SQL, prioritizing scalability, efficiency, and automation
- Oversee the architecture and implementation of end-to-end, production-grade data systems, integrating ingestion, transformation, and model deployment workflows into robust solutions
- Take ownership of building and maintaining real-time streaming pipelines and batch data workflows leveraging BigQuery/Databricks, Apache Airflow, and dbt
- Establish and advocate for clean, modular code standards, focusing on reusability and the automation of manual data engineering tasks
- Collaborate actively with cross-functional teams to drive the translation of complex business requirements into scalable technical solutions, with an emphasis on automation and operational excellence
- Design and implement advanced tools for monitoring, logging, and alerting to enhance the reliability and scalability of data infrastructure
- Work closely with application development teams to align backend system workflows with broader business logic and software components
- Lead discussions and decision-making processes regarding architecture, pipelines, and cloud infrastructure in data engineering initiatives
- Mentor and guide junior and senior engineers, fostering a culture of technical growth, knowledge sharing, and continual improvement within the team
- Identify and resolve bottlenecks in data workflows while proactively improving system performance and scalability
Requirements
- BS/MS in Computer Science, Software Engineering, or a related field
- 5+ years of experience in production-grade data engineering, with a focus on full-stack development and automation
- At least 1 year of relevant leadership experience
- Advanced proficiency in Python, SQL, and data processing frameworks such as Spark/PySpark for large-scale data systems
- Deep expertise in modern Cloud Data Warehousing tools like BigQuery or Databricks, coupled with a strong understanding of cloud-native architectures (AWS/GCP/Azure)
- Proven hands-on experience with CI/CD pipelines, version control (Git), and advanced testing frameworks
- Advanced familiarity with containerization (Docker) and orchestration technologies (Kubernetes) for scaling data applications in distributed environments
- Extensive experience with workflow orchestration and data transformation tools such as Apache Airflow and dbt for automating complex workflows
- In-depth knowledge of event-driven architectures and streaming systems (e.g., Kafka, Kinesis) to support real-time data applications
- Strong background in Agile, DevOps, or DataOps methodologies, including hands-on use of infrastructure-as-code tools like Terraform or Pulumi
- Exceptional communication, collaboration, and leadership skills, with English proficiency at B2+ level or above
Nice to have
- Proficiency in MySQL and experience with visualization platforms like Looker/Tableau, or large-scale analytics tools such as Amplitude, Snowplow, or Segment
- Proven cloud DevOps experience in managing infrastructure and deployments on platforms like AWS, GCP, or Azure
- Fundamental Linux/Unix system administration and shell scripting skills
- Practical knowledge of machine learning pipelines, MLOps techniques, and the deployment of ML models into production
- Experience delivering real-time analytics solutions using streaming technologies like Apache Flink or Spark Streaming
We offer
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling, and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn