The Databricks Data Engineering Lead role is ideal for a highly skilled Databricks Data Engineer who will architect and lead the implementation of scalable, high-performance data pipelines and platforms using the Databricks Lakehouse ecosystem. The role involves managing a team of data engineers, establishing best practices, and collaborating with cross-functional stakeholders to unlock advanced analytics, AI/ML, and real-time decision-making capabilities.
Key Responsibilities:
Lead the design and development of modern data pipelines, data lakes, and lakehouse architectures using Databricks and Apache Spark.
Manage and mentor a team of data engineers, providing technical leadership and fostering a culture of excellence.
Architect scalable ETL/ELT workflows to process structured and unstructured data from various sources (cloud, on-prem, streaming).
Build and maintain Delta Lake tables and optimize performance for analytics, machine learning, and BI use cases.
Collaborate with data scientists, analysts, and business teams to deliver high-quality, trusted, and timely data products.
Ensure best practices in data quality, governance, lineage, and security, including the use of Unity Catalog and access controls.
Integrate Databricks with cloud platforms (AWS, Azure, or GCP) and data tools (Snowflake, Kafka, Tableau, Power BI, etc.).
Implement CI/CD pipelines for data workflows using tools such as GitHub, Azure DevOps, or Jenkins.
Stay current with Databricks innovations and provide recommendations on platform strategy and architecture improvements.
Qualifications:
Education: Bachelor’s or Master’s degree in Computer Science, Data Engineering, or related field.
Experience:
7+ years of experience in data engineering, including 3+ years working with Databricks and Apache Spark.
Proven leadership experience in managing and mentoring data engineering teams.
Skills:
Proficiency in PySpark and SQL, and experience with Delta Lake, Databricks Workflows, and MLflow.
Strong understanding of data modeling, distributed computing, and performance tuning.
Familiarity with one or more major cloud platforms (Azure, AWS, GCP) and cloud-native services.
Experience implementing data governance and security in large-scale environments.
Experience with real-time data processing using Structured Streaming or Kafka.
Knowledge of data privacy, security frameworks, and compliance standards (e.g., PCI DSS, GDPR).
Exposure to machine learning pipelines, notebooks, and ML Ops practices.
Certifications:
Databricks Certified Data Engineer or equivalent certification.