Crisil

Databricks Data Engineering Lead

Mumbai, MH, IN


Summary

Position Overview:

The Databricks Data Engineering Lead role is ideal for a highly skilled Databricks Data Engineer who will architect and lead the implementation of scalable, high-performance data pipelines and platforms using the Databricks Lakehouse ecosystem. The role involves managing a team of data engineers, establishing best practices, and collaborating with cross-functional stakeholders to unlock advanced analytics, AI/ML, and real-time decision-making capabilities.

Key Responsibilities:

  • Lead the design and development of modern data pipelines, data lakes, and lakehouse architectures using Databricks and Apache Spark.
  • Manage and mentor a team of data engineers, providing technical leadership and fostering a culture of excellence.
  • Architect scalable ETL/ELT workflows to process structured and unstructured data from various sources (cloud, on-prem, streaming).
  • Build and maintain Delta Lake tables and optimize performance for analytics, machine learning, and BI use cases.
  • Collaborate with data scientists, analysts, and business teams to deliver high-quality, trusted, and timely data products.
  • Ensure best practices in data quality, governance, lineage, and security, including the use of Unity Catalog and access controls.
  • Integrate Databricks with cloud platforms (AWS, Azure, or GCP) and data tools (Snowflake, Kafka, Tableau, Power BI, etc.).
  • Implement CI/CD pipelines for data workflows using tools such as GitHub, Azure DevOps, or Jenkins.
  • Stay current with Databricks innovations and provide recommendations on platform strategy and architecture improvements.

Qualifications:

  • Education: Bachelor’s or Master’s degree in Computer Science, Data Engineering, or related field.
  • Experience:
    • 7+ years of experience in data engineering, including 3+ years working with Databricks and Apache Spark.
    • Proven leadership experience in managing and mentoring data engineering teams.
  • Skills:
    • Proficiency in PySpark, SQL, and experience with Delta Lake, Databricks Workflows, and MLflow.
    • Strong understanding of data modeling, distributed computing, and performance tuning.
    • Familiarity with one or more major cloud platforms (Azure, AWS, GCP) and cloud-native services.
    • Experience implementing data governance and security in large-scale environments.
    • Experience with real-time data processing using Structured Streaming or Kafka.
    • Knowledge of data privacy, security frameworks, and compliance standards (e.g., PCI DSS, GDPR).
    • Exposure to machine learning pipelines, notebooks, and ML Ops practices.
  • Certifications:
    • Databricks Certified Data Engineer or equivalent certification.
