Lead GCP Data Engineer
Canada (Remote)
We are looking for a skilled and motivated Lead Data Engineer with strong expertise in Python, PySpark, and Pandas, along with hands-on experience in Google Cloud Platform (GCP) services. The ideal candidate has a deep understanding of data warehousing concepts, dimensional modeling, and SQL, with proven experience building and supporting enterprise-level data solutions in a cloud environment.
Key Responsibilities:
• Design, develop, and maintain data pipelines and ETL workflows using Python and PySpark.
• Build and manage Enterprise Data Warehouses and Data Marts, ensuring high performance and scalability on GCP.
• Support descriptive analytics and reporting by transforming and querying data with BigQuery SQL.
• Conduct peer code reviews, contribute to technical design documents, and help write unit and integration tests.
• Collaborate with business analysts and stakeholders to understand data requirements and ensure data accuracy.
• Support, monitor, and troubleshoot data pipelines and systems in a production environment.
• Participate in deployment and CI/CD processes using Git and GCP DevOps tools.
• Maintain detailed documentation and contribute to a knowledge repository for team enablement and capability building.
• Demonstrate strong problem-solving and communication skills in a fast-paced, collaborative setting.
• Stay current with evolving cloud technologies and suggest improvements to enhance performance and efficiency.
Required Skills & Qualifications:
• 6+ years of experience working with Python, PySpark, and GCP.
• Hands-on experience with data transformation, data quality checks, and data validation in large-scale systems.
• Strong expertise in SQL (including advanced SQL concepts) and data warehousing methodologies.
• Hands-on experience with BigQuery, Dataproc, and Cloud Storage in GCP.
• Understanding of dimensional modeling, star/snowflake schemas, and data marts.
• Experience with ETL automation in large-scale environments.
• Relevant GCP certification (e.g., Google Cloud Professional Data Engineer) is preferred.
• Familiarity with version control systems such as Git and with CI/CD processes.