The Cloud Data Engineer will be responsible for designing and implementing end-to-end solutions in the Azure Enterprise Data Lake, optimizing data pipelines, and ensuring the availability, performance, scalability, and security of large-scale data processing systems. This role requires a deep understanding of big data technologies, data architecture, infrastructure, CI/CD, and data engineering best practices. Experience with Unity Catalog is a bonus. The Cloud Data Engineer will work closely with architects, leads, and other stakeholders to support data-driven decision-making processes.
Experience Required
8+ years of strong hands-on experience as a senior Data Engineer or in a related role
5+ years of demonstrated experience in developing Big Data solutions that support business analytics and data science teams
3-5 years implementing end-to-end data ingestion projects in Azure Enterprise Data Lake using Azure Functions, Databricks, Blob Storage, Cosmos DB, Azure Stream Analytics, Python, and SQL
Extensive hands-on experience implementing Lakehouse architecture on the Databricks Data Engineering platform using SQL, Unix shell scripting, SQL Analytics, Delta Lake, and Unity Catalog
Good understanding of Spark architecture, including Databricks Structured Streaming, setting up Azure with Databricks, and managing Databricks clusters
Experienced in DevOps and deployment automation with Azure DevOps (ARM templates, YAML pipelines, Terraform)
Ability to research the latest trends and propose advanced tooling/solutions for Cloud Data Lake & Data Science platforms
Experience with business intelligence and analytics tools such as OBIEE, Power BI, or Tableau
Collaborate with application teams and business users to develop new pipelines using cloud data migration methodologies and processes, including tools such as Azure Data Factory and Event Hubs
Roles & Responsibilities
Drive the design of data schemas, cloud data lake platform design decisions, and development standards, and maintain data pipelines for data ingestion, processing, and transformation in Azure.
Drive the analysis, architecture, design, governance, and development of data warehouse, data lake, and business intelligence solutions.
Extract, transform, and load data from source systems to Azure data storage services using a combination of Azure Data Factory, Azure Blob Storage, T-SQL, PySpark, and Azure Databricks.
Integrate data from various sources while ensuring data quality, consistency, and reliability.
Define data requirements, gather and mine large scale structured and unstructured data, and validate data using various tools in a cloud environment.
Manage and optimize the Azure Enterprise Data Lake to achieve efficient data storage and processing.
Develop and optimize ETL processes using Databricks and related tools such as Apache Spark.
Implement data validation and cleansing procedures to ensure the quality, integrity, and dependability of the data.
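To illustrate the kind of validation-and-cleansing step the responsibilities above describe: in practice this would be expressed as PySpark transformations on Databricks, but the sketch below uses plain Python so it stays self-contained. All field names (`id`, `amount`, `country`) and the rejection rules are hypothetical, not from this posting.

```python
def cleanse_records(records):
    """Validate and cleanse raw rows before loading to the data lake.

    Keeps only rows with a non-empty, unique 'id' and a parseable
    'amount'; trims whitespace and normalizes country codes. Returns
    (clean, rejected) so bad rows can be quarantined for review.
    """
    seen_ids = set()
    clean, rejected = [], []
    for row in records:
        # Trim stray whitespace from every string field.
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        rec_id = rec.get("id")
        if not rec_id or rec_id in seen_ids:
            rejected.append(row)          # missing or duplicate key
            continue
        try:
            rec["amount"] = float(rec["amount"])   # type validation
        except (KeyError, TypeError, ValueError):
            rejected.append(row)          # unparseable measure
            continue
        rec["country"] = rec.get("country", "").upper()  # normalization
        seen_ids.add(rec_id)
        clean.append(rec)
    return clean, rejected


raw = [
    {"id": "a1", "amount": " 10.5", "country": "us"},
    {"id": "a1", "amount": "3.0",  "country": "us"},  # duplicate id
    {"id": "",   "amount": "7.0",  "country": "de"},  # missing id
    {"id": "b2", "amount": "oops", "country": "fr"},  # bad amount
    {"id": "c3", "amount": "42",   "country": "jp"},
]
clean, rejected = cleanse_records(raw)
```

In a Spark pipeline the same logic would typically be a filter plus column expressions over a DataFrame, with rejected rows written to a quarantine table rather than a Python list.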