Data Engineer
Location - Hyderabad
Experience - 4-9 Years
Responsibilities:
- Data Transformation:
○ Use Data Build Tool (dbt) to transform raw data into curated data models that meet business requirements.
○ Implement data transformations and aggregations to support analytical and reporting needs.
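The transformation work above follows a common dbt pattern: a model is just a SELECT that reshapes raw data into a curated table. A minimal sketch of that pattern, using SQLite standing in for BigQuery and invented table/column names (`raw_transactions`, `account_id`, etc.):

```python
import sqlite3

# SQLite stands in for BigQuery here; the table and columns are invented
# purely for illustration of the filter-then-aggregate pattern.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_transactions (account_id TEXT, amount REAL, status TEXT);
    INSERT INTO raw_transactions VALUES
        ('A1', 100.0, 'posted'),
        ('A1', -40.0, 'posted'),
        ('A2',  75.0, 'pending');
""")

# Equivalent of a curated dbt model: filter raw rows, aggregate per account.
curated = conn.execute("""
    SELECT account_id,
           SUM(amount) AS net_amount,
           COUNT(*)    AS txn_count
    FROM raw_transactions
    WHERE status = 'posted'
    GROUP BY account_id
    ORDER BY account_id
""").fetchall()

print(curated)  # [('A1', 60.0, 2)]
```

In dbt, the SELECT alone would live in a model file and dbt would materialize it as a table or view.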
- Orchestration and Automation:
○ Design and implement automated workflows using Google Cloud Composer to orchestrate data pipelines and ensure timely data delivery.
○ Monitor and troubleshoot data pipelines, identifying and resolving issues proactively.
○ Develop and maintain documentation for data pipelines and workflows.
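Cloud Composer runs Apache Airflow DAGs, whose core idea is that tasks execute in dependency order. That idea can be sketched with nothing but the standard library's topological sorter; the task names below are invented for illustration:

```python
from graphlib import TopologicalSorter

# Invented task names sketching a typical pipeline DAG: Composer/Airflow
# executes tasks in dependency order, i.e. a topological order of this graph.
dag = {
    "extract": set(),
    "load_raw": {"extract"},
    "dbt_transform": {"load_raw"},
    "quality_checks": {"dbt_transform"},
    "publish": {"quality_checks"},
}

order = list(TopologicalSorter(dag).static_order())
print(order)
```

In a real Composer environment the same dependencies would be declared between Airflow operators inside a DAG file.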
- Data Platform (GCP):
○ Leverage GCP services, including BigQuery, Cloud Storage, and Pub/Sub, to build a robust and scalable data platform.
○ Optimize BigQuery performance and cost through efficient query design and data partitioning.
○ Implement data security and access controls in accordance with banking industry standards.
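Partitioning and clustering are the main levers for the BigQuery cost optimization mentioned above. A sketch of the kind of DDL involved, with hypothetical dataset/table names; this snippet only builds the SQL string and does not call the BigQuery API:

```python
# Hypothetical table name; executing this DDL would require the BigQuery
# client library and real credentials, so the string is only constructed here.
table = "analytics.transactions"
ddl = f"""
CREATE TABLE `{table}` (
    txn_id     STRING,
    account_id STRING,
    amount     NUMERIC,
    txn_date   DATE
)
PARTITION BY txn_date          -- queries filtering on txn_date scan fewer bytes
CLUSTER BY account_id          -- co-locates rows for common account filters
OPTIONS (partition_expiration_days = 365)
""".strip()

print(ddl)
```

Date-partitioning plus clustering lets typical point-in-time queries prune most of the table, which directly reduces scanned bytes and therefore cost.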
- Collaboration and Communication:
○ Collaborate with the Solution Architect and Data Modeler to understand data requirements and translate them into technical solutions.
○ Communicate effectively with team members and stakeholders, providing regular updates on project progress.
○ Participate in code reviews and contribute to the development of best practices.
- Data Pipeline Development:
○ Design, develop, and maintain scalable and efficient data pipelines using Google Cloud Dataflow to ingest data from various sources, including relational databases (RDBMS), data streams, and files.
○ Implement data quality checks and validation processes to ensure data accuracy and consistency.
○ Optimize data pipelines for performance and cost-effectiveness.
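The data quality checks mentioned above usually start as simple row-level validation rules applied inside the pipeline. A minimal sketch with invented field names:

```python
# Invented record schema; each rule returns a human-readable violation string.
REQUIRED = ("txn_id", "account_id", "amount")

def validate(row: dict) -> list[str]:
    """Return rule violations for one record (empty list = record is clean)."""
    errors = [f"missing {field}" for field in REQUIRED
              if row.get(field) in (None, "")]
    amount = row.get("amount")
    if amount is not None and not isinstance(amount, (int, float)):
        errors.append("amount is not numeric")
    return errors

rows = [
    {"txn_id": "T1", "account_id": "A1", "amount": 10.5},
    {"txn_id": "T2", "account_id": "", "amount": "ten"},
]
bad = {r["txn_id"]: validate(r) for r in rows if validate(r)}
print(bad)
```

In Dataflow the same function would sit inside a ParDo, routing failing records to a dead-letter output instead of the main one.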
- Banking Domain Knowledge (Preferred):
○ Understanding of banking data domains, such as customer data, transactions, and financial products.
○ Familiarity with regulatory requirements and data governance standards in the banking industry.
Required Experience:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Solid understanding of ETL concepts and processes.
- 4-9 years of experience in data engineering, with a focus on building data pipelines and data transformations.
- Strong proficiency in SQL and experience working with relational databases.
- Hands-on experience with Google Cloud Platform (GCP) services, including Dataflow, BigQuery, Cloud Composer, and Cloud Storage.
- Experience with data transformation tools, preferably Data Build Tool (dbt).
- Proficiency in Python or other scripting languages is a plus.
- Experience with data orchestration and automation.
- Strong problem-solving and analytical skills.
- Excellent communication and collaboration skills.
- Experience with streaming systems such as Pub/Sub.
- Experience working with file formats such as CSV, JSON, and Parquet.
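Handling the file formats listed above mostly means parsing them into records. A tiny illustration using only the standard library and invented fields; Parquet is columnar and binary, so in practice it would be read with pyarrow or pandas rather than the stdlib:

```python
import csv
import io
import json

# Illustrative only: tiny in-memory CSV and JSON payloads with invented fields.
csv_text = "account_id,amount\nA1,100.0\nA2,75.0\n"
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

json_text = '{"account_id": "A1", "amount": 100.0}'
json_row = json.loads(json_text)

# Note: csv yields strings ("100.0"), while json preserves numeric types (100.0).
print(csv_rows[0]["amount"], json_row["amount"])
```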
Primary Skills:
GCP, Dataflow, BigQuery, Cloud Composer, Cloud Storage, data pipeline development, SQL, dbt, DWH concepts.
Secondary Skills:
Python, banking domain knowledge, Pub/Sub, cloud certifications (e.g., GCP Professional Data Engineer), Git or another version control system.