Key Responsibilities:
Design, develop, and maintain ETL pipelines for structured and semi-structured data
Build data models and optimize performance on the Snowflake data warehouse
Work with AWS services (e.g., S3, Lambda, Glue, Redshift, EC2) to handle large-scale data workflows
Write efficient, reusable, and testable Python code for data manipulation and pipeline automation (a brief illustrative sketch follows this list)
Perform data validation and ensure data quality across ingestion and transformation layers
Collaborate with analysts, data scientists, and business teams to deliver reliable datasets
Monitor pipeline performance and troubleshoot issues proactively
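For illustration only, day-to-day work on these responsibilities might resemble the following minimal Python sketch of an S3 extract with basic data-quality checks; the bucket, key, and column names are hypothetical and not part of this posting.

```python
# Illustrative sketch only: read a raw CSV extract from S3 and apply
# simple data-quality checks. All names below are hypothetical.
import io

import boto3
import pandas as pd


def load_orders(bucket: str = "example-raw-bucket",
                key: str = "orders/2024-01-01.csv") -> pd.DataFrame:
    """Read a raw CSV extract from S3 into a DataFrame."""
    s3 = boto3.client("s3")
    obj = s3.get_object(Bucket=bucket, Key=key)
    return pd.read_csv(io.BytesIO(obj["Body"].read()))


def validate_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Basic validation: required columns present, no null keys, no negative amounts."""
    required = {"order_id", "customer_id", "amount"}
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    if df["order_id"].isna().any():
        raise ValueError("Null order_id values found")
    return df[df["amount"] >= 0]


if __name__ == "__main__":
    orders = validate_orders(load_orders())
    print(f"{len(orders)} valid rows ready to load")
```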
Required Skills:
5+ years of experience in data engineering or a similar role
Strong experience with ETL tools/processes
Proficient in Snowflake (data modeling, performance tuning, SQL scripting)
Hands-on experience with the AWS cloud ecosystem, especially S3, Glue, Lambda, and Redshift
Strong programming skills in Python
Familiarity with orchestration tools such as Airflow or AWS Step Functions (see the DAG sketch after this list)
Experience working with large datasets and with both batch and streaming data pipelines
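As an illustration of the orchestration experience listed above, an extract-transform-load workflow in Airflow might look like the minimal sketch below; it assumes the Airflow 2.4+ TaskFlow API, and the DAG name, task bodies, and paths are hypothetical.

```python
# Illustrative sketch only: a minimal Airflow DAG wiring extract -> transform -> load.
# Assumes Airflow 2.4+ (TaskFlow API). Names and paths are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_orders_pipeline():
    @task
    def extract() -> str:
        # e.g. stage a raw file from S3 and return its path
        return "s3://example-raw-bucket/orders/latest.csv"

    @task
    def transform(path: str) -> str:
        # e.g. clean and reshape the extract, write a curated copy
        return path.replace("raw", "curated")

    @task
    def load(path: str) -> None:
        # e.g. COPY the curated file into a Snowflake table
        print(f"Loading {path} into Snowflake")

    # TaskFlow call chain defines the task dependencies
    load(transform(extract()))


example_orders_pipeline()
```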
Preferred Qualifications:
Experience with DevOps/DataOps practices (CI/CD for data)
Knowledge of data governance, security, and compliance best practices
Experience in agile development environments
Education:
Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or a related field.