Infocus Technologies - AWS Data Engineer - SQL/PySpark
Maharashtra, IN
About The Role
We are looking for an ETL Developer with 2-3 years of experience in AWS Data Engineering. The ideal candidate should have strong expertise in PySpark and SQL, along with experience in designing and maintaining efficient ETL pipelines. This role involves working closely with business stakeholders, data engineers, and analysts to ensure seamless data transformation, integration, and analysis.
Key Responsibilities
ETL Development & Data Processing
Design, develop, and maintain ETL pipelines to support business intelligence and analytics needs.
Process, transform, and load large datasets using PySpark and SQL in a cloud-based environment (see the PySpark sketch after this list).
Optimize data workflows for performance, scalability, and reliability.
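For reference, a pipeline of the kind described above might look like the following minimal PySpark sketch. It is illustrative only: the bucket, dataset, and column names (s3://example-bucket, orders, order_id, order_ts, amount) are hypothetical placeholders, not details from this posting.

```python
# Minimal PySpark ETL sketch: read raw data, transform it, write curated output.
# All paths and column names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw JSON landed in S3.
raw = spark.read.json("s3://example-bucket/raw/orders/")

# Transform: normalize types, derive a partition column, drop bad rows.
curated = (
    raw
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("order_date", F.to_date("order_ts"))
    .filter(F.col("order_id").isNotNull())
)

# Load: write partitioned Parquet for downstream analytics.
(curated.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/curated/orders/"))
```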
Data Management & Optimization
Implement data quality checks to ensure accuracy, consistency, and reliability.
Monitor and troubleshoot ETL failures, ensuring minimal downtime and quick resolution.
Optimize database queries and Spark jobs to improve efficiency and reduce processing time (a quality-check and tuning sketch follows this list).
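The sketch below shows what the quality-check and tuning responsibilities above could look like in practice. The 1% null threshold, column names, and partition count are assumptions for illustration, not requirements from the role.

```python
# Sketch of a simple data-quality gate plus common Spark tuning steps.
# Threshold, paths, and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-quality").getOrCreate()
df = spark.read.parquet("s3://example-bucket/curated/orders/")

# Quality check: fail fast if too many rows are missing a key field.
total = df.count()
nulls = df.filter(F.col("customer_id").isNull()).count()
if total == 0 or nulls / total > 0.01:  # tolerate up to 1% nulls
    raise ValueError(f"quality gate failed: {nulls}/{total} null customer_id")

# Tuning: keep only the needed columns early (column pruning), then
# repartition by the join key to reduce shuffle skew before a heavy join.
slim = df.select("order_id", "customer_id", "amount")
balanced = slim.repartition(200, "customer_id")
balanced.cache()  # reuse across multiple downstream aggregations
```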
Collaboration & Documentation
Work closely with data engineers, analysts, and business teams to understand data needs and implement solutions.
Document ETL workflows, data transformation logic, and troubleshooting procedures for knowledge sharing.
Participate in code reviews, testing, and deployment activities to maintain high coding standards.
Required Skills & Experience
2-3 years of hands-on experience as an AWS Data Engineer.
Mandatory Skills
PySpark - Strong expertise in writing and optimizing PySpark scripts.
SQL - Ability to write complex SQL queries, optimize query performance, and work with relational databases (an example query appears after this list).
Experience working with AWS Cloud-based data solutions.
Strong understanding of ETL concepts, data modeling, and data transformation.
Ability to troubleshoot and debug ETL processes effectively.
Knowledge of performance tuning techniques for big data processing.
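To make the SQL expectation above concrete, here is the kind of analytical query the role implies, run through Spark SQL. The orders view and its columns are the same hypothetical names used in the earlier sketches.

```python
# Example of the analytical SQL implied above: a window function ranking each
# customer's orders by recency. Table and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("orders-sql").getOrCreate()
spark.read.parquet("s3://example-bucket/curated/orders/") \
    .createOrReplaceTempView("orders")

latest_orders = spark.sql("""
    SELECT order_id, customer_id, amount
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY customer_id
                   ORDER BY order_ts DESC
               ) AS rn
        FROM orders
    ) ranked
    WHERE rn = 1  -- most recent order per customer
""")
latest_orders.show()
```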
Nice To Have (Preferred Skills)
Experience with AWS services like S3, Glue, Redshift, EMR, Lambda, or Athena.
Exposure to DevOps practices for data pipelines, including CI/CD.