ProtoGene Consulting Private Limited

Data Engineer (AWS + PySpark + SQL)

Maharashtra, IN


Summary

Data Engineer + Integration Engineer + Support Specialist (Experience: 5-8 years)

Necessary Skills:

  • SQL & Python / PySpark
  • AWS Services: Glue, Appflow, Redshift
  • Data warehousing
  • Data modelling

Job Description:

  • Experience implementing and delivering data solutions and pipelines on the AWS Cloud Platform; design, implement, and maintain the data architecture for all AWS data services (a minimal Glue/PySpark sketch follows this list)
  • A strong understanding of data modelling, data structures, databases (Redshift), and ETL processes
  • Work with stakeholders to identify business needs and requirements for data-related projects

Strong knowledge of SQL and Python or PySpark

  • Create data models that can be used to extract information from various sources and store it in a usable format
  • Optimize data models for performance and efficiency
  • Write SQL queries to support data analysis and reporting
  • Monitor and troubleshoot data pipelines
  • Collaborate with software engineers to design and implement data-driven features
  • Perform root cause analysis on data issues
  • Maintain documentation of the data architecture and ETL processes
  • Identify opportunities to improve performance by improving database structure or indexing methods
  • Maintain existing applications by updating existing code or adding new features to meet new requirements
  • Design and implement security measures to protect data from unauthorized access or misuse
  • Recommend infrastructure changes to improve capacity or performance
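By way of illustration, the sketch below outlines the kind of Glue/PySpark job this pipeline work typically involves: read a source table from the Glue Data Catalog, map it into the warehouse schema, and load it into Redshift. It is a minimal sketch, not part of the posting, and the database, table, connection, and bucket names (raw_db, sales_orders, redshift-conn, s3://example-bucket/tmp/) are placeholders assumed for the example.

    # Minimal AWS Glue ETL job sketch (PySpark). Catalog, table, and connection
    # names below are illustrative placeholders, not values from this posting.
    import sys
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.transforms import ApplyMapping
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the raw source table registered in the Glue Data Catalog.
    raw = glue_context.create_dynamic_frame.from_catalog(
        database="raw_db", table_name="sales_orders"
    )

    # Rename and cast columns into the warehouse schema.
    mapped = ApplyMapping.apply(
        frame=raw,
        mappings=[
            ("order_id", "string", "order_id", "string"),
            ("order_ts", "string", "order_ts", "timestamp"),
            ("amount", "double", "amount", "double"),
        ],
    )

    # Load into Redshift via a catalog connection; Glue stages data through S3.
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=mapped,
        catalog_connection="redshift-conn",
        connection_options={"dbtable": "analytics.sales_orders", "database": "dw"},
        redshift_tmp_dir="s3://example-bucket/tmp/",
    )

    job.commit()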

Experience in the process industry

Data Engineer + Integration Engineer + Support Specialist (Experience: 3-5 years)

Necessary Skills:

  • SQL & Python / PySpark
  • AWS Services: Glue, Appflow, Redshift
  • Data warehousing basics
  • Data modelling basics

Job Description:

  • Experience implementing and delivering data solutions and pipelines on the AWS Cloud Platform
  • A strong understanding of data modelling, data structures, and databases (Redshift)

Strong knowledge of SQL and Python or PySpark

  • Design and implement ETL processes to load data into the data warehouse
  • Create data models that can be used to extract information from various sources and store it in a usable format
  • Optimize data models for performance and efficiency
  • Write SQL queries to support data analysis and reporting
  • Collaborate with the team to design and implement data-driven features
  • Monitor and troubleshoot data pipelines (see the data-quality sketch after this list)
  • Perform root cause analysis on data issues
  • Maintain documentation of the data architecture and ETL processes
  • Maintain existing applications by updating existing code or adding new features to meet new requirements
  • Design and implement security measures to protect data from unauthorized access or misuse
  • Identify opportunities to improve performance by improving database structure or indexing methods
  • Recommend infrastructure changes to improve capacity or performance
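As a rough illustration of the monitoring and root-cause-analysis duties above, the sketch below runs simple data-quality checks on a loaded table with PySpark. The table and column names (analytics.sales_orders, order_id, order_ts) are assumptions made for the example only; the thresholds and checks would depend on the actual pipeline.

    # Basic data-quality checks used when troubleshooting a pipeline run.
    # Table and column names are illustrative placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("dq-checks").getOrCreate()

    df = spark.table("analytics.sales_orders")

    # Row count for the latest load window: a sudden drop often points to an upstream issue.
    loaded_recently = df.filter(F.col("order_ts") >= F.date_sub(F.current_date(), 1)).count()

    # Null and duplicate checks on the business key.
    null_keys = df.filter(F.col("order_id").isNull()).count()
    dup_keys = df.groupBy("order_id").count().filter(F.col("count") > 1).count()

    print(f"rows loaded (last day): {loaded_recently}")
    print(f"null order_id rows:     {null_keys}")
    print(f"duplicate order_id:     {dup_keys}")

    if null_keys or dup_keys:
        raise ValueError("Data-quality check failed; investigate the source extract and ETL mappings.")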

Skills: PySpark, Amazon Web Services (AWS), SQL, AWS Glue, and Apache Airflow
