We are looking for a highly skilled Python Data Engineer to develop, maintain, and optimize data pipelines, ensuring seamless data integration and processing. The ideal candidate will have expertise in Python, SQL, ETL processes, and cloud data platforms, with a strong focus on performance, scalability, and data integrity.
Key Responsibilities
Develop, optimize, and maintain ETL/ELT data pipelines using Python and other tools.
Design and implement scalable data solutions for real-time and batch processing.
Work with SQL and NoSQL databases (PostgreSQL, MySQL, MongoDB, etc.) for efficient data storage and retrieval.
Integrate data from multiple sources such as APIs, cloud storage, and data warehouses.
Implement data quality checks, validation, and monitoring mechanisms.
Work with big data technologies such as Spark, Hadoop, or Dask (preferred).
Deploy and manage data workflows on cloud platforms like AWS, GCP, or Azure.
Collaborate with data scientists, analysts, and other engineers to build efficient data infrastructure.
Implement CI/CD pipelines for automating data workflows and deployments.
Apply best practices for data security, compliance, and governance.
Required Skills & Qualifications
Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
5+ years of experience in data engineering, with a focus on Python development.
Strong expertise in SQL and database management (PostgreSQL, MySQL, Redshift, Snowflake, BigQuery).
Experience with workflow orchestration frameworks for ETL/ELT such as Airflow, Luigi, or Prefect.
Familiarity with big data technologies (Apache Spark, Kafka, Hadoop).
Hands-on experience with cloud platforms and services (AWS S3, Glue, and Lambda; GCP BigQuery; Azure Data Factory).
Proficiency in working with APIs, data lakes, and structured/unstructured data.
Strong problem-solving skills and ability to optimize data processes for scalability.
Experience with containerization (Docker, Kubernetes) and CI/CD is a plus.
Preferred Qualifications
Knowledge of MLOps and AI-driven data pipelines.
Experience with GraphQL and real-time data streaming.
Understanding of data security best practices and compliance (GDPR, HIPAA, etc.).
(ref:hirist.tech)