We are looking for an experienced Data Engineer (immediate joiners only) with a strong background in Kafka, PySpark, Python/Scala, Spark, SQL, and the Hadoop ecosystem. The ideal candidate has more than 5 years of experience and is available to start immediately. This role requires hands-on expertise in big data technologies and the ability to design and implement robust data processing solutions.
Key Responsibilities
Design, develop, and maintain scalable data processing pipelines using Kafka, PySpark, Python/Scala, and Spark.
Work extensively with the Kafka and Hadoop ecosystems, including HDFS, Hive, and other related technologies.
Write efficient SQL queries for data extraction, transformation, and analysis.
Implement and manage Kafka streams for real-time data processing.
Utilize scheduling tools to automate data workflows and processes.
Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions.
Ensure data quality and integrity by implementing robust data validation processes.
Optimize existing data processes for performance and scalability.
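For illustration only, the sketch below shows the kind of streaming pipeline the responsibilities above describe, assuming PySpark Structured Streaming with the Kafka source. The broker address, topic name, event schema, and HDFS paths are hypothetical placeholders, not details taken from the role.

# Minimal sketch: read JSON events from a Kafka topic with PySpark Structured
# Streaming and append them as Parquet files on HDFS. All names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = (
    SparkSession.builder
    .appName("orders-ingest")          # hypothetical application name
    .enableHiveSupport()
    .getOrCreate()
)

# Schema of the incoming events (assumed for illustration).
event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("event_time", TimestampType()),
])

# Read a stream of records from Kafka; broker and topic are placeholders.
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "orders")
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers the payload as bytes in the `value` column; parse it as JSON.
events = (
    raw.select(from_json(col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Append parsed events to HDFS, with a checkpoint so the sink can recover
# consistently after restarts.
query = (
    events.writeStream
    .format("parquet")
    .option("path", "hdfs:///data/orders")              # hypothetical output path
    .option("checkpointLocation", "hdfs:///chk/orders")  # hypothetical checkpoint
    .outputMode("append")
    .start()
)

query.awaitTermination()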
Preferred Qualifications
Experience with GCP (Google Cloud Platform).
Knowledge of data warehousing concepts and best practices.
Familiarity with machine learning and data analysis tools.
Understanding of data governance and compliance standards.