Tarento Group

Data Engineer

Delhi, IN

Key Responsibilities

  • Data Pipeline Development: Design, build, and maintain scalable, reliable data pipelines using Apache Spark, Apache Kafka, and Apache Flink for real-time processing and large-scale batch workflows.
  • Real-time Data Streaming: Implement and manage real-time data streaming architectures leveraging Apache Kafka to process and transmit high volumes of streaming data in a fault-tolerant manner.
  • Data Transformation and Orchestration: Develop data transformation workflows and integrate data from various sources while ensuring that pipelines are robust, efficient, and adhere to data engineering best practices.
  • Data Quality Assurance: Implement data validation, quality checks, and monitoring systems to ensure data integrity and consistency across the entire data pipeline.
  • Collaboration with Cross-functional Teams: Work closely with Data Scientists, Analysts, and other stakeholders to understand data requirements and provide reliable data infrastructure solutions.
  • Performance Optimization: Continuously monitor and optimize data processing performance, focusing on scaling solutions and improving efficiency.
  • Documentation & Best Practices: Maintain clear documentation for data pipelines, data structures, and processes. Advocate for industry-standard data engineering practices across the team.
  • Tool Expertise: Leverage tools such as Looker for business intelligence and BigQuery for data warehousing to support analytics and decision-making.

Requirements

  • Proficiency in Scala and SQL: Strong experience in writing scalable, efficient, and maintainable code in Scala, and querying complex datasets using SQL.
  • Apache Spark Experience: Solid hands-on experience with Apache Spark for large-scale data processing, including performance tuning, fault tolerance, and optimization.
  • Kafka Expertise: Proficient in working with Apache Kafka to set up, manage, and scale real-time data streaming solutions.
  • Real-time Processing with Flink: Familiarity with Apache Flink and its capabilities for building real-time data processing pipelines.
  • Data Engineering Best Practices: Demonstrated experience in implementing industry-standard practices for data transformation, orchestration, and ensuring high data quality.
  • Looker & BigQuery: Knowledge of Looker for business intelligence, as well as BigQuery for data warehousing and querying large datasets.
  • Problem-Solving & Analytical Thinking: Strong analytical and problem-solving skills with a focus on optimizing data workflows and architectures.
  • Collaboration & Communication: Excellent communication and collaboration skills, with the ability to work effectively with both technical and non-technical teams.
