Kloud9 LLC

Senior Data Engineer - PySpark

Bengaluru, KA, IN

Responsibilities

  • Design and implement product features in collaboration with business and technology stakeholders.
  • Write quality code and build secure, highly available systems.
  • Drive collaborative reviews of design, code, test plans, and dataset implementation performed by other data engineers in support of maintaining data engineering standards.
  • Analyze and profile data for the purpose of designing scalable solutions.
  • Clean, prepare, and optimize data at scale for ingestion and consumption by machine learning models.
  • Drive the implementation of new data management projects and restructure the current data architecture.
  • Implement complex automated workflows and routines using workflow scheduling tools.
  • Build continuous integration, test-driven development, and production deployment frameworks.
  • Anticipate, identify, and solve issues concerning data management to improve data quality.
  • Design and build reusable components, frameworks, and libraries at scale to support machine learning.
  • Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues.
  • Mentor and develop other data engineers in adopting best practices.
  • Influence and communicate effectively, both verbally and in writing, with team members and business stakeholders.

Requirements

  • 5+ years of experience developing scalable PySpark applications or solutions on distributed platforms.
  • Experience with Google Cloud Platform (GCP); experience with other cloud platforms is a plus.
  • Experience working with data warehousing tools, including SQL and Snowflake.
  • Experience architecting data platforms using streaming, serverless, and microservices architectures.
  • Experience with PySpark, Spark (Scala/Python/Java), and Kafka.
  • Work experience with Databricks (Data Engineering and Delta Lake components).
  • Experience working with big data platforms, including Databricks.
  • Experience working with distributed technology tools including Spark, Presto, Databricks, and Airflow.
  • Working knowledge of data warehousing and data modeling.
  • Experience working in Agile and Scrum development processes.
  • Bachelor's degree in Computer Science, Information Systems, Business, or another relevant subject area.

This job was posted by Sweta Prasad from Kloud9.
