ScoreMe Solutions

ScoreMe Solutions - Software Development Engineer II - Apache Spark

Gurugram, HR, IN

27 days ago

Summary

We are seeking a highly skilled SDE II with 3 years of experience in developing data processing pipelines.

The ideal candidate will have extensive experience with Apache Spark, Java Spring Boot, and Python.

You will be responsible for creating a data processing pipeline to process millions of PDFs and extract meaningful …

Responsibilities:

  • Design, develop, and maintain data processing pipelines using Apache Spark.
  • Work extensively with Java Spring Boot to develop backend services.
  • Utilize Python for various data processing tasks and integration with other services.
  • Implement best practices for data processing, storage, and retrieval.
  • Optimize and scale the pipeline to handle large volumes of data efficiently.
  • Collaborate with data scientists, analysts, and other engineering teams to understand requirements and deliver solutions.
  • Troubleshoot and resolve performance issues and bugs in the data processing pipeline.
  • Ensure data quality, integrity, and security throughout the pipeline.
  • Document the architecture, design, and implementation of the pipeline.

Skills and Qualifications:

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 5+ years of experience in software development with a strong focus on data processing.
  • Extensive experience with Apache Spark and building data processing pipelines.
  • Strong proficiency in Java and the Spring Boot framework.
  • Solid experience in Python for data processing and scripting.
  • Familiarity with distributed computing and parallel processing.
  • Experience with cloud platforms (AWS, Azure, GCP) is a plus.
  • Strong problem-solving skills and ability to work in a fast-paced environment.
  • Excellent communication and teamwork skills.

Qualifications:

  • Experience with other big data technologies such as Hadoop and Kafka.
  • Knowledge of data warehousing solutions like Redshift, BigQuery, or Snowflake.
  • Familiarity with containerization technologies like Docker and Kubernetes.
  • Experience with CI/CD pipelines and DevOps practices.
  • Understanding of machine learning and data analytics.

(ref:hirist.tech)
