Corteva Agriscience

Data Engineer

Tlajomulco de Zúñiga, Jal., MX

21 days ago

Summary

We are seeking a Data Engineer to design, develop, and optimize scalable data pipelines supporting advanced analytics and machine learning solutions in a cloud-based environment. The ideal candidate has hands-on experience with Azure Data Services and Databricks, a strong background in data pipeline orchestration, proven expertise in data quality management and process automation, and experience in Procurement or Supply Chain.

Key Responsibilities

  • Data Pipeline Architecture & Development:
      • Design, develop, and maintain robust ETL/ELT pipelines to handle large-scale data ingestion, transformation, and integration.
      • Build and optimize data workflows using Azure Data Factory, Databricks (PySpark, Spark SQL), and Azure Synapse Analytics.
      • Ensure pipeline scalability, fault tolerance, and efficiency across diverse data sources, primarily structured (tabular) datasets.
      • Implement incremental loads, change data capture (CDC), and other advanced data ingestion strategies.
  • Automation & Process Optimization:
      • Develop and maintain automated data pipelines with a focus on performance optimization and cost-efficiency in the Azure environment.
      • Implement CI/CD pipelines for seamless deployment of data solutions, leveraging DevOps tools and Databricks Workflows.
      • Collaborate with cloud architects to optimize resource usage and adhere to cloud governance best practices.
  • Data Management & Quality Assurance:
      • Lead the design and implementation of data quality frameworks to ensure data integrity, consistency, and compliance across systems.
      • Develop monitoring solutions for pipeline health, data freshness, and anomaly detection.
      • Maintain comprehensive documentation covering data models, transformation logic, and operational procedures.
  • Cross-functional Collaboration & Stakeholder Engagement:
      • Partner with Data Scientists, Analysts, and Business Stakeholders to understand data needs and translate them into effective solutions.
      • Facilitate integration of machine learning models into production data pipelines.
      • Provide technical mentorship to junior data engineers and contribute to team knowledge-sharing initiatives.
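The incremental-load strategy mentioned above is commonly built on a high-watermark pattern: persist the latest timestamp processed, and on each run ingest only rows newer than it. The sketch below illustrates the idea in plain Python (the record layout and watermark handling are hypothetical; in a Databricks pipeline this logic would run over Spark DataFrames or a Delta change feed):

```python
from datetime import datetime

# Hypothetical source records; in a real pipeline these would be read
# from a source table or change feed via Spark.
records = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 3)},
    {"id": 3, "updated_at": datetime(2024, 1, 5)},
]

def incremental_load(records, watermark):
    """Return rows newer than the stored watermark, plus the
    advanced watermark to persist for the next run."""
    new_rows = [r for r in records if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in new_rows), default=watermark)
    return new_rows, new_watermark

# Suppose the previous run finished at 2024-01-02: only ids 2 and 3 load.
rows, wm = incremental_load(records, datetime(2024, 1, 2))
```

The same pattern underlies CDC-based ingestion, where the "watermark" is a log offset or commit version rather than a timestamp.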

Required Skills & Qualifications

Education: Bachelor’s degree in Computer Science, Data Engineering, Analytics, Statistics, Mathematics, or a related field. (Master’s degree is a plus.)

Experience

  • 3+ years of hands-on experience in data engineering or a related discipline.
  • Proven experience designing and deploying end-to-end data pipelines in Azure and Databricks environments.

Language: Proficiency in written and spoken English is required.

Technical Skills

Programming & Data Processing:

  • Advanced proficiency in SQL and Python for data manipulation, transformation, and analysis.
  • Extensive experience with PySpark and Spark SQL for big data processing in Databricks.
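As one concrete example of the SQL-plus-Python skill set this role calls for, the sketch below reduces a CDC-style change feed to the latest row per key using a window function. It is shown with Python's built-in sqlite3 so it is self-contained; in Databricks the same statement would run as Spark SQL (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_raw (order_id INT, status TEXT, updated_at TEXT);
    INSERT INTO orders_raw VALUES
        (1, 'created', '2024-01-01'),
        (1, 'shipped', '2024-01-04'),
        (2, 'created', '2024-01-02');
""")

# Keep only the most recent version of each order -- a common
# transformation when flattening change feeds into current state.
latest = conn.execute("""
    SELECT order_id, status
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY order_id
                   ORDER BY updated_at DESC
               ) AS rn
        FROM orders_raw
    )
    WHERE rn = 1
    ORDER BY order_id
""").fetchall()
```

Order 1 resolves to its latest status ('shipped'); order 2, with a single version, passes through unchanged.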

Cloud & Data Services (Azure)

  • In-depth knowledge of Azure services, including:
      • Azure Data Factory (ADF) for pipeline orchestration
      • Azure Data Lake Storage (ADLS) for data storage and management
      • Azure SQL Database for relational data management
  • Experience with Azure Functions and event-driven architectures is a plus

Automation & DevOps

  • Hands-on experience implementing CI/CD pipelines using tools like Azure DevOps, GitHub Actions, or similar.
  • Familiarity with infrastructure-as-code (IaC) tools such as Terraform or ARM templates.
  • Experience with Databricks Workflows and job orchestration tools.

Data Management & Warehousing

  • Strong understanding of data lakehouse architectures and data warehousing solutions (e.g., SQL Server, Redshift, BigQuery).
  • Experience designing and maintaining data models and schema designs for analytical use cases.
  • Familiarity with data governance, security best practices, and compliance standards.

Machine Learning Integration (Preferred)

  • Experience supporting machine learning workflows and integrating models into production pipelines.
  • Understanding of MLOps practices is a plus.

Preferred Qualifications

  • Experience with real-time data processing (e.g., Apache Kafka, Azure Stream Analytics).
  • Familiarity with Power BI data connections and reporting structures.
  • Hands-on experience with Databricks Workflows for complex pipeline orchestration.
