Job Title: Python Data Engineer
Language Requirement: Strong written and verbal English communication skills (mandatory)
Location: Colombia, South America (fully remote, supporting a US-based client/project)
Employment Type: Contract, 40 hours per week
Industry: FinTech/FinOps/Tech
About the Role:
We’re looking for an experienced Python Data Engineer to join our data team. This role is ideal for someone with a strong background in building scalable data pipelines, experience with big data technologies like Apache Spark, and comfort operating in cloud environments such as AWS. A key requirement for this role is hands-on experience with Databricks for data processing and analytics at scale.
The ideal candidate is both technically proficient and business-savvy, capable of transforming data into actionable insights and ensuring the infrastructure is in place for data-driven decision-making.
Key Responsibilities:
- Design, build, and maintain scalable ETL/ELT data pipelines using Python and Apache Spark.
- Develop and manage workflows and jobs in Databricks to support data transformation and processing at scale.
- Collaborate with data scientists, analysts, and product teams to support their data needs.
- Work with large-scale datasets from various sources, ensuring data quality, integrity, and performance.
- Build data models and optimize data storage in cloud-based environments (e.g., AWS S3, Redshift, Glue, Athena).
- Monitor pipeline performance, troubleshoot issues, and ensure high availability and reliability of data processes.
Required Qualifications:
- 5+ years of experience as a Data Engineer or in a similar role, with strong Python development skills.
- Solid hands-on experience with Databricks, including notebooks, clusters, and workspace management.
- Strong understanding of Apache Spark (PySpark preferred) for distributed data processing.
- Proficiency working in AWS environments, including services such as S3, Redshift, EMR, Glue, and Lambda.
- Experience building robust, scalable, and efficient data pipelines.
- Familiarity with SQL and performance optimization for querying large datasets.
- Experience with version control systems (e.g., Git) and collaborative development practices.
- Strong problem-solving skills, attention to detail, and a proactive mindset.
Preferred Qualifications (Nice to Have):
- Experience with orchestration tools like Airflow or AWS Step Functions.
- Familiarity with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation.
- Knowledge of data warehousing, data lakes, and data mesh concepts.