ConSol Partners

Senior Software Engineer, Data Processing

Austin, TX, US

10 days ago

Summary

Software Engineer – Data Platform


About the Team

Join a team focused on collecting, storing, and processing large-scale datasets generated by autonomous systems, including vehicles and delivery robots. The work involves managing sensor data from sources such as cameras, lidars, and radars. Reliable storage and efficient compute platforms are critical for the teams working on machine learning, simulation, and algorithm development. The team’s data processing stack uses specialized algorithms similar to those deployed in real-world autonomous systems.


About the Role

As a Software Engineer on the Data Platform team, you will design, build, and maintain core data and machine learning infrastructure with a strong emphasis on software engineering best practices and code quality. You will be responsible for developing systems that ingest, process, and organize petabytes of telemetry and sensor data into a globally distributed data lake, ensuring high-throughput and low-latency data access for both model training and live inference. Your contributions will accelerate the work of machine learning engineers and data scientists, enabling faster iteration and the development of higher-performing systems.


Responsibilities

  • Build and maintain robust data pipelines and foundational datasets to support simulation, analytics, and machine learning workflows, as well as broader business use cases.
  • Design scalable and cost-efficient database architectures for managing massive and complex datasets.
  • Collaborate with cross-functional teams, including Simulation, Perception, Prediction, and Planning, to define data requirements and workflows.
  • Evaluate, extend, and integrate open-source technologies (e.g., Apache Spark, Ray, Apache Beam, Argo Workflows) alongside internal systems.


Requirements

  • Strong proficiency in Python (required); experience with C++ is a strong plus.
  • Demonstrated ability to produce high-quality, maintainable code and design scalable, reliable systems.
  • Hands-on experience deploying and managing distributed systems with Kubernetes.
  • Practical experience with large-scale open-source data technologies (e.g., Kafka, Flink, Cassandra, Redis).
  • Deep understanding of distributed systems and large-scale data platforms, including managing petabyte-scale datasets.


Preferred Qualifications

  • Experience in building and operating large-scale machine learning systems.
  • Understanding of ML/AI workflows and machine learning pipelines.
  • Proven track record of optimizing resource utilization and system performance in distributed environments.
  • Familiarity with data visualization and dashboarding tools (e.g., Grafana, Apache Superset).
  • Experience working with cloud platforms such as AWS, GCP, or Azure.
