Position Summary:
We are looking for a highly skilled Data Engineer / Data Scientist with expertise in Generative AI (GenAI), MLOps, and AWS to join our innovative team. In this role, you will build and optimize data pipelines, deploy machine learning models, and develop generative AI solutions at scale. The ideal candidate has experience with cloud infrastructure and machine learning model development, particularly in AWS environments. You will also manage the full lifecycle of machine learning models, from data collection and preprocessing to deployment, monitoring, and ongoing model management.
If you are passionate about leveraging cutting-edge Generative AI techniques, optimizing machine learning workflows, and deploying AI solutions at scale using AWS cloud services, we want to hear from you!
Key Responsibilities:
Data Engineering:
- Design, build, and maintain scalable data pipelines using AWS cloud services (e.g., Amazon S3, AWS Glue, Amazon Redshift) to process large-scale datasets for machine learning and AI applications.
- Develop and implement data workflows for Generative AI (GenAI) models, ensuring smooth data integration, storage, and access for training and model deployment.
- Collaborate with data science teams to build and automate data pipelines for training data preparation, model evaluation, and predictions.
- Ensure high data quality and integrity by implementing validation, cleaning, and preprocessing steps to prepare data for machine learning models.
Machine Learning & GenAI:
- Develop, implement, and fine-tune Generative AI models (e.g., GANs, transformers, large language models) for various business applications such as content generation, personalization, recommendation systems, or advanced analytics.
- Work closely with data scientists to build, train, and evaluate machine learning models using frameworks such as TensorFlow, PyTorch, and Hugging Face.
- Use cloud services such as Amazon SageMaker to deploy and scale machine learning models in production.
- Optimize generative AI models so that they meet business needs for content generation, simulation, and other use cases.
MLOps (Machine Learning Operations):
- Implement MLOps best practices for model deployment, monitoring, and versioning, ensuring smooth transitions from development to production environments.
- Manage model pipelines, including continuous integration and continuous deployment (CI/CD) for machine learning models in AWS environments.
- Set up automated workflows for retraining and updating models as new data becomes available, ensuring the models remain accurate over time.
- Use tools such as AWS CodePipeline, MLflow, or Kubeflow to streamline model deployment and orchestration.
- Monitor model performance in production and troubleshoot issues as they arise.
Cloud Infrastructure & Management:
- Design and implement scalable, cost-effective cloud architectures in AWS to support AI/ML workloads, using services such as AWS Lambda, Amazon EC2, Amazon S3, Amazon SageMaker, Amazon Elastic Container Service (ECS), and Amazon Elastic Kubernetes Service (EKS).
- Optimize cloud resource utilization and cost-efficiency while maintaining high performance for large-scale machine learning workloads.
- Ensure compliance with data security policies and regulatory standards while managing sensitive data and AI models in the cloud.
Collaboration & Communication:
- Collaborate with cross-functional teams, including data scientists, engineers, product managers, and business stakeholders, to understand business goals and translate them into scalable machine learning solutions.
- Communicate technical concepts related to AI/ML models, cloud infrastructure, and MLOps to both technical and non-technical stakeholders.
- Provide ongoing technical support and guidance to teams working with machine learning models, helping to troubleshoot issues and optimize workflows.
Skills and Qualifications:
Technical Skills:
- Strong experience with AWS cloud services, including S3, EC2, SageMaker, Lambda, Glue, Redshift, ECS, and EKS.
- Expertise in building data pipelines for machine learning, leveraging AWS Glue, AWS Data Pipeline, or similar services.
- Hands-on experience with Generative AI technologies (e.g., Generative Adversarial Networks (GANs), transformer models, GPT-3, BERT, or similar models).
- Proficiency with machine learning frameworks such as TensorFlow, PyTorch, scikit-learn, and Hugging Face.
- Experience with MLOps tools and frameworks such as MLflow, Kubeflow, TensorFlow Extended (TFX), or Seldon.
- Strong programming skills in Python and experience with data manipulation libraries like Pandas, NumPy, and Dask.
- Familiarity with CI/CD pipelines for machine learning model deployment and version control using Git.
Machine Learning & GenAI Expertise:
- Strong background in machine learning algorithms, including supervised, unsupervised, and deep learning techniques.
- Experience building, training, and deploying Generative AI models and other complex AI solutions.
- Knowledge of model evaluation techniques and tools for tracking model performance, such as MLflow or TensorBoard.
Cloud & DevOps Knowledge:
- Deep understanding of cloud infrastructure and the ability to architect machine learning systems in AWS.
- Experience with containerization (e.g., Docker) and orchestration (e.g., Kubernetes) for deploying and scaling models in production.
- Familiarity with infrastructure as code (e.g., Terraform, AWS CloudFormation).