About Rakuten:
Rakuten, Inc. (Tokyo Stock Exchange: 4755), is one of the world’s leading Internet service companies, providing a variety of consumer and business-focused services including e-commerce, travel, banking, securities, credit card, e-money, portal & media, online marketing and professional sports. Rakuten is expanding globally and currently has operations throughout Asia, Western Europe, and the Americas. Founded in 1997, Rakuten is headquartered in Tokyo, with over 10,000 employees worldwide. For more information, visit http://global.rakuten.com/corp/about/.
Rakuten India Development Center is the second-largest technology hub outside of Japan and builds platforms for e-commerce, payments, digital, AI, and data science services across the globe. Rakuten India is located in Bengaluru, also known as the Silicon Valley of India, where there is an abundance of talent. The work environment is seamless, the same as in Tokyo, San Francisco, or Singapore, with easy and smooth collaboration across all Rakuten divisions. Rakuten India is an in-house innovation technology centre. Rakuten is ranked among the top 20 most innovative companies in the world by Forbes.
Job Summary:
We are seeking a highly motivated and experienced Machine Learning Engineer to join our dynamic team. In this role, you will be responsible for building, deploying, and maintaining machine learning models and pipelines that drive key business decisions. You will work closely with data scientists and other engineers to translate research prototypes into robust, scalable, and production-ready solutions. The ideal candidate possesses a strong understanding of machine learning principles, hands-on experience with various ML frameworks, and a passion for building high-quality, reliable systems.
Responsibilities:
ML Model Development & Implementation:
- Develop, train, and evaluate machine learning models using Python and relevant libraries (e.g., TensorFlow, PyTorch, scikit-learn).
- Implement machine learning models into production pipelines, ensuring scalability, reliability, and performance.
- Apply both classical machine learning techniques and deep learning approaches to solve complex problems.
- Optimize model performance and resource utilization.
Data Engineering & Pipeline Development:
- Design, build, and maintain ETL pipelines using PySpark and SQL to ingest, process, and transform large datasets.
- Utilize BigQuery for data warehousing and analysis.
- Ensure data quality and consistency throughout the ML pipeline.
Infrastructure & Deployment:
- Deploy and manage machine learning models using Docker and Kubernetes on GCP.
- Implement CI/CD pipelines for automated model deployment and testing.
- Monitor model performance in production and implement strategies for continuous improvement.
API Development:
- Create and manage API endpoints for model inference and data access.
- Ensure APIs are secure, efficient, and well-documented.
Collaboration & Communication:
- Collaborate with data scientists, engineers, and product managers to define project requirements and deliverables.
- Communicate technical concepts and solutions effectively to both technical and non-technical audiences.
- Participate in code reviews and contribute to the development of best practices.
- Document code, models, and processes thoroughly.
Software Engineering & Tooling:
- Develop and maintain Python packages to support ML development and deployment.
- Utilize Git for version control and collaboration.
- Work with distributed processing systems to handle large-scale data processing.
Exploration & Innovation:
- Stay up-to-date with the latest advancements in machine learning, including GenAI and LLMs.
- Experiment with new algorithms and techniques to improve model performance and efficiency.
- Contribute to the development of innovative ML solutions.
Web Application Development (Desired):
- Build basic web applications using React, Flask, or Streamlit for model serving or visualization.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Machine Learning, Statistics, or a related field.
- 4-6 years of experience in a Machine Learning Engineering role or similar.
- Expert proficiency in Python and experience with relevant libraries (e.g., TensorFlow, PyTorch, scikit-learn, pandas, NumPy).
- Extensive experience with PySpark for data processing and distributed computing.
- Solid understanding of SQL and experience working with relational databases.
- Experience with BigQuery for data warehousing and analysis.
- Experience with machine learning modeling concepts, including classical modeling and deep learning.
- Proven experience implementing machine learning models into production pipelines.
- Experience with GCP, Docker, and Kubernetes.
- Experience with CI/CD pipelines for automated model deployment and testing.
- Familiarity with ETL pipelines and data warehousing concepts.
- Proficiency with Git for version control.
- Experience with distributed processing systems.
- Some experience building Python packages.
- Basic knowledge of GenAI and LLMs is a plus.
- Basic knowledge of building web applications using React, Flask, or Streamlit is a plus.
- Strong problem-solving and analytical skills.
- Excellent communication and collaboration skills.
Good to Have:
- Experience with specific ML frameworks or tools relevant to Rakuten's work.
- Experience with MLOps practices and tools.
- Contributions to open-source projects.
- Publications in machine learning conferences or journals.
- Experience with specific GenAI/LLM frameworks.