Responsibilities:
• Research, develop, and optimize automatic speech recognition (ASR) models to achieve state-of-the-art performance
• Implement and fine-tune ASR systems for various languages, accents, and acoustic environments
• Collaborate with cross-functional teams to integrate ASR capabilities into products and services
• Stay current with the latest advancements in speech recognition research and technology
• Design and conduct experiments to improve ASR accuracy, latency, and robustness
Requirements:
• Degree in Computer Science, Electrical Engineering, Linguistics, or a related field
• At least 2 years of experience developing speech recognition systems or related audio ML technologies
• Strong hands-on development experience with Python and ASR frameworks
• Proficiency in PyTorch, TensorFlow, or other deep learning frameworks
• Experience with transformer-based architectures, CTC, RNN-T, and other modern ASR modeling approaches
• Knowledge of audio signal processing techniques and feature extraction methodologies
• Familiarity with language modeling and acoustic modeling for speech recognition
• Experience with cloud platforms (AWS, GCP, Azure) for model training and deployment
• Strong problem-solving skills and ability to optimize models for production environments
• Excellent communication skills to collaborate effectively with technical and non-technical teams
• Fluency in written and spoken English and Cantonese; additional language skills are an advantage
Preferred Qualifications:
• Experience with multilingual or code-switching ASR systems
• Knowledge of end-to-end ASR architectures and self-supervised learning approaches
• Experience with streaming/real-time ASR implementation
• Familiarity with ASR post-processing techniques and language model integration
• Background in MLOps for deploying and monitoring ASR systems at scale
When applying, please include:
• Your CV
• A link to your GitHub profile or personal repository
• Links to any relevant publications, demos, or projects demonstrating your ASR expertise
• A brief description of your experience with speech recognition technologies