Job Type

contract

Description

Huawei Canada has an immediate 12-month contract opening for an Engineer.

About the team:

The Software-Hardware System Optimization Lab continuously improves the power efficiency and performance of smartphone products through software-hardware system optimization and architecture innovation. We track trends in cutting-edge technologies and build competitive strength in mobile AI, graphics, multimedia, and software architecture for mobile phone products.

About the job:

Design and build scalable infrastructure to support Reinforcement Learning, Online Search, Recommendation Systems, and large-model fine-tuning, evaluation, and deployment.
Develop efficient ML solutions for Recommendation Systems and RL problems, including Multi-Armed and Contextual Bandits, Tree Search, and Multi-Agent system orchestration.
Implement and optimize deep learning architectures, including custom Transformers for agentic and decision-making systems.
Apply search and optimization techniques to efficiently fine-tune RL and ML models.
Work with large multimodal models (LLMs, VLMs), analyze their components, and fine-tune them for task-specific applications.
Conduct systematic benchmarking, literature review, experimentation, and validation in both simulation and real-world product environments.
Collaborate closely with research teams to scale online RL training capabilities and improve system robustness and accuracy.
Explore and integrate emerging AI methodologies and tools into production platforms.

About the ideal candidate:

Master’s or PhD in Computer Science, Machine Learning, or a related field.
Excellent Python programming skills with strong software engineering practices.
Strong foundation in Reinforcement Learning, Deep Learning, Recommender Systems, and Transformer-based architectures.
Demonstrated experience implementing RL algorithms beyond academic prototypes.
Hands-on experience with PyTorch and distributed training frameworks such as DeepSpeed.
Proven research excellence, including at least one publication in top-tier venues (e.g., NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, ICRA, RLC).
Familiarity with LLM post-training techniques such as RLHF, PPO/GRPO, SFT, LoRA, or MoE is a strong asset.
Experience with multi-agent RL systems or tool-use agents is an advantage.

Skills

software architecture, deep learning, transformers, llms, machine learning, python, algorithms, pytorch, llm, rlhf, reinforcement learning, fine tuning, experimentation, distributed training

About Huawei Technologies Canada Co., Ltd.

Huawei Technologies Canada Co., Ltd. is the Canadian subsidiary of Huawei, a global information and communications technology company. The company operates in the information technology industry and has a presence in Ottawa, Ontario, Canada, focusing on areas such as digital IC design and AI-assisted design.

Engineer - ML & RL