Principal Machine Learning Operations Developer - Inference Optimization


Job Summary

The Senior Machine Learning Operations Developer will design, develop, and implement strategies to optimize AI/ML inference pipelines for performance, scalability, and cost efficiency. They will collaborate with other engineers and cross-functional teams to integrate optimized solutions into production environments, driving innovation in hardware acceleration, quantization, model compression, and distributed inference techniques. The role involves staying up-to-date with LLM hosting frameworks, optimizing systems, conducting performance tuning, and managing model repositories. This position is based in Montreal, with hybrid work opportunities.
Must have:
  • 10+ years of software engineering with focus on AI/ML.
  • Expertise in AI model optimization techniques.
  • Proficiency in programming languages such as Python, C++, or Rust.
  • Experience with AI/ML frameworks like TensorFlow, PyTorch, and ONNX.
  • Hands-on experience with GPU/TPU acceleration and deployment.
  • Strong DevOps mindset and experience with Kubernetes, containers, etc.
  • Strong problem-solving skills and ability to make data-driven decisions.
  • Excellent communication skills for complex technical concepts.
Good to have:
  • Experience with Kubernetes, Docker, and CI/CD pipelines.
  • Familiarity with MLOps practices and tools.
  • Familiarity with performance tuning of inference engines.
  • Understanding of LLM architecture and optimization.
  • Contributions to open-source AI/ML projects.
  • Familiarity with automotive or transportation industry applications.
Perks:
  • Annual bonus opportunity
  • Insurance coverage (medical, dental, vision, life, and disability)
  • Paid time off
  • Paid holidays
  • Company contribution to the RRSP (Registered Retirement Savings Plan)
  • Equity awards for certain positions and levels
  • Remote and/or hybrid work available

Job Details

A Moving Experience.


Job Description

Do you have a passion for pushing the boundaries of innovation? Are you excited about AI’s potential to improve the human experience? Then come join the ride!

Who is Cerence AI?

Cerence AI is the global leader in AI for transportation, specialized in building AI and voice-powered companions for cars, two-wheelers, and more that enable people to focus on what matters most. With over 500 million cars shipped with Cerence AI technology, we partner with leading automakers (such as Volkswagen, Mercedes, Audi, Toyota, and many more), mobility providers, and technology companies to power intuitive, integrated experiences that create safer, more connected, and more enjoyable journeys for drivers and passengers alike.

Our Driving Force

Our team is dedicated to pushing the boundaries of AI innovation, working around the globe with headquarters in Burlington, Massachusetts, USA and 16 other offices across Europe, Asia, and North America. We bring together diverse backgrounds and varied skill sets with the shared goal of advancing the next generation of transportation user experiences. Our culture is customer-centric, collaborative, fast-paced, and fun, with continuous opportunities for learning and development to support your career growth.

Interested in having a significant impact in a dynamic industry with a high-performing global team? We’re looking for an exceptional Senior Machine Learning Operations Developer who is ready to drive the future of mobility with us!

Your Impact:

  • Design, develop, and implement strategies to optimize AI/ML inference pipelines for performance, scalability, and cost efficiency.

  • Collaborate closely with other Principal and Senior Engineers on the team, fostering a culture of knowledge-sharing and joint problem-solving.

  • Work with cross-functional teams, including MLOps, data science, and software engineering, to integrate optimized inference solutions into production environments.

  • Drive innovation in hardware acceleration, quantization, model compression, and distributed inference techniques.

  • Stay up to date with LLM hosting frameworks and their configuration at both the machine and cluster level (e.g., vLLM, TensorRT, KubeFlow).

  • Optimize systems using techniques such as batching, caching, and speculative decoding.

  • Conduct performance tuning, benchmarking, and profiling for inference systems, with expertise in memory management, threading, concurrency, and GPU optimization.

  • Manage model repositories, artifact delivery, and related infrastructure.

  • Develop and maintain logging mechanisms for diagnostics and research purposes.

Required Qualifications:

  • 10+ years of experience in software engineering, with a focus on AI/ML.

  • Deep expertise in AI model optimization techniques, including quantization, pruning, knowledge distillation, and hardware-aware model design.

  • Proficiency in programming languages such as Python, C++, or Rust.

  • Experience with AI/ML frameworks such as TensorFlow, PyTorch, and ONNX.

  • Hands-on experience with GPU/TPU acceleration and deployment in cloud and edge environments.

  • Strong DevOps mindset with experience in Kubernetes, containers, deployments, dashboards, high availability, autoscaling, metrics, and logs.

  • Strong problem-solving skills and the ability to make data-driven decisions.

  • Excellent communication skills and the ability to articulate complex technical concepts to a diverse audience.

Preferred Qualifications:

  • Experience with Kubernetes, Docker, and CI/CD pipelines for AI/ML workloads.

  • Familiarity with MLOps practices and tools, including model versioning and monitoring.

  • Familiarity with performance tuning of inference engines such as vLLM and techniques such as LoRA adapters.

  • Understanding of LLM architecture and optimization.

  • Contributions to open-source AI/ML projects.

  • Familiarity with automotive or transportation industry applications.

  • Master’s or Ph.D. in Computer Science, Machine Learning, or a related field.

What We Offer:

  • The opportunity to join a brand-new team focused on cutting-edge AI/ML advancements.

  • A collaborative and inclusive work environment with a strong emphasis on innovation.

  • A competitive salary and a comprehensive benefits package.

  • Opportunities for professional development and career advancement.

  • The chance to work with cutting-edge technologies and make a real impact.

Location:
This position is based in Montreal, with opportunities for hybrid work arrangements. Remote candidates based in the United States or Canada with relevant backgrounds are encouraged to apply.

Join Us:
If you are passionate about AI/ML and eager to collaborate on transformative projects in inference optimization, we want to hear from you. Apply now and become part of Cerence AI’s journey to redefine connected mobility!

Compensation and Benefits

We offer a generous compensation and benefits package, in addition to the base salary, including:

  • Annual bonus opportunity

  • Insurance coverage (medical, dental, vision, life, and disability)

  • Paid time off

  • Paid holidays

  • Company contribution to the RRSP (Registered Retirement Savings Plan)

  • Equity awards for certain positions and levels

  • Remote and/or hybrid work available depending on the position

All compensation and benefits are subject to the terms and conditions of the underlying plans or programs, as applicable, and may be amended, terminated, or replaced from time to time.


Cerence Inc. (Nasdaq: CRNC; www.cerence.com) is the global industry leader in creating unique, moving experiences for the automotive world. Spun out from Nuance in October 2019, Cerence is a new, independent company that has quickly gained traction as a leader in the automotive voice assistant space, working with all of the world’s leading automakers – from Ford and Fiat Chrysler to Daimler, Audi, and BMW to Geely and SAIC – to transform how a car feels, responds, and learns. Its track record is built on more than 20 years of industry experience and leadership, and more than 500 million cars on the road today across more than 70 languages.

 

As Cerence looks to the future and continues an ambitious growth agenda, we need someone to join the team and help build the future of voice and AI in cars. This is an exciting opportunity to join Cerence’s passionate, dedicated, global team and be a part of meaningful innovation in a rapidly growing industry. 

EQUAL OPPORTUNITY EMPLOYER

Cerence is firmly committed to Equal Employment Opportunity (EEO) and to compliance with all federal, state and local laws that prohibit employment discrimination on the basis of age, race, color, gender, gender identity, gender expression, sex, sex stereotyping, pregnancy, national origin, ancestry, religion, physical or mental disability, medical condition, marital status, citizenship status, sexual orientation, protected military or veteran status, genetic information and other protected classifications. Cerence Equal Employment Opportunity Policy Statement.

All prospective and current employees need to remain vigilant when it comes to executing security policies in the workplace. This includes:

- Following workplace security protocols and completing training programs to become familiar with ways to maintain a safe workplace.
- Following security procedures to report any suspicious activity.
- Respecting corporate security procedures so that those procedures remain effective.
- Adhering to the company's compliance policies and regulations.
- Supporting a zero-tolerance policy for workplace violence.
- Maintaining basic knowledge of information security and data privacy requirements (e.g., how to protect and handle data).
- Demonstrating knowledge of information security through internal training programs.


About The Company

We’re creating moving experiences for vehicles around the world. We’re Cerence. We utilize sophisticated A.I. and sensor data to entertain, inform and delight drivers and passengers. Whether it’s voice, gesture, gaze or touch technologies, the experience is the sum of the parts. Raise windows with a quick glance, hear a restaurant review with the point of a finger, display an augmented reality cityscape on a windshield, drive with just the sound of your voice. The future is connected cars, autonomous driving, ride sharing and e-vehicles.
