Machine Learning Engineer/SRE

3 Months ago • All levels • DevOps

Job Summary

Job Description

This role combines machine learning expertise with SRE responsibilities. You will manage Azure infrastructure for AI model development and deployment, ensure model performance, and respond to incidents related to model operations. This position requires strong knowledge of Azure infrastructure, CI/CD, containerization, and machine learning.
Must have:
  • Azure Infrastructure Experience
  • CI/CD Pipeline Experience
  • Containerization in the Cloud
  • Machine Learning Expertise
  • Programming Skills
  • Data Management
  • Collaborative Team Player
  • Documentation

Job Details

Description

  • Manage Azure Infrastructure: Configure, maintain, and optimize Azure infrastructure for AI model development and deployment, ensuring scalability and performance.
  • Model Performance Monitoring: Implement and maintain monitoring systems to track model performance, proactively identifying and addressing issues as they arise.
  • Incident Response: Collaborate with the SRE team to respond promptly to outages and incidents related to model operations, ensuring minimal downtime and rapid issue resolution.

Requirements

  • Azure Infrastructure Experience: Proficiency in managing Azure infrastructure components, including virtual machines, storage, and networking, to support AI model development and deployment.
  • CI/CD Pipeline Experience: Experience with Continuous Integration/Continuous Deployment (CI/CD) pipelines, including the automation of model deployment processes.
  • Containerization in the Cloud: Strong knowledge of containerization technologies in the cloud, such as Docker and Kubernetes, for efficient deployment and scaling of machine learning models.
  • Machine Learning Expertise: Proficiency in building and optimizing machine learning models, with a deep understanding of various ML algorithms and frameworks.
  • Programming Skills: Proficiency in programming languages commonly used in machine learning, such as Python, and in libraries such as TensorFlow and PyTorch.
  • Data Management: Experience in data preprocessing, feature engineering, and data pipeline development for machine learning.
  • Collaborative Team Player: Excellent communication skills and the ability to work collaboratively with cross-functional teams, including AI engineers and SREs.
  • Documentation: Effective documentation skills to maintain clear and organized records of models, infrastructure configurations, and incident responses.



