Sr. AI Ops Engineer

1 Month ago • 3 Years + • $116,000 PA - $226,600 PA

Job Summary

Job Description

Calix is seeking a highly skilled AI Ops Engineer to join their AI/ML team. This role involves building, scaling, and maintaining the infrastructure for machine learning and generative AI applications. The engineer will collaborate with data scientists and developers to ensure the ML/AI systems are robust and production-ready. Key responsibilities include designing and maintaining scalable infrastructure, deploying and troubleshooting pipelines, building CI/CD pipelines, scaling compute resources, implementing container orchestration with Kubernetes, and optimizing cloud resources on GCP. The engineer will also set up and maintain runtime frameworks, establish monitoring and alerting systems, optimize system performance, and develop AIOps best practices.
Must have:
  • 5+ years of software engineering experience
  • 3+ years of experience in DevOps/AIOps
  • Experience with Docker and Kubernetes
  • Expertise in cloud infrastructure management
  • Proficiency with workflow management tools such as Airflow
Good to have:
  • Knowledge of monitoring and logging tools
  • Proficiency in Python
  • Proficiency in performance-oriented languages
  • Familiarity with ML frameworks and platforms

Job Details

Calix provides the cloud, software platforms, systems and services required for communications service providers to simplify their businesses, excite their subscribers and grow their value.

Calix is seeking a highly skilled AI Ops Engineer to join our cutting-edge AI/ML team. In this role, you will be responsible for building, scaling, and maintaining the infrastructure that powers our machine learning and generative AI applications. You will work closely with data scientists, ML engineers, and software developers to ensure our ML/AI systems are robust, efficient, and production-ready.

This is a remote-based position that can be located anywhere in the United States or Canada.

Key Responsibilities:

  • Design, implement, and maintain scalable infrastructure for ML and GenAI applications

  • Deploy, operate, and troubleshoot production ML/GenAI pipelines/services

  • Build and optimize CI/CD pipelines for ML model deployment and serving

  • Scale compute resources across CPU/GPU architectures to meet performance requirements

  • Implement container orchestration with Kubernetes

  • Architect and optimize cloud resources on GCP for ML training and inference

  • Set up and maintain runtime frameworks and job management systems (Airflow, Kubeflow, MLflow, etc.)

  • Establish monitoring, logging and alerting for systems observability

  • Optimize system performance and resource utilization for cost efficiency

  • Develop and enforce AIOps best practices across the organization

Qualifications:

  • Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent experience)

  • 5+ years of overall software engineering experience

  • 3+ years of focused experience in DevOps/AIOps or similar ML infrastructure roles

  • Strong experience with containerization and orchestration using Docker and Kubernetes

  • Demonstrated expertise in cloud infrastructure management, preferably on GCP (AWS or Azure experience also valued)

  • Proficiency with workflow management tools such as Airflow and Kubeflow

  • Strong CI/CD expertise with experience implementing automated testing and deployment pipelines

  • Experience scaling distributed compute architectures across CPU and GPU hardware

  • Solid understanding of system performance optimization techniques

  • Experience implementing comprehensive observability solutions for complex systems

  • Knowledge of monitoring and logging tools (Prometheus, Grafana, ELK stack)

  • Strong proficiency in Python

  • Proficient in at least one of the following performance-oriented programming languages: C, C++, Go, Rust

  • Familiarity with ML frameworks such as PyTorch and ML platforms like SageMaker or Vertex AI

  • Excellent problem-solving skills and ability to work independently

  • Strong communication skills and ability to work effectively in cross-functional teams

#LI-Remote

Compensation will vary based on geographical location (see below) within the United States. Individual pay is determined by the candidate's location of residence and multiple factors, including job-related skills, experience, and education.

For more information on our benefits click here.

There are different ranges applied to specific locations. The average base pay range (or OTE range for sales) in the U.S. for the position is listed below.

San Francisco Bay Area Only:

133,400.00 - 226,600.00 USD Annual

All Other Locations:

116,000.00 - 197,000.00 USD Annual

About The Company

Calix delivers a broadband platform and managed services that enable our customers to improve life one community at a time. We’re at the forefront of a once-in-a-generation change in the broadband industry. Join us as we innovate, help our customers reach their potential, and connect underserved communities with unrivaled digital experiences. This is the Calix mission: to enable CSPs of all sizes to Simplify. Innovate. Grow.

To learn more, visit the Calix website at www.calix.com. To learn more about our international job opportunities, please visit our International Careers Page.

If you are a person with a disability needing assistance with the application process, please email us at calix.interview@calix.com or call us at +1 (408) 514-3000.
