Full Stack LLM Engineer

Cerebras Systems

Toronto, Ontario, Canada (On Site) | Full Time | 3 weeks ago

Job Summary

Cerebras Systems is seeking a versatile and experienced Full Stack LLM Engineer to join their Inference Core Model Bringup team. This role involves rapidly bringing up state-of-the-art open-source and proprietary models on Cerebras CSX systems. The engineer will work across the entire software stack, contributing to model architecture translation, compiler optimizations, runtime integration, and performance tuning to achieve unprecedented AI application performance, efficiency, and scalability.

Must Have

Contribute to the end-to-end bring-up of ML models on Cerebras CSX systems.
Work across the stack: model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning.
Debug performance and correctness issues spanning model code, compiler IRs, runtime behavior, and hardware utilization.
Propose and prototype improvements across tools, APIs, or automation flows to accelerate future bring-ups.
Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related field.
Comfort navigating the full AI toolchain: Python modeling code, compiler IRs, performance profiling, etc.
Strong debugging skills across performance, numerical accuracy, and runtime integration.
Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and familiarity with model internals (e.g., attention, MoE, diffusion).
Proficiency in C/C++ programming and experience with low-level optimization.
Proven experience in compiler development, particularly with LLVM and/or MLIR.
Strong background in optimization techniques, particularly those involving NP-hard problems.

Perks & Benefits

Competitive salary and benefits package.
Opportunities for professional growth and career advancement.
A dynamic and innovative work environment.
The chance to work on cutting-edge technologies and make a significant impact on the future of AI.

Job Description

Cerebras Systems builds the world's largest AI chip, 56 times larger than the largest GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs. Cerebras' current customers include global corporations across multiple industries, national labs, and top-tier healthcare systems. In January, we announced a multi-year, multi-million-dollar partnership with Mayo Clinic, underscoring our commitment to transforming AI applications across various fields. In August, we launched Cerebras Inference, the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services.

About the Role

We are seeking a versatile and experienced engineer to join our Inference Core Model Bringup team. This team is responsible for rapidly bringing up state-of-the-art open-source models (such as LLaMA and Qwen) or customer-provided proprietary models on our CSX systems. Success in this role requires a systems-minded generalist who thrives in fast-paced bringup environments and is comfortable working across the entire software stack. Your work will play a critical role in achieving unprecedented levels of performance, efficiency, and scalability for AI applications.

Why Join Cerebras

People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:

1. Build a breakthrough AI platform beyond the constraints of the GPU.

2. Publish and open source their cutting-edge AI research.

3. Work on one of the fastest AI supercomputers in the world.

4. Enjoy job stability with startup vitality.

5. Enjoy a simple, non-corporate work culture that respects individual beliefs.

Read our blog: Five Reasons to Join Cerebras in 2025.

Apply today and join the forefront of groundbreaking advancements in AI!

---

Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.
