Software Engineer (AI Performance)

All levels • $150,000 PA - $350,000 PA
Research Development

Job Description

Gimlet Labs is building the foundation for the next generation of AI applications, focusing on redefining AI inference for breakthrough performance, efficiency, and model quality. They combine cutting-edge research with an integrated hardware-software stack and a seamless developer experience for deploying, managing, and monitoring AI workloads. The company is seeking a Software Engineer focused on AI Performance to research and implement techniques for performance and quality optimizations across the latest AI models, including quantization, KV caching, FlashAttention, and designing parallelism strategies.

Gimlet Labs is building the foundation for the next generation of AI applications. As generative AI workloads rapidly scale, inference efficiency is becoming the critical bottleneck. Gimlet is redefining AI inference from the ground up, combining cutting-edge research with an integrated hardware-software stack that delivers breakthrough performance, efficiency, and model quality. Gimlet pairs its inference stack with a seamless developer experience, allowing users to deploy, manage, and monitor AI workloads from frameworks like PyTorch and LangChain at production scale in seconds.

Gimlet was spun out of a Stanford research project under Professors Zain Asgar and Sachin Katti. The founding team has deep experience across AI, distributed systems, and hardware, along with previous successful exits.

Gimlet Labs is seeking a Software Engineer focused on AI Performance. You will be researching and implementing techniques to drive performance and quality optimizations across the latest AI models. You will implement techniques such as quantization, KV caching, and FlashAttention to enable inference efficiency. You will design parallelism strategies to distribute data and workloads across compute nodes at production scale. You will dive deep into GPU code and kernel optimizations to accelerate AI workloads.

Responsibilities:

  • Evaluating and implementing cutting-edge AI research for model performance and efficiency
  • Architecting infrastructure for distributed AI workloads across both the software stack and GPU kernel layers
  • Profiling, benchmarking, and analyzing system performance, identifying bottlenecks and optimization opportunities in execution runtimes targeting various hardware systems

Qualifications:

  • Bachelor’s degree in computer science, engineering, applied mathematics, or a comparable area of study
  • Experience with performance optimization

Preferred Qualifications:

  • Graduate degree in computer science, engineering, applied mathematics, or a comparable area of study
  • Familiarity with compilers and compiler frameworks such as MLIR
  • Experience with PyTorch, TensorFlow, vLLM, ONNX, and other AI frameworks
  • Software development experience with Python, C++, and CUDA
