fal is pioneering the next generation of generative-media infrastructure. We’re pushing the boundaries of model inference performance to power seamless creative experiences at unprecedented scale. We’re looking for a Staff Technical Lead for Inference & ML Performance: someone who blends deep technical expertise with strategic vision to guide a team building and optimizing state-of-the-art inference systems. This role is intense yet deeply impactful. Apply if you’re ready to lead the future of inference performance at a fast-paced, high-growth company on the frontier of generative media.
You’ll shape the future of fal’s inference engine and ensure our generative models achieve best-in-class performance. Your work directly impacts our ability to rapidly deliver cutting-edge creative solutions to users, from individual creators to global brands.
| Day-to-day | What success looks like |
| --- | --- |
| Set technical direction. Guide your team (kernels, applied performance, ML compilers, distributed inference) to build high-performance inference solutions. | fal’s inference engine consistently leads industry benchmarks for throughput, latency, and efficiency. |
| Hands-on IC leadership. Personally contribute to critical inference performance enhancements and optimizations. | You regularly ship code that significantly improves model serving performance. |
| Collaborate closely with research & applied ML teams. Influence model inference strategies and deployment techniques. | Inference innovations move seamlessly and rapidly from research to production deployment. |
| Drive advanced performance optimizations. Implement model parallelism, kernel optimization, and compiler strategies. | Performance bottlenecks are quickly identified and eliminated, dramatically enhancing inference speed and scalability. |
| Mentor and scale your team. Coach and expand your team of performance-focused engineers. | Your team independently innovates, proactively solves complex performance challenges, and consistently levels up their skills. |
This is one of the highest-impact roles at one of the fastest-growing companies (revenue growing 40% MoM, 60x+ RR compared to last year, Series A/B/C raised within the last 12 months), with a world-changing vision: hyperscaling human creativity.
Sound like your calling? Share your proudest optimization breakthrough, open-source contribution, or performance milestone with us. Let's set new standards for inference performance, together.