Member of Technical Staff, Multimedia (Vision)

1 Hour ago • 3 Years + • Software Development & Engineering

Job Summary

Job Description

Fireworks AI is seeking a Member of Technical Staff specializing in vision-language modeling to advance its generative AI platform. This role involves leading research and development in multimodal models, from data preparation to deployment, and building production-quality systems. Responsibilities include designing and implementing scalable machine learning systems for tasks like image captioning and visual question answering, training large-scale VLMs using techniques like LoRA/QLoRA and distributed training, writing production-ready Python code, and analyzing model performance. Collaboration with engineering, product, and design teams, as well as direct customer interaction, is crucial for translating VLM capabilities into real-world applications and contributing to the platform roadmap.
Must have:
  • 3 years of ML experience
  • Focus on computer vision, NLP, or multimodal systems
  • Proficiency in Python and deep learning frameworks
  • Experience training/deploying large models
  • Ability to write production-quality code
  • Customer interaction experience
Good to have:
  • Master's or PhD in a relevant field
  • Research experience in VLM or multimodal modeling
  • Experience with multimodal training/fine-tuning
  • Familiarity with LLMs and visual encoders
  • Open-source contributions or top-tier publications
Perks:
  • Solve hard problems at the forefront of AI infrastructure
  • Build cutting-edge technology impacting AI adoption globally
  • Ownership and impact in a fast-growing team
  • Collaborate with world-class engineers and researchers

Job Details

About Us:

Here at Fireworks, we’re building the future of generative AI infrastructure. Fireworks offers the generative AI platform with the highest-quality models and the fastest, most scalable inference. We’ve been independently benchmarked to have the fastest LLM inference and have been getting great traction with innovative research projects, like our own function calling and multimodal models. Fireworks is funded by top investors, like Benchmark and Sequoia, and we’re an ambitious, fun team composed primarily of veterans from PyTorch and Google Vertex AI.

The Role: 

We are looking for a highly motivated Member of Technical Staff with expertise in vision-language modeling to join our research and engineering team. This role will drive advancements in our multimodal models and applications that combine visual understanding with natural language. You’ll be responsible for conducting cutting-edge research and building production-quality systems that bring state-of-the-art VLM capabilities into real-world products.

Key Responsibilities: 

  • Lead research and development efforts in vision-language models, including data preparation, model training, evaluation, and deployment.
  • Collaborate with teams in engineering, product, and design, and work directly with customers to understand their needs and translate VLM capabilities into real-world applications.
  • Design and implement scalable machine learning systems for tasks such as image captioning, visual question answering, retrieval, grounding, and multimodal reasoning.
  • Train large-scale VLMs using techniques such as parameter-efficient fine-tuning (LoRA/QLoRA), reinforcement learning approaches, dataset curation and preparation, distributed training (DDP/FSDP), and hyperparameter optimization.
  • Build robust, maintainable code in Python for both experimentation and production use.
  • Analyze model performance, conduct rigorous evaluations, and experiment based on empirical insights.
  • Contribute to the platform roadmap by providing technical insights into quality improvements, integrating the latest multimodal research, and identifying and proposing new platform capabilities with significant commercial potential.

Minimum Qualifications: 

  • Bachelor’s degree in Computer Science, Electrical Engineering, or a related field.
  • 3 years of experience in machine learning, with a focus on computer vision, NLP, or multimodal systems.
  • Strong proficiency in Python and deep learning frameworks such as PyTorch or TensorFlow.
  • Experience training and deploying large-scale models and working with distributed computing environments.
  • Demonstrated ability to write production-quality code and collaborate across teams.
  • Experience working directly with customers, partners, or external stakeholders to define use cases or requirements.

Preferred Qualifications: 

  • Master’s or PhD in a relevant technical field with research experience in vision-language or multimodal modeling.
  • Experience with multimodal training/fine-tuning and downstream tasks like VQA, captioning, or retrieval.
  • Familiarity with large language models (LLMs) and their integration with visual encoders.
  • Contributions to open-source projects or publications in top-tier ML/AI conferences (e.g., CVPR, ICCV, NeurIPS, ICML, ACL).
  • Comfort working in fast-paced, cross-disciplinary environments and shipping research into production.

Why Fireworks AI?

  • Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
  • Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
  • Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
  • Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.

Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.


