Member of Technical Staff, AI Data
Microsoft
Job Summary
Microsoft AI is seeking a Member of Technical Staff to build the world's most advanced multimodal dataset. The role involves designing and developing data pipelines that ingest massive amounts of multimodal data (text, audio, images, video), building and maintaining infrastructure to store and process petabytes of data, and collaborating with pre-training and post-training teams to improve data quality through experimentation. The ideal candidate will partner with product teams and researchers to identify model gaps, and will bring strong data engineering skills, a proactive attitude, and the ability to manage multiple responsibilities in a fast-paced environment.
Must Have
- Design and develop data pipelines
- Build and maintain data infrastructure
- Collaborate with pre-training teams
- Identify model gaps
- Experience in data engineering
Job Description
We are looking for candidates who:
- Are passionate about the role of data in large-scale AI model training
- Will thrive in a highly collaborative, fast-paced environment
- Have a high degree of craftsmanship and pay close attention to details
- Demonstrate a proactive attitude and enthusiasm for exploring new methods and technologies
- Effectively manage multiple responsibilities and adjust to shifting priorities
Responsibilities
- Design and develop data pipelines that ingest enormous amounts of multimodal training data (text, audio, images, video).
- Build and maintain cutting-edge infrastructure that can store and process the petabytes of data needed to power models.
- Partner with the pre-training and post-training teams to improve our data recipe through rigorous, careful experimentation.
- Collaborate with the product team and other engineers and researchers across Microsoft AI to identify gaps in the current generation of models.
- Embody our culture and values.
Required Qualifications
- Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 6 years of experience in business analytics, data science, software development, data modeling, or data engineering work
- OR Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 4 years of experience in business analytics, data science, software development, or data engineering work
- OR equivalent experience.
Preferred Qualifications
- Bachelor's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 8 years of experience in business analytics, data science, software development, data modeling, or data engineering work
- OR Master's Degree in Computer Science, Math, Software Engineering, Computer Engineering, or related field AND 6 years of experience in business analytics, data science, software development, data modeling, or data engineering work
- OR equivalent experience.