Contribute to developing one of the world's best foundational AI models at Microsoft AI. The Pre-Training team focuses on challenging deep learning problems at scale. Responsibilities include developing algorithms, model architectures, data mixtures, and scaling laws for large-scale training using a data-driven approach. This involves algorithmic implementation, experimentation, overseeing flagship training runs on a large-scale distributed stack, and close collaboration with infrastructure, data, post-training, and multimodality teams. The ideal candidate will have proven expertise in pre-training, demonstrated by a strong publication record and technical leadership in high-impact projects. Strong analytical skills, attention to detail, and experience with large-scale distributed systems are essential.