This role involves driving forward-looking Generative AI (GenAI) machine learning architecture exploration for Tensor mobile SoCs. Close collaboration with research, system architecture, and compiler engineering teams is essential to optimize future workloads across the entire technology stack (hardware, software, use cases, network, and external components).

Responsibilities include working with researchers and program management to define system architecture requirements for future GenAI use cases, applying advanced research to achieve breakthrough power and performance improvements on GenAI workloads, and optimizing GenAI performance by defining optimal model scheduling on TPU compute engines.

The ideal candidate has expertise in computer architecture, performance, and compilers, along with experience in GenAI model architectures (LLMs, Vision Transformers, etc.), programming languages (C/C++, Python), and deep learning frameworks (TensorFlow, JAX, PyTorch).