Posted 1w ago

Research Engineer - The Diffusion LLM Team

@ Institute of Foundation Models
Sunnyvale, California, United States
Onsite, Full Time
Responsibilities: Designing models, Training models, Scaling models
Requirements Summary: MSc/PhD in ML or CS; hands-on large-model training; transformer architectures; diffusion models knowledge a plus; publications/open-source contributions; independent yet collaborative.
Technical Tools Mentioned: PyTorch, TensorFlow, JAX, Distributed Training
Job Description

About the Institute of Foundation Models


We are a dedicated research lab for building, understanding, using, and risk-managing foundation models. Our mandate is to advance research, nurture the next generation of AI builders, and drive transformative contributions to a knowledge-driven economy.

 

As part of our team, you’ll have the opportunity to work on the core of cutting-edge foundation model training, alongside world-class researchers, data scientists, and engineers, tackling the most fundamental and impactful challenges in AI development. You will participate in the development of groundbreaking AI solutions that have the potential to reshape entire industries. Strategic and innovative problem-solving skills will be instrumental in establishing MBZUAI as a global hub for high-performance computing in deep learning, driving impactful discoveries that inspire the next generation of AI pioneers.



The Role

As a member of the Diffusion LLM Team at MBZUAI, you will play a central role in designing, building, and releasing industrial-scale Diffusion Large Language Models. Our team has two core missions. First, we develop and release diffusion-based LLMs that push the speed–quality frontier at scale by matching autoregressive model quality while enabling faster generation. Second, we improve inference-time scaling relative to standard LLMs, so that additional test-time compute translates into higher-quality samples. 

You will work closely with researchers and engineers across architecture, training, and infrastructure to turn research ideas into high-impact model releases for next-generation LLMs.



Key Responsibilities

  • Design, train, and scale large language models for research and real-world deployment.

  • Lead or contribute to the release of industrial-scale diffusion language models.

  • Develop and evaluate training strategies and objectives for efficient model scaling.

  • Publish research findings and contribute to open-source model and code releases.


Academic Qualifications

MSc or PhD in Machine Learning or Computer Science, or equivalent industry experience.



Professional Experience

  • Hands-on experience training large models using modern deep learning frameworks at scale.

  • Strong background in transformer architectures and large-scale optimization techniques.

  • Demonstrated expertise in LLM pre-training or post-training, with a strong focus on model scaling.

  • Research track record evidenced by publications, open-source contributions, or released models.

  • Knowledge of diffusion models or discrete diffusion methods is a plus, but not required.

  • Ability to work independently while contributing effectively to a collaborative research team.