Abstracts: NeurIPS 2024 with Weizhu Chen 

Next-token prediction trains a language model on all tokens in a sequence. VP Weizhu Chen discusses his team’s 2024 NeurIPS paper on how distinguishing between useful and “noisy” tokens in pretraining can improve token efficiency and model performance.

Microsoft Research Podcast

An ongoing series of conversations bringing you right up to the cutting edge of Microsoft Research.