Alibaba's ZeroSearch - cheaper RL using synthetic data

Models with built-in search capabilities are expensive to train, especially when reinforcement learning relies on live API calls to search engines.

With ZeroSearch, Alibaba shows how a smaller LLM can be trained to generate synthetic search results, which are then used to teach other LLMs search capabilities.

Subscribe to Startup Monologues with @dianasaurbytes to listen to this post and get 7 days of free access to the full post archives.