The issue with just trusting the AI benchmarks

I’m not sure why we all get super excited when a new model comes out that beats another model by 2%. What does that even mean anymore?

Benchmarks are not necessarily a good measure of how a model will work for you because:

  • they don’t reflect your use case, so there’s no guarantee that the reported scores will carry over to it

  • even Chatbot Arena has shown interesting bia…

Subscribe to Startup Monologues with @dianasaurbytes to listen to this post and get 7 days of free access to the full post archives.