Benchmarks

This section catalogues benchmarks that measure how well artificial intelligence systems understand and operate in the wine domain. Each benchmark documented here is a transparent, reproducible test: the question set is published, the scoring rubric is open, and model results are versioned alongside the underlying data so that the leaderboard can be re-run as new models appear.

The first benchmark hosted on the site is OenoBench, a wine knowledge evaluation built from roughly five thousand questions spanning viticulture, winemaking, the wine business, and the world’s wine regions. Additional benchmarks covering vineyard imagery, sensory description, and sommelier-style reasoning will be added as they are designed and validated.