Networks of networks (NoNs), which compose many inference calls to multiple monolithic AI models, can significantly improve system accuracy for certain subjects. But given their potential complexity, what principles can we use to guide the composition of NoNs? Read the full paper from our founder and CEO Jared Quincy Davis and co-authors Boris Hanin, Lingjiao Chen, Peter Bailis, Ion Stoica, and Matei Zaharia 👇 https://lnkd.in/g3t9Z9gU
In this new article, my co-authors Boris Hanin, Lingjiao Chen, Peter Bailis, Ion Stoica, and Matei Zaharia and I explore one of the most powerful ideas we have yet found for informing compound AI system design: verifiability.

In common situations where practitioners are willing to spend a larger budget to push beyond the capabilities frontier of today's state-of-the-art (SOTA) monolithic models, they may invoke many model inference calls, composing them into "networks of networks" (NoNs). The question then becomes: what principles should guide the composition of these NoNs?

Inspired by the notion from theoretical computer science (TCS) and probabilistically checkable proofs (PCPs) that verification is often easier than generation (as holds for classical problems like graph coloring), we construct "best-of-K" or "judge-based" compound AI systems, which explicitly separate "generator" modules from "verifier" modules. We posit that these systems are particularly helpful for reasoning-based or procedural-knowledge tasks, which are often more verifiable, and less so for factual or declarative-knowledge settings. (These systems can also be used, in part, to characterize tasks along these lines, including the subjects in MMLU.)

Very neatly, it turns out we can analytically characterize when these systems confer a gain and predict that gain's extent. We hope others will extend these ideas to tackle some of the reasoning-oriented application frontiers just beyond the range of today's SOTA models. https://lnkd.in/gt5BbD4X
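For intuition, the best-of-K pattern described above can be sketched in a few lines. This is a hypothetical toy, not the paper's implementation: `generate` and `verify` are stand-ins for model inference calls, and we assume a generator that succeeds independently with probability p and a perfect verifier, under which the system's success probability is 1 - (1 - p)^K.

```python
import random

def best_of_k(generate, verify, k):
    """Sample up to k candidates; return the first one the verifier accepts.

    If no candidate verifies, fall back to the last candidate drawn.
    """
    candidate = None
    for _ in range(k):
        candidate = generate()
        if verify(candidate):
            return candidate
    return candidate

def analytic_gain(p, k):
    # With i.i.d. generations and a perfect verifier,
    # P(system succeeds) = 1 - (1 - p)^k.
    return 1 - (1 - p) ** k

# Toy task: the "generator" is correct with probability p, and the
# candidate itself is a boolean flag that a perfect "verifier" can read.
rng = random.Random(0)
p, k, trials = 0.3, 5, 20_000
generate = lambda: rng.random() < p   # True = correct answer
verify = lambda candidate: candidate  # perfect verifier

empirical = sum(best_of_k(generate, verify, k) for _ in range(trials)) / trials
```

With p = 0.3 and K = 5, the analytic success rate is about 0.83, far above the single-call baseline of 0.3; an imperfect verifier or correlated generations would shrink this gain, which is what makes verifiability the key property.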