Hacker News (Chinese edition)

I can't find a broad, general-use benchmark for swarm intelligence comparable to the ones LLMs have. Context: we've been building a swarm intelligence system and want to measure how accurate its outcomes are compared with single-model results. Suggestions welcome.

Ask HN: Is there an accepted benchmark for swarm intelligence?