Does discussion make crowds any wiser?

https://www.sciencedirect.com/science/article/abs/pii/S0010027721003358

Here’s a research article that I had lying around for several years (pre-LLM). I dusted it off and asked Grok to summarize it. Here is what Grok says:

Summary

This 2021 paper by Hugo Mercier and Nicolas Claidière, published in Cognition, investigates whether discussion in large groups enhances or undermines the “wisdom of crowds” phenomenon, where aggregating independent opinions (e.g., via averaging or majority voting) often yields surprisingly accurate results. The authors review theoretical arguments: discussion might reduce diversity and independence (potentially harming aggregation), but it could also improve individual accuracy (as seen in small-group studies). To test this in larger groups, they conducted an experiment during the 2017 European Researchers’ Night across 11 French cities, involving 1,958 participants in 33 groups (sizes 22–212, mean 59).

Participants tackled one of three problem types—demonstrative (logical/mathematical, e.g., “bat and ball” puzzle), factual (numerical estimates, e.g., goals in the 2010 World Cup), or ethical (value judgments, e.g., compensation for losing a finger)—first individually (5 minutes of silent thinking, answers recorded every minute), then via discussion with neighbors (10 minutes, answers recorded every minute). Performance was measured against objective benchmarks where possible (demonstrative and factual problems). Ethical problems served as controls, with results relegated to supplementary materials due to interpretive challenges.

Key findings: Discussion improved individual accuracy across problems with correct answers, and it either enhanced or did not harm aggregated group answers. The paper concludes that, for solvable problems, discussion in large groups tends to boost rather than detract from collective wisdom.

Breakdown of Most Important Conclusions

The authors tested specific hypotheses (H1a–c, H2) and a research question (RQ1). Here’s a table summarizing the key conclusions, supported by evidence from the study:

| Conclusion | Problem Type | Key Evidence | Implications |
|---|---|---|---|
| Discussion improves individual performance more than solitary thinking (H1a, H1b confirmed; H1c partially confirmed) | Demonstrative and factual | Demonstrative: correct answers rose from ~12–41% (post-silence) to 84–91% (post-discussion). Factual: mean error decreased (e.g., World Cup error halved from 96 to 52; elevators from 55 to 48). Ethical: no systematic improvement (results in ESM; no strong conclusions drawn). | Extends small-group benefits to large, diverse crowds; participants converged toward correct answers via arguments, not just majority pressure. |
| Discussion improves aggregated answers via majority voting (H2 confirmed) | Demonstrative | Post-silence: majority correct in only 2/13 groups. Post-discussion: majority correct in all 13 groups (significant, p = 0.003). | “Truth wins” in large groups: one correct individual can convince many through discussion. |
| Discussion improves or has no effect on aggregated answers via averaging (RQ1 answered positively/neutrally) | Factual | Elevators: all 6 groups reduced mean error post-discussion (p = 0.03). World Cup: no consistent change in mean error (2 groups improved, 2 worsened), because variance fell without a directional shift toward the correct answer. | Loss of diversity is offset by accuracy gains; no detriment to the wisdom of crowds. Overall, discussion enhanced aggregates in 3/4 benchmark cases and was neutral in 1/4. |
| No detrimental effects overall | All (with benchmarks) | Individual answers always improved. Aggregates improved or stayed the same, never consistently worsened. Participants moved toward correct answers more than initial group averages did. | Challenges theories predicting that discussion harms diversity/independence (e.g., Lorenz et al., 2011); supports argumentation theory (Mercier & Sperber, 2017). |
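The two aggregation rules the study compares are simple to state: majority voting for discrete answers (demonstrative problems) and averaging for numerical estimates (factual problems). A minimal sketch in Python, assuming toy data — the function names and example values are mine, not from the paper:

```python
from collections import Counter
from statistics import mean

def majority_vote(answers):
    """Aggregate discrete answers (e.g., responses to the bat-and-ball
    puzzle) by taking the most common response in the group."""
    return Counter(answers).most_common(1)[0][0]

def average_estimate(estimates):
    """Aggregate numerical estimates (e.g., goals scored in the
    2010 World Cup) by taking the group mean."""
    return mean(estimates)

# Hypothetical toy data, not taken from the paper:
puzzle_answers = ["10 cents", "5 cents", "5 cents", "5 cents"]
goal_estimates = [120, 160, 145, 150]

print(majority_vote(puzzle_answers))    # -> 5 cents
print(average_estimate(goal_estimates)) # -> 143.75
```

The paper’s question is then whether discussion, by making individual inputs to these two functions more accurate but less independent, helps or hurts the aggregated output.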

The study’s limitations include modest group sizes relative to massive crowds, homogeneous problems, and constrained discussion (participants could only talk with their neighbors). Ethical problems were inconclusive, and the results are strongest for problems with objective truths.

What Should Be Done in Light of This Research

Based on the findings, I think organizations and decision-makers should prioritize incorporating discussion into large-group processes for problems with verifiable answers, like policy evaluations, forecasting, or problem-solving in business/scientific teams. For instance, citizens’ assemblies or corporate brainstorming could allocate time for structured neighbor-based discussions before voting or averaging, as this setup proved effective here. It could be scaled via online platforms (e.g., forums or breakout rooms) to mimic the neighbor constraint while allowing flexibility.

Further research should test even larger crowds (e.g., thousands) and varied network structures—maybe optimizing connections to spread accurate info faster, as hinted by related studies. Also, explore non-objective problems (beyond ethics) like creative ideation, where diversity loss might hurt more. Practically, avoid over-relying on silent aggregation in high-stakes scenarios; the gains in accuracy seem worth the time investment. If anything, this suggests ancient Athens might have benefited from allowing some discussion before votes, at least for factual matters.

So even though Dr. Rosenberg used the hockey puck method to correctly predict the superfecta of the Kentucky Derby, he is now pivoting strongly toward conversational collective intelligence; he’s done with typing-based methods.

This seems like the best way for humans to communicate, so it doesn’t surprise me.

In my view, we must be careful to use technology to keep people anonymous: if decisions are being made and group labels are visible (sex, race, religion, etc.), that becomes a problem once there is power behind the decisions.

Tyranny requires group labels. It goes:

group labels → echo chambers → groupthink → tyranny

Dr. Rosenberg has published a number of more recent papers (well after 2021) about his conversational AI:

2023 was his pilot:

Conversational Swarms:

More:

With University of Pitt:

Using conversational swarms vs simple chat:

Using AI hybrids with conversational swarming:

Scaled up tests:

And here are 126 other papers on Swarming:

It no doubt works. We need an ecosystem platform to implement it, with user accountability, in businesses and networks.