When Sources Multiply but Positions Don’t

To form an opinion on a serious question, I almost always start with a search engine. So do you, probably. This is about what happens in that moment, and what we're building to fix it.

Apr 30, 2026

Unbubble Hub is an Open Research Initiative that aims to provide a space for researchers and engineers to come together and collaborate in developing tools to fight social polarization.

Sources is a GitHub repository (a piece of code) that takes a news event and returns sources, categorized and ranked, representing a range of diverse viewpoints.

David La Barbera (find him on LinkedIn) is a researcher at the University of Udine, with a PhD in Artificial Intelligence and a focus on misinformation and information retrieval. With Unbubble Hub he works on what he sees as one of the central problems of our information environment: the slow, structural way polarization and disinformation shape what people get to read, often without anyone noticing. Sources is part of that effort. This is his first article for Unbubble Hub.

Consider this scenario. It’s spring 2026, you live in Europe, and you’re thinking about buying an electric car. You read something about the Chinese ones. They’re cheap and the reviews are decent. But you heard something about EU tariffs, and before spending twenty or thirty thousand euros, you want to understand what’s going on: why is the EU trying to make these cars more expensive?

So you do what anyone would do: you ask a search engine. You read the first article. Then maybe the second. They’re pieces from reputable outlets explaining the EU’s reasoning, the Chinese government’s response, and some analysts’ views. In fifteen minutes you have an opinion: tariffs protect European jobs from subsidized Chinese competition. China objects, talks are ongoing. You close the tab.

This is what most searches look like: users typically read one or two of the top results, and many searches end without any click at all, because the user finds an answer in the snippet at the top of the page or simply moves on.

But here’s the catch: there are at least seven or eight serious positions on this issue, held by people who have thought about it carefully and arrived at different conclusions. And you probably just formed your opinion based on one or two of them.

What you didn’t read

Let me name three perspectives on the matter, just to make this concrete.

The first one you almost certainly encountered: European carmakers can’t fairly compete with Chinese manufacturers backed by massive state subsidies. Without tariffs, you lose factories, jobs, industrial know-how that took decades to build.

There’s a second perspective: the energy transition is urgent, Chinese EVs are good and cheap, and tariffs make electric cars more expensive, which slows electrification and means more emissions for longer. The right frame, from this angle, isn’t “Europe vs. China” but “how fast can we electrify the European fleet.” This perspective tends to be nearly invisible. Not censored, just not foregrounded by the editorial logic of business and political coverage.

The same can be said of a third perspective, at least for queries in Italian, French, or Spanish, even though it has been central to the German debate. The European auto industry is not a monolith: carmakers like BMW, Mercedes, and Volkswagen have decades of investment in China, to the point that they now sell more cars in China than in Europe. For them, EU tariffs are a threat: China retaliates and their exports get hit, and some of the targeted vehicles are partly their own. This is why a significant part of the European industry actively lobbied against the tariffs.

Did you know these three perspectives? I did not, before writing this piece. And there are others I’m not naming here, because this isn’t a piece about EV tariffs: it’s a piece about what happens when you try to inform yourself about questions like this one.

Why this kind of question is different

But what kind of question is this one, exactly?

For some questions, the facts alone settle the answer, and the job of a search engine is just to surface those facts accurately. Ten results saying the same thing means the system did its job.

Take “when did the French Revolution begin?” There is a date, July 14, 1789. Expecting “multiple perspectives” on that would be nonsensical. Or take “is human activity causing the Earth to warm?” Here too, there is a scientific consensus built over decades of evidence. This doesn’t mean objections are pointless (scientific claims, unlike calendar dates, are sharpened by scrutiny) but scrutiny is not the same as balance. Treating climate denial as “the other side” is not diversity. It’s misinformation dressed up as fairness.

But the EV tariff question is in a different category entirely. There are shared facts here too: Chinese EVs cost less, the EU has imposed tariffs, and so on. Nobody contests these. What’s contested is what to make of them. Different people, looking at the same facts, might reach different conclusions because they weigh different considerations: jobs now versus climate acceleration, industrial sovereignty versus consumer welfare, short-term protection versus long-term competitiveness. The disagreement isn’t a gap in the facts. It’s that the same facts admit more than one legitimate reading.

We can call these substantively contested questions. Their defining feature is that reasonable people, looking at all the available evidence, can and do legitimately arrive at different positions. The disagreement isn’t a bug to be fixed by collecting more data. It’s a structural feature of the question.

This is worth pausing on, because there’s a subtle point here that’s easy to miss. Saying that more data won’t resolve the disagreement is not the same as saying that nothing is missing from your view. Something usually is missing, but what’s missing isn’t facts. It’s exposure to the range of substantive positions through which those facts get interpreted. You can have all the data on Chinese subsidies, European jobs, and emission targets, and still hold a partial view if you’ve only encountered one or two of the positions that read those facts. The gap isn’t in the evidence, it’s in the conversation.

This matters because questions like this are at the core of democratic life. Immigration policy. Energy strategy. Taxation. Urban planning. Foreign policy. The governance of technology. These aren’t questions we’re going to solve by collecting more data. They’re questions we answer, and re-answer, by working through the competing positions that real people hold for real reasons.

Why retrieval systems work this way

It’s worth a brief word on why this happens, because the mechanism is not malicious and not fixable in the obvious ways. Any retrieval system, whether a search engine, a recommendation feed, or an AI assistant, has to make three kinds of choices. First, it chooses what to index in the first place (some content is harder to collect, some is behind paywalls, some is in languages the index handles poorly). Second, it chooses how to rank what it has indexed (and ranking algorithms tend to reward already-popular sources). And third, it has to choose how to interpret your query. None of these choices is avoidable. Showing you everything is not an option; everything is of course too much. So someone has to decide what to put in front of you, and someone has, algorithmically. But every choice produces a boundary, and the boundary is not a flaw of the system: it is a structural property of any system that selects a finite subset from an effectively infinite informational reality.
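To make the boundary concrete, here is a toy sketch in Python. Everything in it is hypothetical: the `Doc` record, the `popularity` scores, and the position labels are invented for illustration and do not describe how any real search engine, or Sources itself, works. It models only the first two choices (what is indexed and how it is ranked) and leaves query interpretation out. The point is simply that ranking by popularity and cutting at a fixed page size can leave whole positions outside what the reader sees, with no personalization involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    title: str
    position: str         # hypothetical label for the stance the piece takes
    popularity: float     # hypothetical stand-in for links, clicks, authority
    indexed: bool = True  # choice 1: is the piece in the index at all?


def retrieve(corpus: list[Doc], k: int) -> list[Doc]:
    """Keep indexed docs, rank them by popularity, and cut the list at k."""
    indexed = [d for d in corpus if d.indexed]  # choice 1: indexing
    ranked = sorted(indexed, key=lambda d: d.popularity, reverse=True)  # choice 2
    return ranked[:k]  # the finite page: everything after k goes unseen


def unseen_positions(corpus: list[Doc], shown: list[Doc]) -> set[str]:
    """Positions that exist in the corpus but never reach the reader."""
    return {d.position for d in corpus} - {d.position for d in shown}


corpus = [
    Doc("Tariffs defend EU factories", "protect-jobs", 0.9),
    Doc("Brussels explains the duties", "protect-jobs", 0.8),
    Doc("Beijing calls the tariffs unfair", "china-objects", 0.7),
    Doc("Cheaper EVs, faster transition", "climate-speed", 0.3),
    Doc("German carmakers fear retaliation", "exporter-risk", 0.2),
    Doc("Paywalled essay on electrification", "climate-speed", 0.6, indexed=False),
]

shown = retrieve(corpus, k=3)
print(sorted(unseen_positions(corpus, shown)))
# prints ['climate-speed', 'exporter-risk']
```

Nothing the reader sees in this toy run is false. Two of the four positions simply never make it onto the page.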

This is worth distinguishing from the filter bubble argument that dominated the 2010s: the idea that algorithmic personalization shows each user a world tailored to their past preferences. That problem exists. But the problem I’m describing sits upstream of personalization. Even for a user the system knows nothing about, with no browsing history, the result set is structurally truncated. Personalization can shift the boundary; it doesn’t create it.

The consequence is subtle. If I form an opinion based on systematically partial input, that opinion isn’t necessarily wrong. The things I read may all be true. The facts I cite may all be accurate. But my opinion is incomplete in a way I cannot assess from within it. I don’t have the tools to know what I’m missing, and I can’t even suspect it, because the missing dimension doesn’t present itself to me as “missing”: it simply isn’t in my space of considered alternatives.

This is the specific harm: not error, but blindness through non-exposure. I don’t believe the false. I just don’t know part of the true.

Sources: what we’re building

At Unbubble Hub we’re working on tools that try to address the problem I just described: to make this boundary, at least for the questions where it matters, partially visible. The first of these tools is called Sources, and a first version is already online.

The goal is straightforward to state: given a query on a substantively contested topic, Sources tries to maximize the diversity of substantive perspectives covered, with the smallest set of sources possible. The underlying assumption is that on this kind of question, diversity itself is a proxy for quality. A small set of sources that spans the range of substantive positions is more useful, for someone trying to form an opinion, than a large set that repeats the same one or two positions in different voices. Sources expands the retrieval to deliberately surface positions that default ranking does not foreground, annotates what was found, and reports transparently which axes of diversity it tried to cover.
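One way to read that goal is as a set-cover problem: pick the fewest sources whose positions, taken together, span the positions you are looking for. The sketch below uses the standard greedy heuristic for set cover. It is an illustration under stated assumptions, not a description of how Sources is implemented: the function name `select_sources`, the outlet names, and the position labels are all hypothetical, and it assumes each source has already been annotated with the positions it represents, which is itself one of the hard parts.

```python
def select_sources(
    annotated: dict[str, set[str]], targets: set[str]
) -> tuple[list[str], set[str]]:
    """Greedy set cover: repeatedly take the source that adds the most
    not-yet-covered target positions. Returns (chosen, still_uncovered)."""
    remaining = set(targets)
    chosen: list[str] = []
    while remaining:
        # Iterate names in sorted order so ties resolve alphabetically.
        best = max(sorted(annotated), key=lambda s: len(annotated[s] & remaining))
        gain = annotated[best] & remaining
        if not gain:  # no source covers any of what is left
            break
        chosen.append(best)
        remaining -= gain
    return chosen, remaining


# Hypothetical annotations: which positions each outlet's piece represents.
annotated = {
    "outlet_a": {"protect-jobs", "china-objects"},
    "outlet_b": {"protect-jobs"},
    "outlet_c": {"climate-speed"},
    "outlet_d": {"exporter-risk", "climate-speed"},
}
targets = {
    "protect-jobs", "china-objects", "climate-speed",
    "exporter-risk", "consumer-welfare",
}

chosen, uncovered = select_sources(annotated, targets)
print(chosen)     # prints ['outlet_a', 'outlet_d']
print(uncovered)  # prints {'consumer-welfare'}
```

The greedy heuristic does not guarantee the smallest possible set, but it is a common approximation. Returning the uncovered positions alongside the chosen sources matters as much as the selection: it lets the system say what it looked for and did not find.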

We don’t know yet how well this works. The first version is a starting point, not a solution. We’ll publish what we learn here as it comes, including, and especially, the parts that don’t work. If you have critiques, corrections, or want to collaborate, we’re listening and very open to feedback!

What we still don’t know how to do

The hard part of this work, though, isn’t technical: it’s conceptual.

One might ask: who decides which perspectives count as substantive? Any answer is a normative choice, and different perspectives might not carry equal weight in public discourse. Any system that treats all perspectives as equivalent is making a claim it should be forced to defend. We haven’t solved this. We’re not sure it can be fully solved by an algorithm. What we believe is that the criteria a system uses have to be declared openly and made contestable, not buried inside the ranking.

One might also ask: how do we even identify the actors in a given debate? In some cases the actors are obvious while in others they emerge over time. A retrieval system that decides this question silently is making editorial choices without naming them.

Also: how do we formalize “perspective” in a way that is usable computationally without flattening what makes perspectives different in the first place? Reducing a position to a set of tags risks losing the texture that makes it a position rather than a slogan. But not formalizing it at all means the system can’t operate at scale.

Our working principle, across all of these, is that the meta-level has to be made explicit. A system trying to offer position diversity should declare openly which axes of diversity it is trying to cover, by what criteria, and with what known limits. Otherwise it reproduces, at a higher level, the same invisibility the whole exercise was trying to address.
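As a rough illustration of what declaring the meta-level could look like in practice, the sketch below attaches a small machine-readable declaration to a result set. The class name, field names, and example values are hypothetical, chosen for this post, and are not taken from the Sources codebase.

```python
import json
from dataclasses import dataclass, field


@dataclass
class DiversityDeclaration:
    axes: list[str]          # which axes of diversity the system aimed for
    criteria: str            # how a position was judged substantive
    known_limits: list[str] = field(default_factory=list)
    # What was actually found, per axis (axis name -> values found).
    coverage: dict[str, list[str]] = field(default_factory=dict)

    def uncovered_axes(self) -> list[str]:
        """Axes the system aimed for but found nothing on."""
        return [a for a in self.axes if not self.coverage.get(a)]

    def to_json(self) -> str:
        """Serialize the declaration so it can be shown next to the results."""
        report = {
            "axes": self.axes,
            "criteria": self.criteria,
            "known_limits": self.known_limits,
            "coverage": self.coverage,
            "uncovered_axes": self.uncovered_axes(),
        }
        return json.dumps(report, indent=2)


# Hypothetical declaration for the EV tariff query.
decl = DiversityDeclaration(
    axes=["stakeholder", "language"],
    criteria="position argued by an identifiable actor in the debate",
    known_limits=["paywalled sources are not retrieved"],
    coverage={"stakeholder": ["EU carmakers", "climate advocates"]},
)
print(decl.to_json())
```

The design choice here is that the declaration travels with the results rather than living in separate documentation, so a reader can contest the criteria at the moment they matter.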

If we don’t manage to make this boundary at least partially visible, for the questions where it actually matters, we’ll keep building solid opinions on foundations we can’t inspect. Not because those opinions are wrong, but because they’re incomplete in ways we can’t see, on questions where incompleteness isn’t a technical detail. It’s the substance of the problem.

That’s what Sources is trying to change, one (contested) question at a time.

References

Fishkin, R. (2024). 2024 Zero-Click Search Study. SparkToro, on Datos/Semrush panel data. sparktoro.com

Pan, B., Hembrooke, H., Joachims, T., Lorigo, L., Gay, G., & Granka, L. (2007). In Google We Trust: Users’ Decisions on Rank, Position, and Relevance. Journal of Computer-Mediated Communication, 12(3), 801–823.

Pandey, S., Roy, S., Olston, C., Cho, J., & Chakrabarti, S. (2005). Shuffling a Stacked Deck: The Case for Partially Randomized Ranking of Search Engine Results. Proceedings of VLDB.

Pariser, E. (2011). The Filter Bubble: What the Internet Is Hiding from You. Penguin Press.

Methodological note

I used Claude to refine sentences, test phrasings, and pressure-test the structure of the argument.

A guest post by

David La Barbera

Unbubble Hub - Open Research
