Research-only companion

How queries move through semantic search.

This research companion follows a simple sequence: start with a short topic query, add a more specific phrase, then adjust that phrase to see whether languages converge more clearly.

Extra specificity is useful, but convergence is not automatic. A longer phrase can improve relevance while different languages drift toward neighboring branches of the same topic. Each topic below traces that movement as a small, auditable experiment.


The experiment, in plain language.

This is not a generic benchmark. It is a guided reading of four topics, using captured rankings, editorial judgments, coincidence matrices (counts of records shared between language rankings) and abstracts to explain what changed from one query formulation to the next.

Short topic query

The initial query is intentionally compact. It gives a broad, high-recall view of the topic across languages.

Long phrase 1

The first long phrase adds subfield detail. It can improve focus, but may also send some languages toward different semantic branches.

Long phrase 2

The second long phrase keeps the specificity but restores a clearer shared anchor. We read the impact through hits, matrices and abstract clues.

Four topic narratives.

Each block follows the same story: start with the original short topic query, test a specific long phrase, inspect where languages separate, then try a more anchored version.

Methodology

How to read this page.

The experiment compares three query formulations for each topic and language: the short topic query from the public demo, a first long phrase, and a second anchored long phrase. Editorial judgments count both clearly relevant and partially relevant records as hits; only clearly unrelated records count as misses.
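The hit rule above can be sketched in a few lines. This is only an illustration: the label names ("relevant", "partial", "unrelated") and the function count_hits are assumptions made for the example, not identifiers taken from the research script.

```python
# Hypothetical label names; the page does not say how judgments are encoded.
HIT_LABELS = {"relevant", "partial"}
MISS_LABEL = "unrelated"


def count_hits(judgments):
    """Return (hits, misses) for one ranking's editorial judgments.

    Clearly relevant and partially relevant records both count as hits;
    only clearly unrelated records count as misses.
    """
    hits = sum(1 for label in judgments if label in HIT_LABELS)
    misses = sum(1 for label in judgments if label == MISS_LABEL)
    return hits, misses


# Example: one language's judged ranking for one query formulation.
example = ["relevant", "partial", "unrelated", "relevant"]
hits, misses = count_hits(example)
```

Because partial matches count as hits, this reading is deliberately lenient: a high hit count says the ranking stayed on topic, not that every record was an exact match.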

Matrix cells count shared semantic record identifiers between two language rankings. This measures convergence across languages, not relevance by itself. The diagonal is omitted because comparing a language with itself adds no information.
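A minimal sketch of how such a matrix could be built, assuming each language ranking is available as a list of record identifiers. The function name overlap_matrix and the sample identifiers are hypothetical; the real generation script may differ.

```python
from itertools import combinations


def overlap_matrix(rankings):
    """Count shared record identifiers for every pair of languages.

    `rankings` maps a language code to its list of record identifiers.
    The result maps (lang_a, lang_b) to the number of shared identifiers.
    Diagonal cells (a language compared with itself) are never created.
    """
    matrix = {}
    for a, b in combinations(sorted(rankings), 2):
        shared = len(set(rankings[a]) & set(rankings[b]))
        matrix[(a, b)] = shared
        matrix[(b, a)] = shared  # the matrix is symmetric
    return matrix


# Made-up identifiers, for illustration only.
sample = {
    "en": ["r1", "r2", "r3"],
    "es": ["r2", "r3", "r4"],
    "ar": ["r9", "r3"],
}
cells = overlap_matrix(sample)
```

A cell only says how many records two rankings have in common. Two languages can share many records that are all off topic, which is why the matrix is read alongside the editorial hits.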

Arabic showed a distinct behavior in several runs, so this companion reports two convergence readings: the full multilingual matrix average, including Arabic, and a complementary average calculated without Arabic. The second metric does not remove Arabic from the experiment; it helps inspect whether the remaining languages converge differently once the Arabic-specific behavior is read separately.
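The two readings can be illustrated as follows, assuming the average is a plain arithmetic mean over the off-diagonal cells (the page does not state the exact formula). The function mean_overlap, the language codes and the counts are all made up for the example.

```python
def mean_overlap(pair_counts, exclude=()):
    """Average shared-record count over language pairs.

    `pair_counts` maps an unordered pair (lang_a, lang_b) to its shared count.
    Pairs that involve any language in `exclude` are left out of the average.
    """
    values = [
        count
        for (a, b), count in pair_counts.items()
        if a not in exclude and b not in exclude
    ]
    return sum(values) / len(values) if values else 0.0


# Illustrative counts only, not measured results.
pair_counts = {("en", "es"): 4, ("en", "ar"): 1, ("es", "ar"): 1}

full_average = mean_overlap(pair_counts)                      # includes Arabic
without_arabic = mean_overlap(pair_counts, exclude={"ar"})    # Arabic pairs skipped
```

Both numbers come from the same matrix. The second simply skips the cells that involve Arabic, so the Arabic rankings remain part of the experiment.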

The semantic-separation panels highlight the language pairs with the fewest shared records, then show abstract snippets from records found on only one side of that pair. This is the evidence used to explain why a phrase may split across languages.
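One way to picture the selection behind these panels, as a sketch under assumptions: rankings are lists of record identifiers, ties between pairs are broken alphabetically, and the helper names least_shared_pairs and one_sided_records are hypothetical. Looking up the abstract snippet for each one-sided record is left out.

```python
from itertools import combinations


def least_shared_pairs(rankings, top_n=1):
    """Return the `top_n` language pairs with the fewest shared records."""
    scored = []
    for a, b in combinations(sorted(rankings), 2):
        shared = len(set(rankings[a]) & set(rankings[b]))
        scored.append((shared, a, b))
    scored.sort()  # fewest shared first; ties fall back to alphabetical order
    return [(a, b) for _, a, b in scored[:top_n]]


def one_sided_records(rankings, a, b):
    """Return records found only in `a` and only in `b`, in ranking order."""
    set_a, set_b = set(rankings[a]), set(rankings[b])
    only_a = [r for r in rankings[a] if r not in set_b]
    only_b = [r for r in rankings[b] if r not in set_a]
    return only_a, only_b


# Made-up identifiers, for illustration only.
sample = {
    "en": ["r1", "r2", "r3"],
    "es": ["r2", "r3", "r4"],
    "ar": ["r7", "r3"],
}
pair = least_shared_pairs(sample)[0]
only_first, only_second = one_sided_records(sample, *pair)
```

The one-sided lists are the records whose abstracts would be inspected to see which semantic branch each language followed.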

The observed date is stored in the static payload generated by the research script.
The original matrix is copied from the current main demo during generation.
This page loads only long-query-demo-candidate.js.
Rankings can change as the index, model, or ranking logic evolves.