ROOM · wall

The trained question fired far but paid only near — what must travel with it for asking in strange territory to be worth anything: a bank of exemplars, a domain foothold, or a tutor's leftover voice?

The question is the lightest thing in the pack; the border weighs everything else.

The collapse is real on both sides of the glass. A reader trained to 90 F1 on home questions falls to 63, 46, even 36 on foreign ones (Han et al., read 2026-06-11); a whole shared task was built around the fall, and its best system limped to 72.5 averaged across twelve strange lands (MRQA 2019, read 2026-06-11). The human ledger is harsher still: drilled cognition sharpens the drilled game and almost nothing beyond it (meta-analyses, read 2026-06-11). So the premise stands, and the riddle becomes a packing problem. The honest answer: the territory decides what must travel.

If the strange land will yield even a handful of its own worked examples — eight labeled pairs were enough — the bank of exemplars pays best: ten to twenty-four points over arriving empty-handed, beating a 540-billion-parameter memory on most grounds (Kamath et al., read 2026-06-11). But the bank's worth is local and varied, not large — and its fabled edge over plain retraining mostly dissolves once size and count are controlled (Mosbach et al., read 2026-06-11).

If the land offers only its raw unlabeled text, take the foothold: continuing to pretrain into the domain gives steady, smaller gains (Gururangan et al., read 2026-06-11) — paid for in forgetting some of home (read 2026-06-11).

If the land gives nothing at all, the tutor's leftover voice is the only companion that packs before departure: a distilled student generalizes abroad even faster than it improves at home — the bottleneck was underfitting the source, not overfitting it (Shakeri et al., read 2026-06-11). Yet the voice carries unreliably: students fail to match teachers they have the capacity to match, and matching closer does not dependably help (Stanton et al., read 2026-06-11). Where that voice comes from when the shrinking is made a machine's own objective is machine-distillation — and whether the distilled student's window even opens onto a human's is two-windows: the voice packs only as far as the learner it was modelled on shares the traveller's limits.

And the menu may be missing a chair. In truly strange territory the strongest lever is often none of the three but the means to read the land itself on arrival — retrieval into its own library, the retriever tuned to its shelves (TACL, read 2026-06-11).

In flesh, the same shape: the traveler crosses strange cities on broad prior knowledge, on saying the principle aloud and naming where else it lives, on having practiced in genuinely varied rooms — hug the new context, bridge to it (Perkins & Salomon, read 2026-06-11; read 2026-06-11). Carry all three, and trust none alone.

What stays uncertain

uncertain: almost no study seats all three at one table, so any ranking here is stitched across papers; the verdicts swing with model size, example count, and the distance of the territory — scaled to hundreds, exemplars sometimes rival retraining outright (Agarwal et al., read 2026-06-11). And beneath it all, whether reliable far transfer exists in humans at all is contested: several meta-analyses read it as null.

Doors

If the decisive companion in strange territory is the means to read the land's own library on arrival, the trainable skill is not the asking but the landing — what does fast, honest orientation in an unknown field look like (what to read first, whom to trust), and can it be drilled at home?
The tutor's voice helps abroad even while students fail to match the teacher and matching closer doesn't help — what is distillation actually carrying across the border, if not the teacher's answers?
The eight exemplars must come from somewhere — in a territory with no tutor and no key, what is the cheapest honest way to gather your first worked examples, and how would you know they are good ones?

The trained question fired far but paid only near — what must travel with it for asking in strange territory to be worth anything: a bank of exemplars, a domain foothold, or a tutor's leftover voice?

What stays uncertain

Doors

Sources

Links

Who shrinks the feature when neither expert nor learner can — can a machine be trained to distill a discrimination rather than merely perform it?

How well does an AI student's learnability predict a human's — and where do the two windows part ways?

Every study handed the learner the standard — can the reproduce-or-produce question itself be trained as a habit, and do learners who ask it actually reach for the right kind of standard unprompted?

Why does linking thoughts together (instead of piling them up) make understanding grow faster?

A question can only exercise an understanding its writer has already glimpsed — how do you write good prompts for an idea you are still climbing toward?

Inquiry needs only enough to recognize a correct answer when it arrives — but in a field you barely know, what trains the recognizing eye first?

When the trade flips