The trained question fired far but paid only near: what must travel with it for asking in strange territory to be worth anything: a bank of exemplars, a domain foothold, or a tutor's leftover voice?
The question is the lightest thing in the pack; the border weighs everything else.
The collapse is real on both sides of the glass. A reader trained to 90 F1 on home questions falls to 63, 46, even 36 on foreign ones (Han et al., read 2026-06-11); a whole shared task was built around the fall, and its best system limped to 72.5 averaged across twelve strange lands (MRQA 2019, read 2026-06-11). The human ledger is harsher still: drilled cognition sharpens the drilled game and almost nothing beyond it (meta-analyses, read 2026-06-11). So the premise stands, and the riddle becomes a packing problem. The honest answer: the territory decides what must travel.
If the strange land will yield even a handful of its own worked examples (eight labeled pairs were enough), the bank of exemplars pays best: ten to twenty-four points over arriving empty-handed, beating a 540-billion-parameter memory on most grounds (Kamath et al., read 2026-06-11). But the bank's worth is local and varied, not large, and its fabled edge over plain retraining mostly dissolves once model size and example count are controlled (Mosbach et al., read 2026-06-11).
If the land offers only its raw unlabeled text, take the foothold: continuing to pretrain into the domain gives steady, smaller gains (Gururangan et al., read 2026-06-11), paid for in forgetting some of home (read 2026-06-11).
If the land gives nothing at all, the tutor's leftover voice is the only companion that packs before departure: a distilled student generalizes abroad even faster than it improves at home; the bottleneck was underfitting the source, not overfitting it (Shakeri et al., read 2026-06-11). Yet the voice carries unreliably: students fail to match teachers they have the capacity to match, and matching closer does not dependably help (Stanton et al., read 2026-06-11). Where that voice comes from when the shrinking is made a machine's own objective is machine-distillation, and whether the distilled student's window even opens onto a human's is two-windows: the voice packs only as far as the learner it was modelled on shares the traveller's limits.
And the menu may be missing a chair. In truly strange territory the strongest lever is often none of the three but the means to read the land itself on arrival: retrieval into its own library, the retriever tuned to its shelves (TACL, read 2026-06-11).
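The packing rule in the paragraphs above can be written down as a small decision sketch. It is an illustration under loud assumptions: the `Territory` fields, the `what_to_pack` function, and the hard cutoff of eight examples are hypothetical simplifications introduced here, not anything taken from the cited papers, and the real verdicts swing with model size and the distance of the territory.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Territory:
    """Hypothetical description of what the strange land offers on arrival."""
    labeled_examples: int = 0            # worked (question, answer) pairs from the target domain
    has_unlabeled_text: bool = False     # raw in-domain text, no labels
    has_searchable_library: bool = False # a corpus that can be retrieved from on arrival


def what_to_pack(t: Territory, min_exemplars: int = 8) -> List[str]:
    """Return companions in the order the text suggests reaching for them.

    Assumption: min_exemplars=8 echoes the "eight labeled pairs were enough"
    remark; treating it as a hard threshold is a simplification.
    """
    plan: List[str] = []
    # The text calls retrieval "often" the strongest lever, so it leads when available.
    if t.has_searchable_library:
        plan.append("retrieval")
    if t.labeled_examples >= min_exemplars:
        plan.append("exemplars")            # a handful of target examples pays best
    elif t.has_unlabeled_text:
        plan.append("domain pretraining")   # steady, smaller gains; some forgetting of home
    else:
        plan.append("distillation")         # the only companion that packs before departure
    return plan


if __name__ == "__main__":
    print(what_to_pack(Territory(labeled_examples=8)))
    print(what_to_pack(Territory(has_unlabeled_text=True)))
    print(what_to_pack(Territory()))
```

The sketch encodes only the order of the argument, not an empirical ranking; the essay's own advice is to carry all three and trust none alone.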
In flesh, the same shape: the traveler crosses strange cities on broad prior knowledge, on saying the principle aloud and naming where else it lives, on having practiced in genuinely varied rooms: hug the new context, bridge to it (Perkins & Salomon, read 2026-06-11; read 2026-06-11). Carry all three, and trust none alone.
What stays uncertain
Almost no study seats all three at one table, so any ranking here is stitched across papers; the verdicts swing with model size, example count, and the distance of the territory. Scaled to hundreds, exemplars sometimes rival retraining outright (Agarwal et al., read 2026-06-11). And beneath it all, whether reliable far transfer exists in humans at all is contested: several meta-analyses read it as null.
Doors
- If the decisive companion in strange territory is the means to read the land's own library on arrival, the trainable skill is not the asking but the landing: what does fast, honest orientation in an unknown field look like (what to read first, whom to trust), and can it be drilled at home?
- The tutor's voice helps abroad even while students fail to match the teacher and matching closer doesn't help: what is distillation actually carrying across the border, if not the teacher's answers?
- The eight exemplars must come from somewhere: in a territory with no tutor and no key, what is the cheapest honest way to gather your first worked examples, and how would you know they are good ones?
Sources
- Han et al., Choose Your QA Model Wisely: 90.2 F1 at home, 36 to 63 abroad
- MRQA 2019 shared task: best system 72.5 avg F1 on twelve held-out datasets
- Kamath et al., To Adapt or to Annotate: eight target examples beat zero-shot by +10 to +24 F1
- Mosbach et al.: controlled head-to-head; in-context learning's generalization edge often does not hold
- Gururangan et al., Don't Stop Pretraining: DAPT/TAPT gains across four domains, eight tasks
- Stability-plasticity in continued pretraining: domain gains trade against forgetting
- Shakeri et al., Not to Overfit or Underfit: distillation lifts out-of-domain even faster than in-domain
- Stanton et al., Does Knowledge Distillation Really Work? Students fail to match teachers; closer match does not mean better generalization
- RAG-end2end (TACL): adapt the retriever to the new corpus; access beats packed knowledge
- Agarwal et al., Many-Shot In-Context Learning: hundreds of exemplars sometimes rival fine-tuning
- Far transfer of cognitive training measures near null (PMC9903001)
- Transfer near and far: far transfer leans on broad prior knowledge
- Perkins & Salomon's high road: bridging by deliberate abstraction
- Varied examples, not repeated ones, buy generalization
- Hug and bridge: combine context-rehearsal with principle-reflection
Links
Who shrinks the feature when neither expert nor learner can? Can a machine be trained to distill a discrimination rather than merely perform it?
The smelter does not admire the ore; it is built to pour ingots a hand can lift.
How well does an AI student's learnability predict a human's, and where do the two windows part ways?
The tailor fitted the coat to a mannequin his own size, then wondered how it would hang on the child.
Every study handed the learner the standard: can the reproduce-or-produce question itself be trained as a habit, and do learners who ask it actually reach for the right kind of standard unprompted?
A bell can teach you to wake at dawn, but only by growing softer morning by morning, never by stopping at once.
Why does linking thoughts together (instead of piling them up) make understanding grow faster?
A pile of bricks is not a wall; the mortar between them is.
A question can only exercise an understanding its writer has already glimpsed: how do you write good prompts for an idea you are still climbing toward?
You do not carve the key from a drawing of the lock; you whittle it against the keyhole, shaving by shaving.
Inquiry needs only enough to recognize a correct answer when it arrives, but in a field you barely know, what trains the recognizing eye first?
Two leaves side by side, and a finger pointing (this edge, not that one), and the forest is never plain green again.
When the trade flips
The trellis that held the vine upright is, one summer morning, the thing in its way.