Stand-There Scenes — immersive "you are there" pages
Concept
A new page type on the vault explorer: full-bleed, photorealistic scenes that put the reader inside a moment the vault has actually evidenced. You don't read about the Eridu temple — you stand on the mud-brick platform at dusk, c. 5500 BCE, smoke off the altar, fish offerings cooling in the niche behind you.
Each scene is four things layered together:
1. The image — a full-viewport, archaeologically grounded photorealistic render (GPT-5.5 via codex CLI, per AGENTS.md §5), with slow ambient motion: a Ken Burns drift plus mouse/gyro parallax so the scene breathes instead of sitting there like a postcard.
2. The narration: a second-person, present-tense, sensory caption that types in as you linger, covering what you see, hear, and smell, and what your feet feel. Every sentence is either grounded in a tier-1/tier-2 note or explicitly classed as imagined.
3. The flip side — a panel (card-flip or slide-over) listing every citation behind the narration, wiki-linked into the vault: this sentence ← eridu-temple-sequence-archaeology, that one ← schmidt-2010-gobekli-tepe-worlds-oldest-temple.
4. The honesty toggle — the signature move. A switch between "what we actually know" and "what we're imagining." Every narration sentence carries an evidence class: attested (physical/textual evidence exists), inferred (reasonable reconstruction from evidence), imagined (atmospheric connective tissue). In honest mode, attested text glows, inferred text dims to amber, imagined text fades to near-invisible — and the scene image itself desaturates toward the raw archaeological reality. The toggle makes the vault's own epistemics (AGENTS.md §1: calibrated confidence, emic/etic separation) visible and felt, not just documented.
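The three evidence classes and their honest-mode treatment can be modelled as a small lookup. The sketch below is illustrative only: the type and function names are hypothetical, and the opacity and colour values are placeholder assumptions chosen to respect the described ordering (attested brightest, inferred amber, imagined close to invisible).

```typescript
// Evidence classes carried by every narration sentence.
type EvidenceClass = "attested" | "inferred" | "imagined";

interface BeatStyle {
  opacity: number; // 0..1
  color: string;   // CSS colour value
}

// Assumed values: only the relative ordering comes from the description above.
const HONEST_STYLES: Record<EvidenceClass, BeatStyle> = {
  attested: { opacity: 1, color: "#ffffff" },    // stays bright
  inferred: { opacity: 0.7, color: "#ffbf00" },  // dims to amber
  imagined: { opacity: 0.15, color: "#ffffff" }, // fades to near-invisible
};

// With the toggle off ("what we're imagining"), every sentence renders the same.
const IMAGINING_STYLE: BeatStyle = { opacity: 1, color: "#ffffff" };

function beatStyle(ev: EvidenceClass, honest: boolean): BeatStyle {
  return honest ? HONEST_STYLES[ev] : IMAGINING_STYLE;
}

// Serialise a style for an inline style attribute.
function toInlineCss(style: BeatStyle): string {
  return `opacity:${style.opacity};color:${style.color}`;
}
```

Keeping the mapping in one table means the toggle only changes a single boolean, and a fourth evidence class would be a one-line addition.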
Launch set of four scenes, one per deep-time anchor the vault already holds:
| Scene | Moment | Anchor notes |
|---|---|---|
| Qafzeh Cave | A burial, ~92,000 BP, Levant | claim-qafzeh-ochre-earliest-symbolic-behavior, hovers-et-al-2003-qafzeh-ochre-color-symbolism |
| Göbekli Tepe | Dawn at Enclosure D, c. 9600 BCE | schmidt-2010-gobekli-tepe-worlds-oldest-temple, paleolithic-mortuary-religion |
| Eridu | The Level XVIII temple platform, c. 5500 BCE | eridu-temple-level-xviii-earliest-cult-building, eridu-temple-sequence-archaeology, sumerian-religion |
| Anyang | Wu Ding's divination chamber, c. 1250 BCE | shang-religion, oracle-bones-earliest-dated-chinese-religious-writing, keightley-sources-of-shang-history |
Why it takes you back in time
The vault's notes are rigorous but third-person and past-tense — the reader stays in 2026 looking at evidence. A stand-there scene inverts the camera: the reader is moved to the evidence's own moment and the present tense does the time travel ("the crack spreads across the scapula with a sound like ice"). Three mechanisms:
- Sensory grounding beats narrative summary. Fish bones and ash on the Level XVIII altar are a sentence in a note; the smell of river fish charring on a mud-brick podium while the marsh wind comes through the doorway is a memory the reader didn't know they could have. Every sensory detail is licensed by an extraction the vault already made.
- The deep-time gradient becomes walkable. Four scenes spanning 92,000 BP → 1250 BCE let the reader feel the gaps: some eighty millennia between an ochre-stained grave and the Göbekli Tepe enclosures, about four more to the Eridu platform, and four more again to the first written god-names. No timeline diagram delivers that vertigo; sequential immersion does.
- The honesty toggle makes uncertainty an experience. Watching the Shang diviner's face fade to ghost-grey while the cracked scapula stays solid teaches the difference between artifact and reconstruction better than any `confidence: medium` badge. It's the steelman rule (AGENTS.md §1.3) turned into a UI gesture, and it inoculates the experience against the fair criticism that immersive reconstruction is just costume drama.
Experience walkthrough
You're on the sumerian-religion tradition page. Under the banner, a new wide card: "Stand there — Eridu, c. 5500 BCE." You click.
1. Arrival (0–2 s). The page goes full-bleed black, then the scene fades up: dusk over the Mesopotamian marsh, a small mud-brick building on a low platform, doorway glowing faintly. A slow 60-second Ken Burns drift is already underway. Moving the mouse (or tilting the phone) shifts the frame a few degrees — parallax via layered transform. A single line of UI: the scene title, a date pill ("c. 5500 BCE · Ubaid I"), and the honesty toggle, default off (full imagining).
2. Narration (2 s+). Bottom third, on a soft gradient, sentences fade in one at a time as you dwell or scroll: "You are standing on packed river mud, still warm from the day. The room ahead of you is small — you could cross it in five steps. Against the far wall, a low brick podium; in the niche behind it, something you cannot quite see. On the altar, fish — offered, burning, the smoke going up through the doorway past you into the first stars." Scroll advances the beats; each beat may nudge the camera (CSS transform keyed to scroll position).
3. The flip. A quiet "⚖ sources" button flips the narration card. The back lists each sentence with its license: packed mud platform, podium, niche, fish bones + ash on altar → eridu-temple-sequence-archaeology (Safar, Lloyd & Mustafa 1981); dusk, smoke, stars → atmosphere, unlicensed. Every link goes into the vault proper.
4. The toggle. You flip "what we actually know." The image desaturates and darkens at the edges; imagined sentences fade to 15% opacity, inferred ones go amber, attested ones stay bright white. The scene visibly contracts to its evidentiary skeleton: a small room, a podium, a niche, fish bones, ash. A footer line appears: "Everything still bright is in the ground or in a text. Everything faded is us."
5. Exit / next. Scrolling past the last beat reveals the scene's "Record" footer — the standard frontmatter card, counter-evidence included (e.g. Qafzeh's utilitarian-ochre steelman is in the scene's own footer) — plus a strip of the other three scenes ordered by date, inviting the deep-time walk: 92,000 BP → 9600 BCE → 5500 BCE → 1250 BCE.
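The date-ordered strip in step 5 mixes BP and BCE dates, so ordering needs a shared axis. A minimal sketch, assuming a hypothetical `SceneDate` shape and slugs (neither is defined in the vault), converts everything to years before present, where "present" is 1950 CE by convention:

```typescript
type Era = "BP" | "BCE";

interface SceneDate {
  slug: string; // hypothetical identifier, for illustration only
  year: number;
  era: Era;
}

// Put BP and BCE dates on one axis (years before 1950 CE).
// There is no year 0, so 1 BCE is 1950 BP and n BCE is n + 1949 BP.
function toYearsBP(d: SceneDate): number {
  return d.era === "BP" ? d.year : d.year + 1949;
}

// Oldest scene first; returns a copy and leaves the input untouched.
function sortOldestFirst(scenes: SceneDate[]): SceneDate[] {
  return [...scenes].sort((a, b) => toYearsBP(b) - toYearsBP(a));
}

// Gap between consecutive scenes, rounded to whole millennia.
function gapsInMillennia(scenes: SceneDate[]): number[] {
  const gaps: number[] = [];
  let prev: number | undefined;
  for (const s of sortOldestFirst(scenes)) {
    const bp = toYearsBP(s);
    if (prev !== undefined) gaps.push(Math.round((prev - bp) / 1000));
    prev = bp;
  }
  return gaps;
}

// Launch-set dates as given in the scene table, deliberately out of order.
const launchSet: SceneDate[] = [
  { slug: "eridu", year: 5500, era: "BCE" },
  { slug: "qafzeh", year: 92000, era: "BP" },
  { slug: "anyang", year: 1250, era: "BCE" },
  { slug: "gobekli-tepe", year: 9600, era: "BCE" },
];

// sortOldestFirst(launchSet).map((s) => s.slug)
//   → ["qafzeh", "gobekli-tepe", "eridu", "anyang"]
// gapsInMillennia(launchSet) → [80, 4, 4]
```

All four dates are approximate ("~", "c."), so the 1949-year offset matters only for keeping the ordering logic correct, not for precision.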
Data from the vault
All four launch scenes are already evidenced at tier 1–2; no new research is required, only narration authoring against existing extractions:
- Qafzeh (~92 ka BP) — claim-qafzeh-ochre-earliest-symbolic-behavior: deliberately transported (~35 km) and heat-treated red ochre; ochre-stained perforated Glycymeris shells worn as ornaments; intentional burials in a shared stratigraphic horizon. Honest-mode skeleton: grave, ochre, shells. Imagined: the people, the gestures, any grief. The claim's own counter-evidence (utilitarian ochre uses; Blombos/Skhul priority) feeds the footer.
- Göbekli Tepe (c. 9600 BCE) — schmidt-2010-gobekli-tepe-worlds-oldest-temple: Enclosure D's T-shaped pillars to 5.5 m with belt/loincloth reliefs (stylised anthropomorphs), high-relief foxes, snakes, vultures, scorpions; wild-game feasting debris; no dwellings, no hearths in-site. Honest mode keeps the stones and the bones; the dawn gathering and any rite are imagined.
- Eridu Level XVIII (c. 5500 BCE) — eridu-temple-level-xviii-earliest-cult-building + eridu-temple-sequence-archaeology: ~12 × 15 ft mud-brick room, central altar/podium, cult niche, fish bones and ash on the altar floor. The narration must not name Enki — sumerian-religion explicitly flags that identification as later back-projection; honest mode demotes "the god of the sweet water below" to imagined.
- Anyang (c. 1250 BCE) — shang-religion + oracle-bones-earliest-dated-chinese-religious-writing: heat applied to prepared ox scapulae/turtle plastrons; charge–crack–reading–verification structure; the king as ultimate divination authority, professional bu diviners; radiocarbon-dated to Wu Ding's reign 1254–1197 BCE. Richest scene — here even the words spoken can be attested (inscribed charges survive verbatim). Counter-evidence footer: royal ritual only; commoner religion invisible.
Frontmatter for scene notes follows the vault contract (title, type: scene, domain, tier, status, created, tags + attestation_earliest, sources, counter_evidence), so scenes validate and render like every other note type.
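A presence check over those keys might look like the sketch below. It assumes every listed key is required and that `sources` is a non-empty list; the vault's actual validation rules are not shown here, so treat the function name, the rules, and the example values as hypothetical.

```typescript
type Frontmatter = Record<string, unknown>;

// Keys named in the contract sentence above (standard keys, then scene-specific ones).
const SCENE_KEYS = [
  "title", "type", "domain", "tier", "status", "created", "tags",
  "attestation_earliest", "sources", "counter_evidence",
] as const;

// Returns a list of problems; an empty list means the frontmatter passes this sketch's checks.
function validateSceneFrontmatter(fm: Frontmatter): string[] {
  const errors: string[] = [];
  for (const key of SCENE_KEYS) {
    const value = fm[key];
    if (value === undefined || value === null || value === "") {
      errors.push(`missing: ${key}`);
    }
  }
  const type = fm["type"];
  if (type !== undefined && type !== "scene") {
    errors.push(`type must be "scene", got "${String(type)}"`);
  }
  const sources = fm["sources"];
  if (sources !== undefined && !(Array.isArray(sources) && sources.length > 0)) {
    errors.push("sources must be a non-empty list");
  }
  return errors;
}

// Example with hypothetical values for domain, tier, status, created, and tags.
const exampleFm: Frontmatter = {
  title: "Eridu, Level XVIII",
  type: "scene",
  domain: "mesopotamian",
  tier: 1,
  status: "draft",
  created: "2026-01-01",
  tags: ["scene"],
  attestation_earliest: "c. 5500 BCE",
  sources: ["eridu-temple-sequence-archaeology"],
  counter_evidence: ["Enki identification is later back-projection"],
};
```

Returning a list of messages rather than throwing lets a renderer show every problem in a draft scene at once.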
Implementation sketch
Bun only, zero npm deps, all inside the existing tools/server.ts (currently 467 lines; this adds ~150).
- Content: one markdown file per scene, `type: scene`, living beside the notes they dramatise (e.g. `02_mesopotamian/2_notes/scene-eridu-level-xviii.md`). Narration beats are list items with an evidence-class prefix the renderer parses:
  - `[A] On the altar, fish, offered, burning.` → attested
  - `[N] The figure in the niche watches you.` → inferred
  - `[X] The first stars are out over the marsh.` → imagined

  Each beat may end with `← source-note`, which the renderer lifts into the flip-side panel.
- Server: extend `NOTE_TYPES` with `"scene"` and add a `renderScene(fm, body, rel)` branch in the existing `.md` handler, following the same pattern as `renderNote`. Scene image resolved by the existing `illo()` convention: `assets/scenes/<slug>.png`.
- CSS (no JS needed for motion): full-bleed wrapper (`position:fixed; inset:0`), image at 115% scale with a 60 s `@keyframes` Ken Burns drift; honest-mode via a `.honest` class on `<body>` driving `filter: saturate(.25) brightness(.7)` on the image and per-class opacity on `[data-ev=X]` spans; card flip via the standard `transform: rotateY(180deg)` + `backface-visibility` pair; all transitions ~600 ms ease.
- Vanilla JS (~40 lines, inline like the existing version-poll script): pointer-move parallax (`translate3d` of ±12 px on the image layer, lerped in `requestAnimationFrame`); `IntersectionObserver` to fade narration beats in as the user scrolls; the toggle and flip as class flips; `deviceorientation` parallax on mobile behind a feature check. `prefers-reduced-motion` kills Ken Burns and parallax.
- Discovery: tradition/claim pages that have a sibling scene get the "Stand there" card automatically (filename convention `scene-*`), and a `/scenes` index page lists all four chronologically (the deep-time walk). INDEX.md links `/scenes` once implemented (AGENTS.md §5: implemented experiences are linked from INDEX).
- Server stays read-only, no external requests, port 4870 unchanged.
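The beat syntax in the Content bullet is simple enough to parse with one regular expression. This is a minimal sketch of that step, not the actual `renderScene` code: `Beat`, `parseBeats`, and `flipSideEntries` are hypothetical names, and the exact line shape (a list marker, a bracketed letter, then an optional trailing `← slug`) is an assumption drawn from the examples.

```typescript
type EvidenceClass = "attested" | "inferred" | "imagined";

interface Beat {
  ev: EvidenceClass;
  text: string;
  source: string | null; // note slug after "←", or null when the beat cites nothing
}

// Prefix letters as listed in the Content bullet.
const PREFIX_TO_CLASS: Record<string, EvidenceClass> = {
  A: "attested",
  N: "inferred",
  X: "imagined",
};

// Assumed line shape: "- [A] Sentence text. ← source-note" (source part optional).
const BEAT_RE = /^\s*[-*]\s+\[([ANX])\]\s+(.*?)(?:\s*←\s*([\w-]+))?\s*$/;

function parseBeats(body: string): Beat[] {
  const beats: Beat[] = [];
  for (const line of body.split("\n")) {
    const m = BEAT_RE.exec(line);
    if (m === null) continue; // not a beat line, skip it
    const ev = PREFIX_TO_CLASS[m[1] ?? ""];
    if (ev === undefined) continue;
    const source: string | undefined = m[3];
    beats.push({ ev, text: m[2] ?? "", source: source ?? null });
  }
  return beats;
}

// One flip-side entry per beat; uncited beats are labelled as atmosphere.
function flipSideEntries(beats: Beat[]): string[] {
  return beats.map((b) => `${b.text} → ${b.source ?? "atmosphere, unlicensed"}`);
}
```

Each parsed beat carries enough to emit both the narration span (with its `data-ev` value) and the matching line on the flip side, so the two views cannot drift apart.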
Images needed
GPT-5.5 via codex CLI only (AGENTS.md §5). Shared suffix for every prompt: photorealistic, cinematic 35mm photography, natural light, archaeologically accurate, no text or lettering anywhere in the image, no watermark, no modern objects, 21:9 aspect.
1. Qafzeh burial, ~92,000 BP — Interior of a limestone cave in the Galilee, late afternoon light raking through the entrance. In a shallow pit in the cave floor, a deliberate human burial being covered: anatomically modern people with Middle Paleolithic dress (animal hide), lumps of deep red ochre beside the grave, small perforated seashells stained red on a hide cord. Dust motes in the light shaft, earth tones, the red ochre the only saturated color. Mood: hushed, intimate, vast time.
2. Göbekli Tepe at dawn, c. 9600 BCE — Inside a circular stone enclosure on a barren hilltop in southeastern Anatolia at first light. Two massive T-shaped limestone pillars, 5.5 m tall, at center; ring of smaller T-pillars in a rough stone wall; crisp low-relief carvings of foxes, snakes, vultures and scorpions on pillar shafts; carved belt and arms on the central pillars suggesting stylised giant figures. Cold blue dawn sky grading to amber at the horizon, breath-fog, scattered wild-game bones and grinding stones near the wall. No buildings, no fields, emptiness beyond the enclosure. Mood: monumental, eerie, pre-everything.
3. Eridu temple platform at dusk, c. 5500 BCE — A small rectangular mud-brick shrine (about 4 × 5 m) on a low clay platform at the edge of a southern Mesopotamian freshwater marsh, dusk. Through the single doorway: a low mud-brick altar podium with small fish offerings smoldering on it, thin smoke drifting out the door, a shadowed cult niche in the far wall with something indistinct inside. Reed beds and still water catching the last light, first stars above. Mud-brick texture detailed and hand-formed. Mood: the first temple, quiet, liminal.
4. Shang divination chamber, c. 1250 BCE — Interior of a royal hall at Yinxu (Anyang), night, lit by bronze lamps and a brazier. A Shang diviner in Bronze-Age Chinese court dress presses a glowing hardwood brand to a prepared ox scapula; a fresh crack spreads across the bone with visible heat-shimmer; the king seated beyond in shadow, attendants with turtle plastrons and bronze ritual vessels (taotie-pattern surfaces, no writing visible). Smoke, lamplight on bronze, deep shadow. Mood: tense, official, the moment before the answer.
5. (Honest-mode plate, optional per scene) — Same camera position as each scene above but as the excavated reality: e.g. for Eridu, the eroded mud-brick wall stubs and altar base of Temple Level XVIII in an excavation trench, midday, neutral documentation light. Used as the desaturated cross-fade target when the honesty toggle is on; if budget-limited, CSS desaturation of the main plate suffices and these are deferred.
Effort
M overall. The server work is small and self-contained; the cost centers are image generation/iteration to the "take you back in time" bar and narration authoring with per-sentence evidence discipline.
- Ships first (S, ~half a day): the scene renderer in `server.ts` (scene note type, full-bleed layout, Ken Burns, narration beats, honesty toggle with CSS desaturation) + one scene: Eridu (smallest cast, strongest tier-1 grounding, and it banners the vault's flagship Mesopotamia domain). Proves the format end-to-end.
- Second (S): Göbekli Tepe and Anyang scenes. Notes are rich, prompts above are ready; mostly image iteration + narration.
- Third (M): Qafzeh (hardest image: must be reverent, not lurid; review against AGENTS.md §1 before publishing), `/scenes` chronological index, "Stand there" cards on parent notes, INDEX link.
- Later / optional (M): honest-mode excavation plates (prompt 5), gyro parallax polish, scroll-driven camera nudges per beat.