Bias Audit — Wave 1 (2026-06-11)
Scope: all 51 content notes in 01_prehistoric … 09_comparative (37 tier-1 sources, 13 tier-2 notes, 1 tier-3 synthesis), plus 00_meta/OPEN_QUESTIONS.md, audited against AGENTS.md, 00_meta/Methodology.md, and all six templates. Method: full read of every file, plus systematic rg scans for frontmatter completeness, enum validity, url/url_verified consistency, confidence×evidence-class pairing, and wiki-link resolution.
Headline: no missing mandatory frontmatter field anywhere (a genuinely clean scan — every claim has counter_evidence, every motif has a transmission enum, every source has url_verified). The violations live one level down: in how fields are used, in cross-domain contradictions, and in the workflow loop.
The field-presence scan passed 51/51. The violations are semantic:
| File:field | Violation |
|---|---|
| 02_mesopotamian/1_sources/atrahasis-epic-cuneiform-tablets.md:url | url points to en.wikipedia.org/wiki/Gilgamesh_flood_myth while citation names Lambert & Millard 1969. The URL is not the source; it is a tertiary page about a different (related) topic. Key extractions explicitly cite "Wikipedia: Gilgamesh flood myth" and a "COJS summary note" as the basis of quoted claims. |
| 02_mesopotamian/1_sources/enuma-elish-creation-myth-tablets.md:url | Same pattern: citation = Talon 2005 / Lambert 2013, URL = Wikipedia article, extractions cite "Wikipedia: Enūma Eliš." |
| 02_mesopotamian/1_sources/eridu-temple-sequence-archaeology.md:url | Citation = Safar/Lloyd/Mustafa 1981 monograph; URL = worldhistory.org (quality-tertiary at best). |
| 01_prehistoric/1_sources/hovers-et-al-2003…md:url_verified | URL present (uchicago DOI page) but url_verified: not-online. not-online means offline source; with a URL recorded the honest values are yes or no. Same misuse in 03_egyptian/1_sources/stevenson-predynastic-burials-2009.md (a live open-access PDF URL!), 05_abrahamic/1_sources/blenkinsopp…2008.md, 08_indigenous/1_sources/nunn-reid-2016….md. Four files. (04_indo_european's watkins and west notes use no correctly; that is the pattern to copy.) |
| 06_dharmic/2_notes/vedic-religion.md:attestation_earliest | Labels "c. 1500–1200 BCE (composition of earliest Rigveda hymns, oral)" as "class: 2-text". Methodology.md is explicit: composition estimates are class 3-reconstruction; only attestation grounds the field. The same domain's own rigveda-oral-composition-attestation-gap.md states this rule correctly, so the tradition profile contradicts its neighbor. |
| 06_dharmic/2_notes/ and 09_comparative/2_notes/:sources | Source arrays without wiki-link syntax ("bronkhorst-greater-magadha-2007" vs the "…" used everywhere else). Links resolve by name but break graph tooling consistency. |
| 01_prehistoric/1_sources/schmidt-2010…md:tradition | Göbekli Tepe filed under tradition: paleolithic-mortuary-religion. The site is Pre-Pottery Neolithic, not Paleolithic, and (per the note's own text) has produced no burials. The analytical category is being stretched to hold a site that fits neither of its words. |
| 09_comparative/2_notes/great-flood.md (occurrences table) | Greek, Indian, and Chinese rows carry "(tier-1 note pending)": dated assertions resting on in-row citations with no tier-1 record. Honestly flagged, but as it stands a tier-2 note is carrying tier-1 load for three traditions. |
Broken wiki-links: 28 distinct targets do not exist. Worst offenders: every Feeds into section written by the 02, 03, 05, 07 gatherers names tier-2 notes that were never written (el-yahweh-relationship, canaanite-substrate-claim, pie-dragon-slaying-formula ×3, shinto-origins-archaeology-vs-text, ubaid-religious-continuity, etc.). Two are wrong slugs for notes that do exist: dying-and-rising-god → actual dying-rising-god (in 05/israelite-religion-origins.md) and indo-european-sky-father → actual sky-father (in 06/vedic-religion.md). Full list preserved in the scan; the two wrong slugs are the cheapest fixes with the highest navigation payoff.
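The link-resolution scan described above can be approximated in a few lines. This is a minimal sketch, not the scan that was actually run: it assumes links use the `[[slug]]` form (optionally `[[slug|alias]]` or `[[slug#heading]]`) and that a link resolves when a note with that file name exists anywhere in the vault. The function name `broken_links` and the in-memory `vault` dictionary are hypothetical.

```python
import re

# Assumption: wiki-links look like [[target]], [[target|alias]] or [[target#heading]].
# Only the target slug is captured.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def broken_links(notes: dict[str, str]) -> dict[str, list[str]]:
    """Map each missing link target to the sorted list of notes that reference it.

    `notes` maps a note slug (file name without .md) to the note's full text.
    """
    known = set(notes)
    missing: dict[str, set[str]] = {}
    for slug, text in notes.items():
        for match in WIKI_LINK.finditer(text):
            target = match.group(1).strip()
            if target not in known:
                missing.setdefault(target, set()).add(slug)
    return {target: sorted(refs) for target, refs in sorted(missing.items())}


if __name__ == "__main__":
    # Toy vault reproducing the two wrong slugs named in the audit.
    vault = {
        "dying-rising-god": "Motif note.",
        "sky-father": "Motif note.",
        "israelite-religion-origins": "Feeds into [[dying-and-rising-god]].",
        "vedic-religion": "See [[indo-european-sky-father]] and [[sky-father]].",
    }
    for target, refs in broken_links(vault).items():
        print(f"{target} <- {', '.join(refs)}")
```

Grouping by target rather than by file keeps the report short when one phantom target is named from several notes, as with pie-dragon-slaying-formula.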
Orphaned source: 07_east_asian/1_sources/hardacre-shinto-history-diversity.md feeds into two notes that don't exist and is cited by nothing. A tier-1 note with zero tier-2 consumers and phantom feeds-into links is gathered-but-unused inventory.
Counter-evidence is mostly genuinely strong: several counter_evidence fields (04 dyeus, 08 aboriginal flooding, 07 oracle bones, 09 dying-rising) are real steelmen with named scholars and stated mechanisms. The weakest three in the vault:
1. 00_meta/Methodology.md — "Hybrid transmission … blurs the trichotomy; the 'unresolved' value and per-hypothesis evidence sections exist for this." This is a counter-argument that defuses itself in the same sentence. The actual steelman against the trichotomy (motifs have no sound-law analogue, so the linguistic metaphor may not transfer; Berezkin's own coding-decision critique) is sitting unused in 09/berezkin-motif-database.md.
2. 03_egyptian/2_notes/ancient-egyptian-religion.md — one prong (continuity-model challenge) against a profile that asserts dozens of claims spanning 4,000 years, Atenism's theological category, and maat theology. Nothing counter-evidences the Atenism or function-tagging sections at all.
3. 02_mesopotamian/2_notes/sumerian-religion.md — same single-prong pattern: only Ubaid-ethnolinguistic continuity is challenged; the pantheon systematization, function claims, and the "high" overall confidence go unopposed.
Pattern worth naming: tradition profiles systematically get weaker counter-evidence than claim notes. Claims are atomic so the steelman is forced to engage; profiles are sprawling so the steelman picks one limb. Either profiles need multi-prong counter_evidence or their confidence should be read as applying only to the counter-evidenced limb.
- 05_abrahamic — school over-weight, the clearest case. Cross (Harvard) + Smith (Cross's student, same framework — the note's own reliability section admits it) + Blenkinsopp (endorses the Kenite hypothesis Cross also accepts). All three tier-1 sources sit inside one broadly continuist consensus. The dissenters (Day, Emerton, Hadley, Kaufmann, Tebes) appear only inside counter-evidence prose, never as tier-1 records. The steelman is being written from memory of the opposition, not from gathered opposition.
- 04_indo_european — all three sources are pro-reconstruction comparativists (Mallory-Adams, Watkins, West). No skeptical source on the limits of comparative mythology. Counter-evidence is again internal.
- 08_indigenous: all three sources are proponents of bold contested theses (Lewis-Williams's shamanism model, Nunn-Reid's deep-time oral memory, Witzel's Laurasian/Gondwana scheme). The notes themselves calibrate beautifully (confidence: low, twice), but the critics doing the real work in those notes (Henige, Hiscock, Bahn, Hutton) have no tier-1 records. The domain gathered the claims and cited the critiques from memory.
- 06_dharmic and 07_east_asian — the models to copy. 06 pairs a revisionist (Bronkhorst) against the mainstream (Olivelle) plus a deliberately-included methodologically-weak historical source (Marshall 1931) used as an object of critique. 07 pairs Keightley vs Eno as named interpretive rivals on Di. This is what per-domain diversity should look like.
- 09_comparative — no methodology over-weight found. Smith vs Mettinger is an explicit steelman pair; philology (West), epigraphy (Linear B), and areal statistics (Berezkin) are distinct methods. The one gap: function-theory scholarship in the synthesis (Durkheim, Norenzayan, Sosis, Bentzen, Malinowski, the Whitehouse retraction) has zero tier-1 records — already self-filed as Q2/Q8/Q9, so noted, not double-counted.
- Era skew (vault-wide): 02's textual anchors are 1969/1981 editions accessed through tertiary summaries; nothing in 02 post-dates Lambert 2013, and Wasserman 2020 is mentioned but not used.
The hard rule — high resting on class 3-reconstruction/4-ethnography — is violated zero times in frontmatter. Genuinely impressive: 04 keeps the PIE deity claim at medium with attestation_earliest: speculative; 08 holds both deep-time claims at low; 09's high verdicts are scoped and class-2 anchored.
Real calibration problems found instead:
1. 02/atrahasis-antedates-gilgamesh-flood-tablet.md — confidence: high violates the AGENTS §2.4 Wikipedia cap. Its sole tier-1 source is the atrahasis source note whose extractions are explicitly grounded in "Wikipedia: Gilgamesh flood myth" and a COJS summary. AGENTS.md: tertiary-grounded claims are "capped at medium until upgraded." The claim is almost certainly correct (09's independently-gathered George 2003 note supports it), but as cited within its own domain it is high-on-Wikipedia. Either re-source 02 against Lambert & Millard / George directly, or drop to medium.
2. The same descent verdict carries medium in 04 and high in 09. 04/dyeus-sky-father-cognate-set.md concludes the claim "cannot be elevated to high without either Anatolian evidence or direct PIE texts"; 09/sky-father.md rates the descent verdict high, citing Anatolian evidence (šiu-, Luwian Tiwaz) that 04 says doesn't exist (see §6). The scoping differs slightly (deity-reconstruction vs name-formula descent), but no reader can recover that from the frontmatter, and neither note acknowledges the other.
3. 05/el-yahweh-identification-canaanite.md — body says "high for the presence of El-identification, medium for the mechanism"; frontmatter says only high. The split is the honest value; the single field overstates.
4. Single-source highs: 02/atrahasis-antedates… (1 source), 03/pyramid-texts-oldest-large-corpus.md (1 source, and a world-superlative claim whose own counter_evidence concedes the Sumerian comparison is unresolved). Both satisfy the ≥1-source letter; neither satisfies its spirit at high.
5. Speculative-labeling is otherwise good — 04's attestation_earliest: speculative, 08's split anchors ("transmission chain undated"), and 09's "(composition inferential)" tags are exactly per Methodology. The single mislabel is 06/vedic-religion.md (§1).
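The two mechanical rules in this section (no high on class 3/4 evidence, and the medium cap for tertiary-grounded claims) could be checked along these lines. This is a sketch under stated assumptions: "resting on" is read as "every listed evidence class is 3-reconstruction or 4-ethnography", and whether a claim is tertiary-grounded is passed in as a flag decided elsewhere. The function name and the flag are hypothetical, not vault fields.

```python
WEAK_CLASSES = {"3-reconstruction", "4-ethnography"}


def calibration_flags(
    confidence: str,
    evidence_classes: list[str],
    tertiary_grounded: bool,
) -> list[str]:
    """Return the calibration rules a claim note breaks (empty list if none)."""
    flags = []
    if confidence == "high":
        # Hard rule, as interpreted here: high may not rest solely on class 3/4 evidence.
        if evidence_classes and set(evidence_classes) <= WEAK_CLASSES:
            flags.append("high rests only on class 3/4 evidence")
        # Cap quoted from AGENTS.md: tertiary-grounded claims stay at medium until upgraded.
        if tertiary_grounded:
            flags.append("tertiary-grounded: capped at medium until upgraded")
    return flags


if __name__ == "__main__":
    # Illustrative inputs only; they are not read from the vault.
    print(calibration_flags("high", ["2-text"], tertiary_grounded=True))
    print(calibration_flags("medium", ["3-reconstruction"], tertiary_grounded=False))
```

The check stays silent on split verdicts such as the high/medium case in item 3; those still need a human reading of the note body.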
No advocacy and no debunking agenda found anywhere — the emic sections in 05 (Jewish/Christian/Islamic self-account rendered respectfully and at length), 06 (śruti/apauruṣeya framing), and 08 (Dreaming as everywhen, with an explicit warning against stripping emic richness) are exemplary. Three mild items:
- 06/rigveda-oral-composition-attestation-gap.md glosses the OIT camp as "Out-of-India (OIT) school and Hindu nationalist scholars." The motive attribution does argumentative work the evidence sections already do better; "rejected as methodologically unsound" suffices. Mild tone leak, debunking-direction.
- 01/claim-qafzeh-ochre…md Support section: shell/burial co-occurrence "rules out purely functional explanation" overclaims relative to its own source note ("does not fully foreclose functional alternatives") and its own counter-evidence ("the argument is not airtight"). Etic overstatement inside Support, corrected elsewhere in the same note.
- 01/claim-qafzeh… Support also imports "the Durkheimian threshold for 'social fact'" as if it were an evidential standard: theory smuggled in as measurement. Cosmetic.
Sky-father (04 vs 09) — contradictions, not just divergence:
| Point | 04_indo_european/dyeus-sky-father-cognate-set.md | 09_comparative/sky-father.md |
|---|---|---|
| Luwian | "Luwian shows no cognate either" | "Luwian Tiwaz (sun god)" listed as Anatolian root reflex, 2-text |
| Hittite | "Hittite does not have Dyeus as a theonym… uses Anu" — treated as a gap threatening the reconstruction | "root reflex šiu- 'god' in Old Hittite copies c. 16th c. BCE (2-text)" — treated as an attestation anchor |
| Germanic Týr | *Tîwaz presented as direct continuation of the theonym | "via *deywós 'celestial one'" (per West) — i.e. a derivative, not the theonym itself |
| Rigveda oldest ms. | "manuscript 1464 CE" (BORI/Pune) | "manuscripts c. 14th c. CE" (rigveda-dyaus row) |
| Confidence in descent | medium ("cannot be elevated to high…") | high |
| Mutual awareness | does not link sky-father | does not link the 04 note |
The Anatolian rows are genuinely irreconcilable as written: West 2007 (cited by both domains) treats šiu-/Tiwaz as root reflexes with the caution that the figure must be argued — 09 transmits the caution, 04 flatly denies the cognates. One of them must change.
Rigveda manuscript date — three-way contradiction: 06/rigveda-oral-composition-attestation-gap.md says oldest ms. c. 1040 CE, Nepal; 04/rigveda-oral-gap-composition-vs-attestation.md says 1464 CE, BORI Pune ("No earlier physical manuscript … is known to exist in any collection" — directly contradicted by 06); 09/rigveda-dyaus.md says c. 14th c. CE. These are two near-duplicate claim notes (04 and 06 cover the same atomic claim — itself a one-claim-per-note breach across domains) plus a source note, none linking the others. The oral-gap arithmetic differs by ~400 years depending on which you read.
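The size of the disagreement can be made concrete with a short calculation. The sketch below takes the late bound of the "c. 1500–1200 BCE" composition estimate quoted from 06/vedic-religion.md as an assumed starting point; the oral-gap notes may use a different figure. The helper name is hypothetical.

```python
def years_between(start_bce: int, end_ce: int) -> int:
    """Elapsed years from a BCE year to a CE year (the calendar has no year 0)."""
    return start_bce + end_ce - 1


# Assumption: late bound of the composition estimate quoted in 06/vedic-religion.md.
COMPOSITION_BCE = 1200

# Oldest-manuscript dates as stated in the two claim notes.
MANUSCRIPT_CE = {"06 (Nepal)": 1040, "04 (BORI Pune)": 1464}

for note, year in MANUSCRIPT_CE.items():
    print(f"{note}: oral gap of {years_between(COMPOSITION_BCE, year)} years")

print("spread between notes:", MANUSCRIPT_CE["04 (BORI Pune)"] - MANUSCRIPT_CE["06 (Nepal)"], "years")
```

Under that assumption the gap is 2,239 years by the 06 date and 2,663 years by the 04 date, a spread of 424 years whatever composition date is chosen.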
Flood (02 vs 09) — broadly consistent, three minor divergences: (a) SB Gilgamesh redaction dated "c. 1300–1000 BCE" in 02 vs "c. 1200–1100 BCE" in 09; (b) 02's counter-evidence holds open that OB Gilgamesh "may not have contained" a flood account where 09 (per George) states the OB version "did not include the flood story" — strength mismatch; (c) 02 hedges that Eridu Genesis "may represent an … older tradition (possibly oral c. 2800 BCE)" while 09 leans on Civil's suspicion that the Sumerian text is dependent on the Akkadian — opposite leans, both flagged as open, acceptable but worth one reconciling sentence. Verdict agreement is solid: both call Genesis←Mesopotamia contact. One enum misuse: 02 labels the Gilgamesh-XI-from-Atrahasis relation "descent" — by the vault's definitions (descent = inherited from common ancestor) direct literary incorporation within one scribal culture is not descent; 09 correctly describes it as textual dependence without forcing the enum.
Attestation dates that do agree (recorded for fairness): Atrahasis tablets (Ammi-ṣaduqa reign — "1646–1626" vs "c. 1635" are the same epigraphic fact), Eridu Genesis tablet c. 1600 BCE, Pyramid Texts c. 2350 BCE (03 ↔ 09), Kuntillet Ajrud c. 800 BCE, oracle bones 1254–1197 BCE.
- No invented-looking URLs. All nine recorded URLs are plausibly real (DOI paths match the cited articles for Hovers 2003, Blenkinsopp 2008, Nunn-Reid 2016; the escholarship ID matches Stevenson's known entry).
- The three url_verified: yes (all in 02) are honest about access but dishonest in role: what was verified is a Wikipedia/worldhistory page, recorded in the url field of a note whose citation is a print monograph. The result is a note that looks like a verified scholarly source and is a verified tertiary summary. This is the single most corrosive pattern found, because every downstream confidence judgment inherits it silently (§4.1).
- Four not-online with URLs present (§1): wrong enum, low malice. Three are paywalled DOI pages, but Stevenson's is an open-access PDF that could simply have been fetched and marked yes. 04's two url_verified: no entries are the correct honest pattern for known-but-unfetched URLs.
| # | Severity | Violation | Location | Remediation |
|---|---|---|---|---|
| 1 | Critical | Tier-1 sources grounded in Wikipedia/tertiary pages while citing scholarly editions; url field misrepresents source basis | all 3 sources in 02_mesopotamian/1_sources/ | Re-extract from Lambert & Millard 1969 / George 2003 / Lambert 2013 (09's notes show it's doable); or relabel extractions as tertiary-derived and cap downstream confidence |
| 2 | Critical | confidence: high on claim whose evidence chain is Wikipedia-grounded (AGENTS §2.4 cap = medium) | 02/2_notes/atrahasis-antedates-gilgamesh-flood-tablet.md | Drop to medium until #1 fixed, or re-source against 09/1_sources/epic-of-gilgamesh-tablet-xi.md (George) and keep high |
| 3 | Critical | Three-way factual contradiction: Rigveda oldest manuscript = 1040 CE (06) vs 14th c. (09) vs 1464 CE (04, with an explicit "no earlier ms. exists" assertion) | 04/2_notes/rigveda-oral-gap…, 06/2_notes/rigveda-oral-composition…, 09/1_sources/rigveda-dyaus.md | One agent verifies the actual ms. record; merge 04+06 into one claim note (one claim per note) with cross-links; file a Q if genuinely unresolved |
| 4 | Critical | Open Questions loop broken: ~30 open questions emitted in domain notes (01: 4, 02: 4, 03: 5, 04: 8, 05: 5, 06: 8, 07: 4, 08: 5+) never copied to the register, despite template mandate | 00_meta/OPEN_QUESTIONS.md vs every tradition/claim note's "Open questions" section | Harvest all domain-note open questions into the register as Q12+; make register-sync an explicit gatherer checklist item |
| 5 | Major | Anatolian-evidence contradiction: 04 denies Hittite/Luwian cognates that 09 cites as dated attestations | 04/2_notes/dyeus-sky-father-cognate-set.md vs 09/2_notes/sky-father.md | Reconcile against West 2007 ch. 4 (root reflexes real; theonym-as-sky-father absent — both notes should state both halves); cross-link the notes |
| 6 | Major | Same descent verdict rated medium (04) and high (09) with no mutual reference | same two notes | After #5, align scoped confidences explicitly: high for name/formula descent, medium for deity-reconstruction; each note links the other |
| 7 | Major | 28 broken wiki-links, incl. 2 wrong slugs to existing notes and 4+ phantom "Feeds into" targets per domain | vault-wide (list in §1) | Fix the 2 wrong slugs now; for phantom targets either write the stub or rename the link to an existing note; add link-check to wave gate |
| 8 | Major | url_verified: not-online used for sources with recorded URLs (incl. one open-access PDF) | hovers (01), stevenson (03), blenkinsopp (05), nunn-reid (08) | Change to no, or fetch and mark yes; reserve not-online for URL-less print sources |
| 9 | Major | Composition estimate labeled class 2-text in attestation_earliest, contradicting Methodology and the same domain's own claim note | 06/2_notes/vedic-religion.md | Relabel composition as 3-reconstruction inference; keep Mitanni 1380 BCE as the honest 2-text anchor |
| 10 | Major | School over-weight with no opposing tier-1 sources: 05 (Harvard/continuist ×3), 04 (pro-reconstruction ×3), 08 (thesis-proponents ×3) | domain 1_sources/ rosters | Wave 2: one dissenting tier-1 source per domain (Day 2000 for 05; a comparative-method critique for 04; Henige's response paper for 08) |
| 11 | Minor | Orphaned tier-1 source with phantom feeds-into | 07/1_sources/hardacre-shinto-history-diversity.md | Write the Shinto attestation-gap claim note (the material is ready) or mark the source as wave-2 inventory |
| 12 | Minor | Göbekli Tepe (PPN, no burials) filed under paleolithic-mortuary-religion | 01/1_sources/schmidt-2010…md, 01/2_notes/paleolithic-mortuary-religion.md | Rename category or split a pre-pottery-neolithic-ritual tradition note |
| 13 | Minor | transmission: descent enum applied to intra-cultural literary dependence | 02/2_notes/atrahasis-antedates…md | Describe as textual dependence; reserve enum for cross-tradition verdicts |
| 14 | Minor | sources arrays without wiki-link syntax | 06/2_notes/ (3 files), 09/2_notes/ (3 files) | Normalize; cheap sed |
| 15 | Minor | Frontmatter confidence: high flattens body's explicit high/medium split | 05/2_notes/el-yahweh-identification-canaanite.md | Scope the field ("high (presence) / medium (mechanism)") or take the lower bound |
| 16 | Minor | SB Gilgamesh redaction date divergence (1300–1000 vs 1200–1100 BCE); OB-Gilgamesh-flood strength mismatch ("may not have" vs "did not") | 02 vs 09 (§6) | Adopt George 2003 values in 02 |
| 17 | Minor | World-superlative claim at high on a single tier-1 source | 03/2_notes/pyramid-texts-oldest-large-corpus.md | Add a Mesopotamian-corpus source or scope title to "earliest Egyptian/continuous corpus" |
| 18 | Minor | Inconsistent evidence_class for the same corpus (oracle bones: 1-archaeology in keightley, 2-text in eno) | 07/1_sources/* | Pick the convention (inscribed artifacts = 1+2 dual, as the 07 claim notes already do) and apply to sources |
| 19 | Minor | Mild tone leaks: "Hindu nationalist scholars" motive-tagging; "rules out purely functional explanation" overclaim | 06/2_notes/rigveda…gap.md, 01/2_notes/claim-qafzeh…md | Delete motive attribution; soften to "weighs against" |
| 20 | Minor | Three occurrence-table rows resting on "(tier-1 note pending)" | 09/2_notes/great-flood.md | Write the three pending source notes (Greek/Deucalion, Manu, Gun-Yu) in wave 2 |
Counts: 4 critical · 6 major · 10 minor.
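For item 14, the normalization could equally be scripted in Python instead of sed. This sketch assumes the sources field is a single-line inline array of double-quoted slugs and that the target form is `"[[slug]]"`; multi-line YAML lists would need a real YAML parser. The function names are hypothetical.

```python
import re

QUOTED = re.compile(r'"([^"]*)"')


def _wrap(match: re.Match) -> str:
    """Wrap a quoted entry in [[...]] unless it is already a wiki-link."""
    inner = match.group(1)
    return match.group(0) if inner.startswith("[[") else f'"[[{inner}]]"'


def normalize_sources_line(line: str) -> str:
    """Rewrite a frontmatter `sources:` line so every entry is a wiki-link.

    Lines for any other key are returned unchanged.
    """
    key, sep, rest = line.partition(":")
    if not sep or key.strip() != "sources":
        return line
    return key + sep + QUOTED.sub(_wrap, rest)


if __name__ == "__main__":
    print(normalize_sources_line('sources: ["bronkhorst-greater-magadha-2007"]'))
```

The rewrite is idempotent, so it can run over all six files without first checking which entries are already linked.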
1. Fix the 02 source chain first (#1–#2). It is the only place where tertiary grounding presented as scholarly grounding, the pattern closest to the vault's strongest stated taboo, feeds a high confidence. The repair is cheap because 09 already gathered the proper editions for the same texts, so 02 can largely re-cite them.
2. Run a reconciliation pass before any new gathering on overlapping topics. Rigveda manuscripts (#3) and Anatolian sky-father evidence (#5–#6) prove that two agents writing about the same fact will diverge. Concretely: any wave-2 note touching a topic with an existing note in another domain must link it and state agreement/disagreement in one sentence.
3. Restore the Open Questions loop (#4): harvest the ~30 stranded domain-note questions into the register, then have wave-2 gatherers draw assignments from it — the register is currently synthesizing-agent monoculture (Q1–Q11 all from 00/09), which is itself a bias channel.
4. Buy one dissenting source per skewed domain (#10). The counter-evidence fields are currently written from the gatherers' memory of critics; gathering Day 2000, Henige's response, and a comparative-mythology critique converts remembered steelmen into sourced ones — and tests whether they survive contact with the actual texts.
5. Make two scans part of the wave gate: (a) wiki-link resolution (this wave: 28 broken), (b) url ↔ url_verified consistency (this wave: 7 files). Both are mechanically checkable; neither should reach the human reviewer again.
6. Tradition-profile counter-evidence standard: require at least one counter-evidence prong per major section of a tradition profile (origins, continuity, functions), or explicitly scope the confidence field to the counter-evidenced part (§2 pattern).
7. Keep doing what 06/07/09 did: rival-source pairing (Keightley/Eno, Smith/Mettinger, Bronkhorst/Olivelle) produced the vault's best-calibrated notes. That design choice — not reviewer vigilance — is the scalable bias control.