L3 · Interpretation

The organ that asks the questions you didn't think to ask.

Designed · MVP 5–6

GEDS is a relevance and inference engine over three quantities: E — events and evidence, what actually happened; T — tensions, themes, and expectations, what should hold; A — anomalies and discontinuities, what breaks the expectation. Given any two, it solves for the third. Given a live stream and no confident interpretation, it prepares a handoff instead of guessing.

The old page drew this organ as two clocks; the slow clock survives as a habit, not an identity. GEDS is designed to run continuously, digesting the episodes that GONS packages from raw session life, and it never waits for you to know what to ask.

You are the sensor. Every operational signal that matters has to pass through your attention before becoming a question, and every question has to clear your attention again before becoming an answer. This is a bandwidth ceiling, and most organizations have been hitting it for a long time.

Worse: sensors have blind spots. The patterns that compress your operations best are exactly the patterns you can't see — because if you could see them, you'd have already named them. Asking yourself "what am I missing?" is fundamentally limited; the answers can't come from the entity doing the asking.

The Mechanism

Five modes, one algebra

Everything GEDS is designed to do is a rearrangement of E, T, and A. Three modes hold two quantities and solve for the third; the fourth starts from a tension alone and builds the controls that validate a detector; the fifth admits the limit of solving and prepares a handoff. Mode 1 ships first, at MVP 5, and it ships under lab discipline: run the controls, then test the unknowns.
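One way to read the algebra is as a dispatch on which quantities are already in hand. The sketch below is illustrative only: `Mode` and `select_mode` are hypothetical names, not a GEDS interface, and treating each quantity as a simple present/absent flag is an assumption.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    ANOMALY_EXTRACTION = 1       # E + T -> A
    TENSION_INFERENCE = 2        # E + A -> T
    EVIDENCE_RECONSTRUCTION = 3  # T + A -> E
    METHOD_DEVELOPMENT = 4       # T -> E/A controls
    HANDOFF = 5                  # E -> handoff packet


def select_mode(have_e: bool, have_t: bool, have_a: bool) -> Optional[Mode]:
    """Pick the mode implied by which of E, T, A are available (hypothetical helper)."""
    if have_e and have_t and not have_a:
        return Mode.ANOMALY_EXTRACTION
    if have_e and have_a and not have_t:
        return Mode.TENSION_INFERENCE
    if have_t and have_a and not have_e:
        return Mode.EVIDENCE_RECONSTRUCTION
    if have_t and not have_e and not have_a:
        return Mode.METHOD_DEVELOPMENT
    if have_e and not have_t and not have_a:
        return Mode.HANDOFF
    return None  # everything known, or nothing usable: no mode applies


# A live stream with standing tensions and no named anomaly yet selects Mode 1.
print(select_mode(have_e=True, have_t=True, have_a=False))  # Mode.ANOMALY_EXTRACTION
```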

MODE 1 · E + T → A — ANOMALY EXTRACTION First to ship · MVP 5

Given the event stream and the standing tensions, extract what does not fit. This mode ships under a lab rule: it must prove itself against controlled corpora before it is allowed an opinion about live work. A detector that has never found a known-abnormal case has no business declaring an unknown one.
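The lab rule can be sketched as a gate that a detector must clear before it sees live episodes. This is a minimal sketch under stated assumptions: the `passes_controls` helper, the dict-shaped episodes, the field names, and the default thresholds are all hypothetical, chosen only to make the idea concrete.

```python
from typing import Callable, Iterable

# A detector returns True when an episode looks anomalous (assumed signature).
Detector = Callable[[dict], bool]


def passes_controls(
    detector: Detector,
    normal: Iterable[dict],
    known_abnormal: Iterable[dict],
    min_recall: float = 1.0,            # assumed: must find every known failure
    max_false_alarm_rate: float = 0.0,  # assumed: must not flag routine work
) -> bool:
    """Return True only if the detector proves itself on both control corpora."""
    normal = list(normal)
    known_abnormal = list(known_abnormal)
    if not normal or not known_abnormal:
        return False  # no controls, no opinion about live work
    recall = sum(detector(ep) for ep in known_abnormal) / len(known_abnormal)
    false_alarms = sum(detector(ep) for ep in normal) / len(normal)
    return recall >= min_recall and false_alarms <= max_false_alarm_rate


def green_but_hollow(ep: dict) -> bool:
    """Toy detector: suite is green while the assertion count fell."""
    return ep["suite_green"] and ep["assertion_delta"] < 0


normal = [
    {"suite_green": True, "assertion_delta": 3},
    {"suite_green": True, "assertion_delta": 0},
]
abnormal = [{"suite_green": True, "assertion_delta": -12}]

print(passes_controls(green_but_hollow, normal, abnormal))  # True
```

A detector that never fires fails the same check, because it finds none of the known-abnormal controls.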

MODE 2 · E + A → T — TENSION INFERENCE

Given events and a confirmed anomaly, infer the tension that explains both. Example: an agent changed the tests, the tests pass, and coverage narrowed. The inferred tension: local task completion vs global architecture integrity. Candidate tensions are scored against held-out episodes, and a name survives only if it predicts behavior it has not seen. The label has to earn the feature, not the other way around.
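Scoring against held-out episodes might look like the following. It is a sketch, not the GEDS scoring method: each candidate tension is modeled as a simple predictor, accuracy stands in for whatever loss the real design uses, and every name and field here is hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# A candidate tension is modeled as a rule that predicts an outcome (assumption).
Predictor = Callable[[dict], bool]


def held_out_accuracy(predict: Predictor, episodes: Sequence[dict], target: str) -> float:
    """Fraction of unseen episodes where the prediction matches the outcome."""
    hits = sum(predict(ep) == ep[target] for ep in episodes)
    return hits / len(episodes)


def surviving_tensions(
    candidates: Dict[str, Predictor],
    held_out: Sequence[dict],
    target: str,
    baseline: Predictor,
) -> List[str]:
    """Keep only the names that beat the baseline on episodes they never saw."""
    floor = held_out_accuracy(baseline, held_out, target)
    return [
        name
        for name, predict in candidates.items()
        if held_out_accuracy(predict, held_out, target) > floor
    ]


held_out = [
    {"tests_edited_first": True, "coverage_narrowed": True},
    {"tests_edited_first": False, "coverage_narrowed": False},
    {"tests_edited_first": True, "coverage_narrowed": True},
    {"tests_edited_first": False, "coverage_narrowed": False},
]
candidates = {
    "local completion vs global integrity": lambda ep: ep["tests_edited_first"],
    "everything erodes": lambda ep: True,
}
survivors = surviving_tensions(
    candidates, held_out, "coverage_narrowed", baseline=lambda ep: False
)
print(survivors)  # ['local completion vs global integrity']
```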

MODE 3 · T + A → E — EVIDENCE RECONSTRUCTION

Given a tension and an anomaly, reconstruct the evidence chain that connects them. The smoking gun often needs prior context: the event that matters most rarely looks like anything on its own. Mode 3 walks the ledger backward and assembles the chain, attaching a reason-relevant note to every event it recruits, so the story is checkable instead of persuasive.
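The backward walk can be sketched as follows. The ledger shape, the `Link` record, and the `why_relevant` callback are assumptions made for illustration; the event identifiers reuse those from the example packet, and `e-9015` is an invented noise event.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Link:
    event_id: str
    reason: str  # why this event matters to the tension/anomaly pair


def reconstruct_chain(
    ledger: Sequence[dict],
    anomaly_index: int,
    why_relevant: Callable[[dict], Optional[str]],
) -> List[Link]:
    """Walk backward from the anomaly and recruit only events that carry a reason."""
    chain: List[Link] = []
    for event in reversed(ledger[: anomaly_index + 1]):
        reason = why_relevant(event)
        if reason:  # an event with no stated reason is never recruited
            chain.append(Link(event["id"], reason))
    chain.reverse()  # present the chain in chronological order
    return chain


ledger = [
    {"id": "e-9012", "kind": "test_modified"},
    {"id": "e-9015", "kind": "lint_run"},  # hypothetical noise event
    {"id": "e-9020", "kind": "suite_passed"},
    {"id": "e-9034", "kind": "claim_renewed"},
]
reasons = {
    "test_modified": "assertions weakened, not extended",
    "suite_passed": "pass count stable while assertions dropped",
    "claim_renewed": "nothing in the pipeline objected",
}
chain = reconstruct_chain(ledger, 3, lambda ev: reasons.get(ev["kind"]))
for link in chain:
    print(link.event_id, "->", link.reason)
```

Because every link carries its reason, a reviewer can reject a single link without discarding the whole chain.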

MODE 4 · T → E/A CONTROLS — METHOD DEVELOPMENT

Given a tension, construct the corpus that would train and validate its detector: normal examples, obvious violations, subtle violations, false positives, near misses, delayed-relevance cases. This is the same discipline as running HPLC controls before testing unknowns: an analyst does not trust a method that has never seen a known. Mode 4 is how GEDS earns the right to run Mode 1 on a new tension.
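A control corpus is only useful if it covers every kind of case listed above. The check below is a small sketch of that idea; the `kind` field, the snake_case labels, and the `missing_control_kinds` helper are hypothetical.

```python
from typing import Iterable, List

# The six kinds of control named in Mode 4, as assumed label strings.
CONTROL_KINDS = (
    "normal",
    "obvious_violation",
    "subtle_violation",
    "false_positive",
    "near_miss",
    "delayed_relevance",
)


def missing_control_kinds(corpus: Iterable[dict]) -> List[str]:
    """List the control kinds a corpus still lacks, in the canonical order."""
    present = {example["kind"] for example in corpus}
    return [kind for kind in CONTROL_KINDS if kind not in present]


corpus = [
    {"kind": "normal"},
    {"kind": "obvious_violation"},
    {"kind": "subtle_violation"},
    {"kind": "false_positive"},
    {"kind": "delayed_relevance"},
]
print(missing_control_kinds(corpus))  # ['near_miss']
```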

MODE 5 · E → HANDOFF PACKETS — CANDIDATE RELEVANCE

Given a live stream and no confident interpretation, prepare a packet for a stronger reasoner: this cluster may matter · it resembles prior failure mode X · there is not enough evidence yet · escalate. This is the humble mode, and the one the whole engine bends toward. It knows when it does not know, and it says so in a structured, citable form instead of guessing.

The Defining Move

GEDS passes the ball to LeBron

LLMs are not dumb. Put the right structure in front of a frontier model and it will interpret themes and anomalies with real skill. The hard problem is upstream: noticing the right structure inside a large, messy stream. That is what GEDS is for. It prepares reality into the form a strong reasoner can reason over: bounded episodes, named tensions, evidence chains with a reason attached to every link.

Final interpretation may belong to a frontier model, or to the human at the gate. GEDS does not need to be the smartest thing in the building. It needs to make the smartest thing in the building useful, and to know when the ball should leave its hands.

The Artifact

The handoff packet

Mode 5's output is not an alert and not a dashboard widget. It is a structured artifact: everything a stronger reasoner needs, nothing it has to dig for, every identifier a citation into the GIMS ledger. The shape below is the design target.

PACKET      geds/pkt/0042
MODE        5 · CANDIDATE RELEVANCE
EPISODE     ep-2031 · auth-refactor sprint · session goms-agent-auth-1
RELEVANCE   this cluster may matter: test surface shrank while status stayed green
TENSIONS    T-11  local task completion vs global architecture integrity
            T-04  velocity vs review depth
ANOMALIES   A-77  assertion count fell across three commits, suite stayed green
            A-78  tests edited before implementation in two of three commits
EVIDENCE    e-9012  modified tests/auth/session.test.ts
                    ↳ relevant: assertions weakened, not extended
            e-9020  suite passed 212/212
                    ↳ relevant: pass count stable while assertions dropped
            e-9034  work claim renewed under normal-mode, no gate triggered
                    ↳ relevant: nothing in the pipeline objected
RESEMBLES   failure mode FM-3 "green-but-hollow" · 1 prior episode
CONFIDENCE  medium: predicts · survives contradiction · decay clock running
MISSING     no coverage baseline for tests/auth/** prior to ep-2031
RECIPIENT   frontier model for interpretation → human for the merge gate
QUESTION    did local completion trade away global test integrity, and should the merge gate require a coverage floor for this path?

Confidence is one of three words (low, medium, or high), never a decimal pretending to be a measurement. The MISSING field is mandatory: a packet that claims to be missing nothing is treated as suspicious. And the last field is always a question, because the packet's job is to start interpretation, not to end it.
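The three packet rules lend themselves to a mechanical check. The sketch below assumes a simplified packet with flat string fields; `HandoffPacket` and its `problems` method are hypothetical names, and the real artifact would cite GIMS ledger records instead of holding free text.

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_WORDS = ("low", "medium", "high")


@dataclass
class HandoffPacket:
    packet_id: str
    episode: str
    relevance: str
    tensions: List[str]
    anomalies: List[str]
    evidence: List[str]
    confidence: str
    missing: List[str]
    recipient: str
    question: str

    def problems(self) -> List[str]:
        """Return every packet rule this packet breaks (empty list means it may leave)."""
        found = []
        if self.confidence not in CONFIDENCE_WORDS:
            found.append("confidence must be one of: low, medium, high")
        if not self.missing:
            found.append("missing is empty: a packet missing nothing is suspicious")
        if not self.question.strip().endswith("?"):
            found.append("question must be phrased as a question")
        return found


packet = HandoffPacket(
    packet_id="geds/pkt/0042",
    episode="ep-2031",
    relevance="test surface shrank while status stayed green",
    tensions=["T-11", "T-04"],
    anomalies=["A-77", "A-78"],
    evidence=["e-9012", "e-9020", "e-9034"],
    confidence="medium",
    missing=["no coverage baseline for tests/auth/** prior to ep-2031"],
    recipient="frontier model, then human at the merge gate",
    question="did local completion trade away global test integrity?",
)
print(packet.problems())  # []
```

Returning a list of named problems, instead of raising on the first one, keeps the rejection itself citable.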

The Hard Part

Five gates against confabulation

Pattern engines are confidence-shaped. They will find structure in noise and present it with the same posture as real structure. So every packet passes the gates before it leaves GEDS — whatever the mode, whoever the recipient. Most candidates will not make it, and that is the design working.

PREDICTS

The feature must improve prediction on held-out operational data. Not retrofit. Not narrative. Measurably reduces loss.

SURVIVES CONTRADICTION

Active search for counter-examples. The feature survives only if disconfirming evidence is bounded and named.

TESTABLE

There exists an observable that would change in the next window if the hypothesis is right. No untestable claims surface.

DECAYS

Confidence diminishes over time without fresh evidence. A hypothesis cannot accumulate trust by simply persisting.

UNCERTAIN

The system reports its own confidence and its blind spots. A hypothesis that comes with no caveats is itself a red flag.

Why so strict? Because the costs are asymmetric. Surfacing a true pattern is valuable. Surfacing a false pattern is worse than silence: it consumes scarce attention and calibrates the operator toward misplaced trust. The gates are aggressive on purpose, and the label has to earn the feature at every one of them.
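The five gates can be sketched as one function that names every gate a candidate fails. Everything here is an assumption made for illustration: the `Candidate` fields, the one-step-per-week decay schedule, and the counter-example bound are hypothetical, and `ep-1999` is an invented episode identifier.

```python
from dataclasses import dataclass, field
from typing import List, Optional

LEVELS = ("low", "medium", "high")


@dataclass
class Candidate:
    loss_without: float            # held-out loss without the feature
    loss_with: float               # held-out loss with the feature
    counterexamples: List[str]     # named disconfirming episodes
    max_counterexamples: int       # assumed bound on tolerated disconfirmation
    observable: Optional[str]      # what should change in the next window
    last_confidence: str           # confidence when evidence last arrived
    reported_confidence: str       # confidence the packet claims now
    days_since_evidence: int
    caveats: List[str] = field(default_factory=list)


def decayed(level: str, days: int, window_days: int = 7) -> str:
    """Step confidence down one word per window without fresh evidence."""
    return LEVELS[max(0, LEVELS.index(level) - days // window_days)]


def failed_gates(c: Candidate) -> List[str]:
    """Name every gate the candidate fails; an empty list means it may surface."""
    failures = []
    if not c.loss_with < c.loss_without:
        failures.append("PREDICTS")
    if len(c.counterexamples) > c.max_counterexamples:
        failures.append("SURVIVES CONTRADICTION")
    if not c.observable:
        failures.append("TESTABLE")
    ceiling = decayed(c.last_confidence, c.days_since_evidence)
    if LEVELS.index(c.reported_confidence) > LEVELS.index(ceiling):
        failures.append("DECAYS")
    if not c.caveats:
        failures.append("UNCERTAIN")
    return failures


good = Candidate(0.40, 0.31, ["ep-1999"], 2, "assertion count in tests/auth/**",
                 "high", "medium", 8, ["no coverage baseline"])
stale = Candidate(0.40, 0.40, ["a", "b", "c"], 2, None, "high", "high", 21)

print(failed_gates(good))   # []
print(failed_gates(stale))  # all five gate names
```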

The Diet

Episodes, not raw logs

GONS is designed to see the raw life of every session: keystrokes, retries, dead ends, tool noise. Its design packages that life into episodes, bounded records that a reasoner can hold in one hand. GEDS digests episodes, never raw terminal scrollback. Interpretation over curated structure, not archaeology.

episode := { objective · actors · timeline · artifacts · outcome · candidate labels · review status }
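As a typed record, the episode shape above might look like this. The field types, the default review status, and the sample values are assumptions; only the seven field names come from the definition.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Episode:
    objective: str
    actors: List[str]
    timeline: List[str]     # ordered event ids, assumed to cite the ledger
    artifacts: List[str]
    outcome: str
    candidate_labels: List[str] = field(default_factory=list)
    review_status: str = "unreviewed"  # assumed default until someone signs off


episode = Episode(
    objective="auth-refactor sprint",
    actors=["goms-agent-auth-1"],
    timeline=["e-9012", "e-9020", "e-9034"],
    artifacts=["tests/auth/session.test.ts"],
    outcome="suite passed 212/212",
    candidate_labels=["green-but-hollow"],
)
print(episode.review_status)  # unreviewed
```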

CORPUS 1

Normal operation

What routine looks like: sessions that started, worked, and closed as expected. The baseline that makes deviation computable at all.

CORPUS 2

Known-abnormal

Labeled failures: the stall, the loop, the green-but-hollow test pass. The detector must find these before it is trusted with anything live.

CORPUS 3

Ambiguity

Cases where the honest answer was "not sure yet." GEDS needs to learn not just what is abnormal, but when it does not know yet.

The three corpora live in GIMS like everything else: episodes are ledger records, labels are new sentences, and review status is an approval with a name on it. Memory before autonomy, applied to the interpreter itself.

The original proposal (deep dive)

The full GEDS proposal predates the five-mode reframe. It remains the deep record of where the engine came from: its compression core became Mode 1, and its gates became the packet discipline you see on this page.

→ ORIGINAL PROPOSAL

Where It Fits

GEDS in the stack

GEDS is L3, the interpretation layer. Episodes will arrive from GONS and packets will go back through it; the corpus lives in the GIMS ledger; tensions about how work decomposes come from GOMS; and Mode 5's recipients sit above the stack entirely: a frontier model, or the human at the gate.

GEDS — The Relevance Engine

Five modes over E·T·A. Digests episodes, runs controls before unknowns, emits packets. Designed · MVP 5–6.

⇄ GONS

Episodes in, packets out. GONS-Core will package raw session life; GEDS answers with structured relevance, routed back through the foreman. GEDS never talks to an agent directly.

← GIMS

The corpus lives in the ledger. Episodes, labels, controls, and review statuses are all sentences; GEDS reads them and writes new ones.

← GOMS

Tensions about decomposition: when a goal's sprint plan and its actual execution pull apart, that gap is E+A material for Mode 2.

→ Human · Frontier LLM

Mode 5 recipients. The packet goes to whoever can interpret it best; the human at the gate holds final judgment either way.

GAMS · GRAMS

Later layers. Allocation and the market membrane will have plenty for GEDS to interpret; neither gates the factory.

The lower layers react to events; GEDS is the layer designed to propose. It never reads a raw terminal, it never messages an agent directly, and it never gets the last word: it prepares the question, and something stronger answers it.