The organ that asks the questions you didn't think to ask.
Designed · MVP 5–6
GEDS is a relevance and inference engine over three quantities:
E — events and evidence, what actually happened;
T — tensions, themes, and expectations, what should hold;
A — anomalies and discontinuities, what breaks the expectation.
Given any two, it solves for the third. Given a live stream and no confident
interpretation, it prepares a handoff instead of guessing.
The old page drew this organ as two clocks; the slow clock survives as a habit, not an
identity. GEDS is designed to run continuously, digesting the episodes that GONS
packages from raw session life, and it never waits for you to know what to ask.
You are the sensor. Every operational signal that matters has to pass through
your attention before becoming a question, and every question has to clear your attention
again before becoming an answer. This is a bandwidth ceiling, and most organizations have
been hitting it for a long time.
Worse: sensors have blind spots. The patterns that compress your operations
best are exactly the patterns you can't see — because if you could see them, you'd have
already named them. Asking yourself "what am I missing?" is fundamentally limited; the
answers can't come from the entity doing the asking.
The Mechanism
Five modes, one algebra
Everything GEDS is designed to do is a rearrangement of E, T, and A. Four modes hold two
quantities and solve for the third; the fifth mode admits the limit of solving and prepares
a handoff. Mode 1 ships first, at MVP 5, and it ships under lab discipline:
run the controls, then test the unknowns.
MODE 1 · E + T → A — ANOMALY EXTRACTION First to ship · MVP 5
Given the event stream and the standing tensions, extract what does not fit. Before this
mode is allowed an opinion about live work, it must prove itself against controlled
corpora. A detector that has never found a known-abnormal has no
business declaring an unknown one.
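One way to picture the lab rule is as a gate that sits in front of the detector. The sketch below is an illustration under assumptions, not the GEDS design: the `Episode` fields, the `cleared_for_live` name, the thresholds, and the episode IDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Episode:
    episode_id: str
    assertion_delta: int  # hypothetical feature: change in assertion count
    suite_green: bool


# A detector returns True when an episode does not fit the standing tension.
Detector = Callable[[Episode], bool]


def cleared_for_live(
    detector: Detector,
    known_normal: Iterable[Episode],
    known_abnormal: Iterable[Episode],
    min_recall: float = 1.0,       # assumed threshold
    max_false_alarm: float = 0.0,  # assumed threshold
) -> bool:
    """Run the controls; only a detector that passes may test unknowns."""
    normal = list(known_normal)
    abnormal = list(known_abnormal)
    if not normal or not abnormal:
        return False  # no controls, no opinion about live work
    recall = sum(detector(e) for e in abnormal) / len(abnormal)
    false_alarm = sum(detector(e) for e in normal) / len(normal)
    return recall >= min_recall and false_alarm <= max_false_alarm


def hollow_green(e: Episode) -> bool:
    # Flags a suite that stays green while assertions disappear.
    return e.suite_green and e.assertion_delta < 0


normal_controls = [Episode("ep-n1", 4, True), Episode("ep-n2", 0, True)]
abnormal_controls = [Episode("ep-a1", -6, True)]
```

A detector that flags nothing fails this gate as surely as one that flags everything, which is the point of running both control sets.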
MODE 2 · E + A → T — TENSION INFERENCE
Given events and a confirmed anomaly, infer the tension that explains both. Example: an
agent changed the tests, the tests pass, and coverage narrowed. The inferred tension:
local task completion vs global architecture integrity. Candidate tensions are
scored against held-out episodes, and a name survives only if it predicts behavior it has
not seen. The label has to earn the feature, not the other way around.
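Held-out scoring can be sketched in a few lines. Everything here is an assumption made for illustration: episodes are plain dicts of boolean features, `regressed` is a hypothetical outcome label, and a candidate tension is reduced to a predictor function.

```python
from typing import Callable, Dict, List, Mapping, Sequence

Episode = Dict[str, bool]
Predictor = Callable[[Episode], bool]


def held_out_accuracy(predict: Predictor, held_out: Sequence[Episode]) -> float:
    """Fraction of unseen episodes whose outcome the candidate predicts."""
    hits = sum(predict(e) == e["regressed"] for e in held_out)
    return hits / len(held_out)


def surviving_tensions(
    candidates: Mapping[str, Predictor],
    held_out: Sequence[Episode],
    baseline: float,
) -> List[str]:
    """Keep only the names that beat the baseline on episodes they never saw."""
    return [
        name
        for name, predict in candidates.items()
        if held_out_accuracy(predict, held_out) > baseline
    ]


held_out = [
    {"tests_edited": True, "coverage_narrowed": True, "regressed": True},
    {"tests_edited": True, "coverage_narrowed": False, "regressed": False},
    {"tests_edited": False, "coverage_narrowed": False, "regressed": False},
    {"tests_edited": False, "coverage_narrowed": True, "regressed": True},
]
candidates = {
    "local task completion vs global architecture integrity":
        lambda e: e["coverage_narrowed"],
    "velocity vs review depth":
        lambda e: e["tests_edited"],
}
# Baseline is the majority-class rate on this toy set (two of four regressed).
survivors = surviving_tensions(candidates, held_out, baseline=0.5)
```

In this toy set only the first name survives: the second predicts no better than guessing, so it does not earn the label.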
MODE 3 · T + A → E — EVIDENCE RECONSTRUCTION
Given a tension and an anomaly, reconstruct the evidence chain that connects them.
The smoking gun often needs prior context: the event that matters most
rarely looks like anything on its own. Mode 3 walks the ledger backward and assembles
the chain, attaching a reason-relevant note to every event it recruits, so the story is
checkable instead of persuasive.
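The backward walk might look like the sketch below. The `Event`, `Link`, and `reconstruct_chain` names are hypothetical, the relevance function stands in for whatever judgment the real engine would apply, and event `e-9015` is invented as an example of an event that recruits no reason.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Event:
    event_id: str
    summary: str


@dataclass(frozen=True)
class Link:
    event: Event
    reason: str  # why this event belongs in the chain


# Returns a reason if the event bears on the tension/anomaly pair, else None.
Relevance = Callable[[Event], Optional[str]]


def reconstruct_chain(
    ledger: Sequence[Event], anomaly_index: int, relevance: Relevance
) -> List[Link]:
    """Walk backward from the anomaly, recruiting only events with a reason."""
    chain: List[Link] = []
    for event in reversed(ledger[: anomaly_index + 1]):
        reason = relevance(event)
        if reason is not None:
            chain.append(Link(event, reason))
    chain.reverse()  # report the chain in chronological order
    return chain


ledger = [
    Event("e-9012", "modified tests/auth/session.test.ts"),
    Event("e-9015", "reformatted README"),  # hypothetical noise event
    Event("e-9020", "suite passed 212/212"),
    Event("e-9034", "work claim renewed under normal-mode, no gate triggered"),
]
reasons = {
    "e-9012": "assertions weakened, not extended",
    "e-9020": "pass count stable while assertions dropped",
    "e-9034": "nothing in the pipeline objected",
}
chain = reconstruct_chain(ledger, anomaly_index=3, relevance=lambda e: reasons.get(e.event_id))
```

Because a link cannot be created without a reason string, an event with no stated relevance never enters the story.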
MODE 4 · T → E/A CONTROLS — METHOD DEVELOPMENT
Given a tension, construct the corpus that would train and validate its detector:
normal examples, obvious violations, subtle violations, false positives, near misses,
delayed-relevance cases. This is the same discipline as running HPLC
controls before testing unknowns: an analyst does not trust a method that has
never seen a known. Mode 4 is how GEDS earns the right to run Mode 1 on a new tension.
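A control corpus is only useful if every kind of control is present. As a rough sketch, assuming a corpus is a list of `(episode_id, kind)` pairs and that one example per kind is the minimum (both are assumptions, as are the names), the completeness check could be:

```python
from collections import Counter
from enum import Enum
from typing import List, Tuple


class ControlKind(Enum):
    NORMAL = "normal example"
    OBVIOUS_VIOLATION = "obvious violation"
    SUBTLE_VIOLATION = "subtle violation"
    FALSE_POSITIVE = "false positive"
    NEAR_MISS = "near miss"
    DELAYED_RELEVANCE = "delayed-relevance case"


def missing_kinds(
    corpus: List[Tuple[str, ControlKind]], min_per_kind: int = 1
) -> List[ControlKind]:
    """List the control kinds the corpus still lacks, in definition order."""
    counts = Counter(kind for _, kind in corpus)
    return [kind for kind in ControlKind if counts[kind] < min_per_kind]


# Hypothetical episode IDs, for illustration only.
corpus = [
    ("ep-c1", ControlKind.NORMAL),
    ("ep-c2", ControlKind.OBVIOUS_VIOLATION),
    ("ep-c3", ControlKind.SUBTLE_VIOLATION),
    ("ep-c4", ControlKind.FALSE_POSITIVE),
    ("ep-c5", ControlKind.NEAR_MISS),
]
gaps = missing_kinds(corpus)
```

Here the corpus has no delayed-relevance case, so under this sketch the tension would not yet be cleared for Mode 1.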
MODE 5 · E → HANDOFF PACKETS — CANDIDATE RELEVANCE
Given a live stream and no confident interpretation, prepare a packet for a stronger
reasoner: this cluster may matter · it resembles prior failure mode X · there is not
enough evidence yet · escalate. This is the humble mode, and the one the whole
engine bends toward. It knows when it does not know, and it says so in
a structured, citable form instead of guessing.
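Read together, the five modes amount to a lookup on which quantities are in hand. The table below is a sketch of that reading, with hypothetical names. Returning `None` for combinations the page does not describe, such as all three quantities known, is an assumption.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    ANOMALY_EXTRACTION = 1       # E + T -> A
    TENSION_INFERENCE = 2        # E + A -> T
    EVIDENCE_RECONSTRUCTION = 3  # T + A -> E
    METHOD_DEVELOPMENT = 4       # T -> E/A controls
    CANDIDATE_RELEVANCE = 5      # E -> handoff packet


_MODE_BY_KNOWN = {
    frozenset("ET"): Mode.ANOMALY_EXTRACTION,
    frozenset("EA"): Mode.TENSION_INFERENCE,
    frozenset("TA"): Mode.EVIDENCE_RECONSTRUCTION,
    frozenset("T"): Mode.METHOD_DEVELOPMENT,
    frozenset("E"): Mode.CANDIDATE_RELEVANCE,
}


def select_mode(known: str) -> Optional[Mode]:
    """Map the known quantities (letters from 'E', 'T', 'A') to a mode."""
    return _MODE_BY_KNOWN.get(frozenset(known))
```

For example, `select_mode("TE")` selects anomaly extraction, and `select_mode("E")` falls through to the handoff mode.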
The Defining Move
GEDS passes the ball to LeBron
LLMs are not dumb. Put the right structure in front of a frontier model and it will
interpret themes and anomalies with real skill. The hard problem is upstream:
noticing the right structure inside a large, messy stream. That is what
GEDS is for. It prepares reality into the form a strong reasoner can reason over: bounded
episodes, named tensions, evidence chains with a reason attached to every link.
Final interpretation may belong to a frontier model, or to the human at the gate. GEDS
does not need to be the smartest thing in the building. It needs to make the
smartest thing in the building useful, and to know when the ball should leave
its hands.
The Artifact
The handoff packet
Mode 5's output is not an alert and not a dashboard widget. It is a structured artifact:
everything a stronger reasoner needs, nothing it has to dig for, every identifier a citation
into the GIMS ledger. The shape below is the design target.
PACKET      geds/pkt/0042 · MODE 5 · CANDIDATE RELEVANCE
EPISODE     ep-2031: auth-refactor sprint · session goms-agent-auth-1
RELEVANCE   this cluster may matter: test surface shrank while status stayed green
TENSIONS    T-11    local task completion vs global architecture integrity
            T-04    velocity vs review depth
ANOMALIES   A-77    assertion count fell across three commits, suite stayed green
            A-78    tests edited before implementation in two of three commits
EVIDENCE    e-9012  modified tests/auth/session.test.ts
                    ↳ relevant: assertions weakened, not extended
            e-9020  suite passed 212/212
                    ↳ relevant: pass count stable while assertions dropped
            e-9034  work claim renewed under normal-mode, no gate triggered
                    ↳ relevant: nothing in the pipeline objected
RESEMBLES   failure mode FM-3 "green-but-hollow" · 1 prior episode
CONFIDENCE  medium: predicts · survives contradiction · decay clock running
MISSING     no coverage baseline for tests/auth/** prior to ep-2031
RECIPIENT   frontier model for interpretation → human for the merge gate
QUESTION    did local completion trade away global test integrity, and should
            the merge gate require a coverage floor for this path?
Confidence is one of three words: low, medium, or high. It is never a decimal
pretending to be a measurement. The missing-data field is mandatory: a packet that
claims to be missing nothing is treated as suspicious. And the last field is always a
question, because the packet's job is to start interpretation, not to end it.
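These three rules are simple enough to check mechanically. The sketch below assumes a pared-down `HandoffPacket` with hypothetical field names; it is not the packet schema, only the rules applied to a minimal shape.

```python
from dataclasses import dataclass
from typing import List

CONFIDENCE_WORDS = ("low", "medium", "high")


@dataclass
class HandoffPacket:
    packet_id: str
    relevance: str
    confidence: str
    missing: List[str]
    question: str


def packet_problems(packet: HandoffPacket) -> List[str]:
    """Return every rule the packet breaks; an empty list means it may leave."""
    problems: List[str] = []
    if packet.confidence not in CONFIDENCE_WORDS:
        problems.append("confidence must be low, medium, or high")
    if not packet.missing:
        problems.append("missing-data field is empty: treat as suspicious")
    if not packet.question.strip().endswith("?"):
        problems.append("final field must be a question")
    return problems


good = HandoffPacket(
    packet_id="geds/pkt/0042",
    relevance="test surface shrank while status stayed green",
    confidence="medium",
    missing=["no coverage baseline for tests/auth/** prior to ep-2031"],
    question=(
        "did local completion trade away global test integrity, and should "
        "the merge gate require a coverage floor for this path?"
    ),
)
bad = HandoffPacket("geds/pkt/0000", "something looks off", "0.73", [], "Merge is unsafe.")
```

The first packet reports no problems; the second breaks all three rules at once.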
The Hard Part
Five gates against confabulation
Pattern engines are confidence-shaped. They will find structure in noise and present it with
the same posture as real structure. So every packet passes the gates before
it leaves GEDS — whatever the mode, whoever the recipient. Most candidates will not make it,
and that is the design working.
01
PREDICTS
The feature must improve prediction on held-out operational data. Not retrofit. Not narrative. It must measurably reduce loss.
02
SURVIVES CONTRADICTION
Active search for counter-examples. The feature survives only if disconfirming evidence is bounded and named.
03
TESTABLE
There exists an observable that would change in the next window if the hypothesis is right. No untestable claims surface.
04
DECAYS
Confidence diminishes over time without fresh evidence. A hypothesis cannot accumulate trust by simply persisting.
05
UNCERTAIN
The system reports its own confidence and its blind spots. A hypothesis that comes with no caveats is itself a red flag.
Why so strict? Because the costs are asymmetric. Surfacing a true pattern is valuable.
Surfacing a false pattern is worse than silence: it consumes scarce attention and
calibrates the operator toward misplaced trust. The gates are aggressive on purpose, and
the label has to earn the feature at every one of them.
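The five gates can be read as a checklist that names every gate a candidate fails. The sketch below is one possible reading, not the GEDS implementation: the `Candidate` fields, the exponential freshness decay, the 14-day half-life, and the 0.25 floor are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    held_out_loss_delta: float   # negative means the feature reduced loss
    counterexamples_named: bool  # disconfirming evidence is bounded and named
    next_window_observable: str  # empty if nothing observable would change
    days_since_evidence: float
    caveats: List[str]


def freshness(days: float, half_life_days: float = 14.0) -> float:
    """Assumed exponential decay: halves every half_life_days without evidence."""
    return 0.5 ** (days / half_life_days)


def failed_gates(c: Candidate, min_freshness: float = 0.25) -> List[str]:
    """Name each gate the candidate fails; an empty list means it may surface."""
    failed: List[str] = []
    if c.held_out_loss_delta >= 0:
        failed.append("PREDICTS")
    if not c.counterexamples_named:
        failed.append("SURVIVES CONTRADICTION")
    if not c.next_window_observable:
        failed.append("TESTABLE")
    if freshness(c.days_since_evidence) < min_freshness:
        failed.append("DECAYS")
    if not c.caveats:
        failed.append("UNCERTAIN")
    return failed


fresh = Candidate(-0.04, True, "assertion count in tests/auth/**", 3.0,
                  ["no coverage baseline"])
stale = Candidate(-0.04, True, "assertion count in tests/auth/**", 42.0, [])
```

The freshness number stays internal to the gate in this sketch; a packet would still report its confidence as one of the three words.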
The Diet
Episodes, not raw logs
GONS is designed to see the raw life of every session: keystrokes, retries, dead ends, tool noise. It is
designed to package that life into episodes, bounded records that a reasoner can hold
in one hand. GEDS digests episodes, never raw terminal scrollback: interpretation over
curated structure, not archaeology.
CORPUS 1
Known-normal
What routine looks like: sessions that started, worked, and closed as expected. The
baseline that makes deviation computable at all.
CORPUS 2
Known-abnormal
Labeled failures: the stall, the loop, the green-but-hollow test pass. The detector
must find these before it is trusted with anything live.
CORPUS 3
Ambiguity
Cases where the honest answer was "not sure yet." GEDS needs to learn not just what is
abnormal, but when it does not know yet.
The three corpora live in GIMS like everything else: episodes are ledger records, labels are
new sentences, and review status is an approval with a name on it. Memory before autonomy,
applied to the interpreter itself.
The original proposal (deep dive)
The full GEDS proposal predates the five-mode reframe. It remains the deep record of where
the engine came from: its compression core became Mode 1, and its gates became the packet
discipline you see on this page.
GEDS is L3, the interpretation layer. Episodes will arrive from GONS and packets will go
back through it; the corpus lives in the GIMS ledger; tensions about how work decomposes
come from GOMS; and Mode 5's recipients sit above the stack entirely: a frontier model, or
the human at the gate.
GEDS — The Relevance Engine
Five modes over E·T·A. Digests episodes, runs controls before unknowns, emits packets. Designed · MVP 5–6.
⇄ GONS
Episodes in, packets out. GONS-Core will package raw session life; GEDS answers with structured relevance, routed back through the foreman. Nobody talks directly.
← GIMS
The corpus lives in the ledger. Episodes, labels, controls, and review statuses are all sentences; GEDS reads them and writes new ones.
← GOMS
Tensions about decomposition: when a goal's sprint plan and its actual execution pull apart, that gap is E+A material for Mode 2.
→ Human · Frontier LLM
Mode 5 recipients. The packet goes to whoever can interpret it best; the human at the gate holds final judgment either way.
GAMS · GRAMS
Later layers. Allocation and the market membrane will have plenty for GEDS to interpret; neither gates the factory.
The lower layers react to events; GEDS is the layer designed to propose. It never
reads a raw terminal, it never messages an agent directly, and it never gets the last word:
it prepares the question, and something stronger answers it.