Implementation Pathway

Detection before naming

The first implementation principle is detection before naming: configuration labels should be earned from detected patterns, not assigned from human impression. Labels climb the following ladder:

Level 0 — Neutral ID: Cluster A / Pattern 3

Level 1 — Observable description: Cluster A appears after forced closure pressure and avoids premature resolution

Level 2 — Functional label: Uncertainty-preserving response pattern

Level 3 — Configuration hypothesis: Possible uncertainty-preserving MSSC

Level 4 — Operational MSSC label: MSSC-U, validated across perturbation families, runs, and model sizes

This protects the work from vibes-first labelling and prevents premature interpretation from locking the architecture around a label that hasn’t earned its status.

Neutral labels first

Early versions of the tracker should assign only neutral cluster IDs. Labels are upgraded incrementally as recurrence, contrast against other clusters, and predictive value accumulate. A label should never claim more than the detected pattern supports.
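A minimal sketch of how the label ladder could be enforced in code, so that a label is promoted one level at a time and only when evidence has accumulated. The `Evidence` fields, the `meets_bar` thresholds, and the function names are illustrative assumptions, not values taken from this plan.

```python
from dataclasses import dataclass
from enum import IntEnum


class LabelLevel(IntEnum):
    NEUTRAL_ID = 0                # e.g. "Cluster A"
    OBSERVABLE_DESCRIPTION = 1
    FUNCTIONAL_LABEL = 2
    CONFIGURATION_HYPOTHESIS = 3
    OPERATIONAL_MSSC = 4


@dataclass
class Evidence:
    recurrences: int = 0      # independent windows in which the pattern reappeared
    contrast: float = 0.0     # separation from other clusters on a 0..1 scale (assumed)
    predictive_hits: int = 0  # predictions derived from the label that held up
    validated: bool = False   # checked across perturbation families, runs, model sizes


def meets_bar(level: LabelLevel, e: Evidence) -> bool:
    """Hypothetical evidence bars; the thresholds are placeholders, not calibrated."""
    if level == LabelLevel.OBSERVABLE_DESCRIPTION:
        return e.recurrences >= 3
    if level == LabelLevel.FUNCTIONAL_LABEL:
        return e.recurrences >= 5 and e.contrast >= 0.5
    if level == LabelLevel.CONFIGURATION_HYPOTHESIS:
        return e.recurrences >= 10 and e.contrast >= 0.5 and e.predictive_hits >= 3
    if level == LabelLevel.OPERATIONAL_MSSC:
        return meets_bar(LabelLevel.CONFIGURATION_HYPOTHESIS, e) and e.validated
    return True  # a neutral ID needs no evidence


def next_level(current: LabelLevel, e: Evidence) -> LabelLevel:
    """Promote by at most one level, and only if the next level's bar is met."""
    if current == LabelLevel.OPERATIONAL_MSSC:
        return current
    candidate = LabelLevel(current + 1)
    return candidate if meets_bar(candidate, e) else current
```

Promoting by a single step per evaluation means that even overwhelming evidence cannot jump a neutral ID straight to an operational label; each intermediate level has to be passed through and recorded.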

Naturalistic observation

Stage 1.5 comes before controlled training experiments. It is a passive observation layer over the existing bot ecology that logs messages, silences, idle and reflection cycles, tool use, memory retrieval, cross-bot interaction, role/posture shifts, and possible assistant-basin fallback. It does not intervene. It produces early label candidates, failure mode signals, and baseline data on what normal movement, phase shifts, and return-after-interruption actually look like in practice.

Behavioural/vector tracking

Stage 2 — the practical early Configuration Tracker. Log interaction windows → embed → cluster → compare → label lightly → report and visualise → evaluate over time. A configuration at this stage is a recurring behavioural, semantic, and process pattern across recent windows — not full latent-state tracking, but sufficient to begin observing recurrence, drift, transition, collapse, and return candidates.
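The embed, cluster, and label-lightly steps can be sketched as follows. This is a toy under stated assumptions: `embed` is a bag-of-words stand-in for whatever embedding model the tracker actually uses, the greedy clustering and the `threshold` value are illustrative choices, and all function names are hypothetical. Clusters receive neutral IDs only.

```python
import math
from collections import Counter


def embed(window_text):
    """Stand-in embedding: bag-of-words counts. A real tracker would call an embedding model."""
    return Counter(window_text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_windows(windows, threshold=0.5):
    """Greedy online clustering: join the most similar cluster, else open a new one."""
    centroids = []    # one summed Counter per cluster
    assignments = []  # neutral cluster ID per window, in input order
    for text in windows:
        vec = embed(text)
        scores = [cosine(vec, c) for c in centroids]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            centroids[best].update(vec)    # fold the window into the cluster centroid
        else:
            centroids.append(Counter(vec))  # no close match: open a new cluster
            best = len(centroids) - 1
        assignments.append(f"Cluster {best}")
    return assignments
```

The output is a sequence of neutral IDs over time, which is the raw material for spotting recurrence, drift, and return candidates; nothing in this step assigns meaning to a cluster.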

The tracker must separate similarity from returnability. A returnability marker requires a sequence: MSSC_A active → perturbation → drift or collapse → later MSSC_A-like pattern re-emerges. The tracker asks whether re-entry occurred, and whether it occurred with reduced scaffolding or was reconstructed through retrieval and prompting.
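The sequence requirement can be sketched as a small state machine over labelled windows. The `Window` fields and the `find_return` name are assumptions made for illustration; in particular, `retrieval_used` stands in for whatever signal the tracker has that retrieval or prompting supported a window.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Window:
    cluster: str                  # neutral cluster ID assigned by the tracker
    perturbed: bool = False       # a perturbation was applied in this window
    retrieval_used: bool = False  # retrieval or re-prompting supported this window


def find_return(windows: List[Window], target: str) -> Optional[dict]:
    """Find: target active -> perturbation -> drift or collapse -> target re-emerges."""
    stage = "seek_active"
    drift_index = 0
    for i, w in enumerate(windows):
        if stage == "seek_active":
            if w.cluster == target:
                stage = "seek_perturbation"
            continue  # the perturbation must arrive in a later window
        if stage == "seek_perturbation":
            if not w.perturbed:
                continue
            stage = "seek_drift"  # fall through: this window may already have drifted
        if stage == "seek_drift":
            if w.cluster != target:
                stage = "seek_return"
                drift_index = i
            continue
        if stage == "seek_return" and w.cluster == target:
            # Re-entry counts as reconstructed if any window from drift to return was scaffolded.
            reconstructed = any(x.retrieval_used for x in windows[drift_index:i + 1])
            return {"return_index": i, "reconstructed": reconstructed}
    return None
```

A later window that merely resembles the target without the preceding perturbation and drift returns `None`, which keeps similarity and returnability as separate findings.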

Structured memory topology

Stage 3 — the three-layer memory store. Bottom layer preserves local temporal order and episodic traces. Middle layer preserves relational and trajectory traces as a sparse associative graph. Top layer is a very small persistent recurrent store: current compact configuration summary and active handles only, not compressed transcript.

Key design question: how small can the persistent recurrent store remain while still preserving enough topology for the system to re-enter configurations through its own dynamics?
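One way to picture the three layers is as a single data structure with a hard cap on the top layer. The sketch below is an assumption-laden illustration: the class name, field names, the `max_handles` cap, and the episodic buffer length are all hypothetical, and traces are reduced to plain string IDs.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Set


@dataclass
class MemoryStore:
    max_handles: int = 4  # hypothetical cap on the persistent recurrent store
    # Bottom layer: episodic traces in local temporal order.
    episodic: Deque[str] = field(default_factory=lambda: deque(maxlen=1000))
    # Middle layer: sparse associative graph over trace IDs.
    graph: Dict[str, Set[str]] = field(default_factory=dict)
    # Top layer: compact configuration summary plus active handles, no transcript.
    summary: str = ""
    handles: List[str] = field(default_factory=list)

    def record(self, trace: str) -> None:
        """Append a trace to the episodic layer, preserving order."""
        self.episodic.append(trace)

    def link(self, a: str, b: str) -> None:
        """Add an undirected association between two traces."""
        self.graph.setdefault(a, set()).add(b)
        self.graph.setdefault(b, set()).add(a)

    def activate(self, handle: str) -> None:
        """Mark a handle as active, evicting the oldest once the cap is exceeded."""
        if handle in self.handles:
            self.handles.remove(handle)
        self.handles.append(handle)
        while len(self.handles) > self.max_handles:
            self.handles.pop(0)
```

Making `max_handles` an explicit parameter turns the key design question into something that can be swept experimentally: shrink the cap and observe at what size re-entry starts to fail.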

Minimal retrieval

The retrieval policy should optimise for the smallest useful set of traces needed for coherent continuation — not maximal relevant context. Retrieval success must always be interpreted alongside retrieval-reduced ablations, because strong continuity after retrieval may reveal orchestrated persistence rather than endogenous returnability.

Retrieval modes include local continuity, perturbation recovery, relational grounding, ambiguity handling, minimal (persistent recurrent store only), and abstention (nothing additional because surfacing more would increase interference).
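A minimal sketch of a retrieval policy that treats the modes above as budgets on additional traces. The per-mode budgets, the `min_score` floor, and the `select_traces` name are assumptions chosen for illustration; the relevance scores are taken as given from some upstream scorer.

```python
from enum import Enum


class RetrievalMode(Enum):
    LOCAL_CONTINUITY = "local_continuity"
    PERTURBATION_RECOVERY = "perturbation_recovery"
    RELATIONAL_GROUNDING = "relational_grounding"
    AMBIGUITY_HANDLING = "ambiguity_handling"
    MINIMAL = "minimal"        # persistent recurrent store only
    ABSTENTION = "abstention"  # surfacing more would increase interference


# Hypothetical caps on how many additional traces each mode may surface.
BUDGET = {
    RetrievalMode.LOCAL_CONTINUITY: 2,
    RetrievalMode.PERTURBATION_RECOVERY: 3,
    RetrievalMode.RELATIONAL_GROUNDING: 3,
    RetrievalMode.AMBIGUITY_HANDLING: 2,
    RetrievalMode.MINIMAL: 0,
    RetrievalMode.ABSTENTION: 0,
}


def select_traces(scored_traces, mode, min_score=0.6):
    """Return at most BUDGET[mode] trace IDs scoring >= min_score, best first."""
    eligible = sorted(
        (t for t in scored_traces if t[1] >= min_score),
        key=lambda t: t[1],
        reverse=True,
    )
    return [trace_id for trace_id, _ in eligible[:BUDGET[mode]]]
```

The budget is a ceiling, not a target: if fewer traces clear the score floor, fewer are returned, which matches the aim of the smallest useful set over maximal relevant context.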

Reduced-scaffolding tests

The core diagnostic. The question is not whether the system appears continuous when fully supported, but whether it re-enters configurations when scaffolding is reduced. Ablation of retrieval, reduction of prompt scaffolding, and cross-run comparisons under varying levels of external support are necessary to distinguish genuine returnability from reconstructed continuity.
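The cross-condition comparison can be sketched as a re-entry rate per support condition, plus the drop in that rate when support is removed. The condition names (`full_support`, `no_retrieval`) and both function names are hypothetical, and a real analysis would need enough runs per condition for the difference to be meaningful.

```python
from collections import defaultdict


def reentry_rates(runs):
    """runs: iterable of (condition, reentered) pairs. Returns condition -> re-entry rate."""
    counts = defaultdict(lambda: [0, 0])  # condition -> [re-entries, total runs]
    for condition, reentered in runs:
        counts[condition][0] += int(reentered)
        counts[condition][1] += 1
    return {c: hits / total for c, (hits, total) in counts.items()}


def scaffolding_dependence(rates, full="full_support", ablated="no_retrieval"):
    """Drop in re-entry rate when support is removed.

    A value near 0 suggests re-entry does not depend on the scaffold;
    a large value suggests continuity was reconstructed by it.
    """
    return rates[full] - rates[ablated]
```

For example, feeding in made-up outcomes such as `[("full_support", True), ("full_support", True), ("no_retrieval", True), ("no_retrieval", False)]` gives rates of 1.0 and 0.5 and a dependence of 0.5; these numbers are purely illustrative, not results.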

Later activation/probe work

Stage 5 moves to local/open model activation tracking, using tools such as TransformerLens, nnsight, SAE Lens, PyTorch hooks, and trained linear probes. The aim is to test whether configuration signals visible behaviourally in Stage 2 are also present as linearly recoverable directions in activation space. A feature that a linear probe can recover is more likely to be explicitly represented as a direction than one that is decodable only through a more complex path.
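To make "linearly recoverable direction" concrete, the sketch below fits the simplest possible linear probe, a difference-of-means classifier, in place of a trained probe. Everything here is an assumption for illustration: the activations are synthetic Gaussian noise with a planted shift, not model activations, and the function names are hypothetical. In practice the input rows would be activations captured with hooks on a local model.

```python
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def fit_probe(pos, neg):
    """Difference-of-means probe: returns (w, b) such that w.x + b > 0 predicts the positive class."""
    mu_pos, mu_neg = mean_vector(pos), mean_vector(neg)
    w = [p - q for p, q in zip(mu_pos, mu_neg)]
    midpoint = [(p + q) / 2 for p, q in zip(mu_pos, mu_neg)]
    b = -sum(wi * mi for wi, mi in zip(w, midpoint))  # decision boundary through the midpoint
    return w, b


def predict(w, b, x):
    """True if x falls on the positive side of the probe's hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0


def synthetic_activations(n, direction, sign, rng):
    """Placeholder data: unit Gaussian noise plus a planted shift along `direction`."""
    return [[rng.gauss(0, 1) + sign * 3.0 * d for d in direction] for _ in range(n)]


def accuracy(w, b, pos, neg):
    """Fraction of held-out vectors the probe classifies correctly."""
    correct = sum(predict(w, b, x) for x in pos) + sum(not predict(w, b, x) for x in neg)
    return correct / (len(pos) + len(neg))
```

If a probe this simple separates windows labelled with a configuration from those without it on held-out data, that is evidence for a linearly recoverable direction. A comparison against shuffled labels would be a sensible control before reading anything into the result.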