# Layer 5: Interaction Interleaving for Maximum Information Gain
"universal topos construction for social cognition and democratization of mathematical approach to problem-solving to all" — Plurigrid: the story thus far
**Active Inference as Information Maximization**: The entropy-sequencer implements the core Active Inference principle from *Active Inference in String Diagrams*: agents select actions that maximize expected information gain (epistemic value) while minimizing surprise (pragmatic value).
**String Diagram Pattern**:

```
Perception ─┬→ Entropy Estimation ──┐
            │                       ↓
Action ←────┴─ Max Information ←─ Sequence Optimizer
```
This bidirectional loop embodies bmorphism's principle that "all is bidirectional" — perception informs action, action generates new percepts.
**Temperature-Aware Sequencing**: When T → 0 (low temperature), favor exploitation of known high-information patterns. When T → ∞ (high temperature), explore uniformly. This mirrors the Langevin dynamics exploration-exploitation trade-off.
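A minimal sketch of this trade-off as softmax sampling over per-action information gains (the function name, the NumPy dependency, and the `rng` parameter are illustrative assumptions, not the skill's API):

```python
import numpy as np

def temperature_weighted_choice(gains: np.ndarray, T: float,
                                rng: np.random.Generator) -> int:
    """Softmax action selection over information gains.
    T -> 0 recovers greedy argmax (exploitation);
    T -> inf flattens to uniform sampling (exploration)."""
    logits = gains / max(T, 1e-12)   # guard against division by zero
    logits -= logits.max()           # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(gains), p=probs))
```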
- **Version**: 1.1.0
- **Trit**: 0 (Ergodic - coordinates information flow)
- **Bundle**: core
The core information-theoretic foundation from [Soatto & Chiuso]:
```
argmax_u I(ξ; I_{t+1}) = argmax_u [ H(I_{t+1} | I^t, u) − H(I_{t+1} | ξ, u) ]
                                    └─────────┬─────────┘  └────────┬────────┘
                                     what we learn from      residual noise
                                         action u          (white, isotropic)
```
**Key insight**: Given the scene ξ, nuisances become invertible. The residual uncertainty is "white, independent, isotropic" — justifying GF(3) as the minimal observable after all structured nuisances are factored out.
```python
def nuisance_invariant_representation(observation: Observation) -> Sufficient:
    """
    φ^(I) = sufficient statistic after factoring invertible nuisances

    In DGA detection: ELMo embedding = φ^(domain_string)
    In CRDT:          operation trit = φ^(edit_sequence)
    In games:         strategy signature = φ^(action_history)
    """
    # Remove invertible nuisances (viewpoint, lighting, surface form)
    canonical = canonicalize(observation)
    # What remains: minimal sufficient statistic
    return project_to_sufficient(canonical)
```
Following Nørretranders' User Illusion inversion:
```
Passive: World → Observation → Agent     (user illusion)
Active:  Agent ⊛ World → Δ(World)        (line damage = the observable)
```
The observation IS the interaction trace, not pixels rendered from latent state.
Entropy-sequencer as strategy optimizer in compositional game theory:
```
        ┌───────────────────────────┐
I^t ───→│                           │───→ u* (optimal action)
        │  Entropy-Sequencer Game   │
R  ←────│                           │←─── H(I_{t+1} | I^t, u)
        └───────────────────────────┘
```
```haskell
entropyGame :: OpenGame History () Action InformationGain
entropyGame = Game
  { -- Forward: select action maximizing expected entropy
    play        = \history -> argmax_u (\u -> expectedEntropy history u)
    -- Backward: propagate information gain as utility
  , coplay      = \history action -> conditionalEntropy (observe action) history
    -- Equilibrium: greedy max-entropy is Nash when agents share scene ξ
  , equilibrium = \history -> maxEntropyAction history == nashEquilibrium history
  }

-- Sequential composition: each step conditions on accumulated context.
-- Nash equilibrium of the composed game = greedy max-entropy sequence
-- (proven via backward induction on information gain).
fullSequence :: OpenGame () () [Action] TotalGain
fullSequence = foldr (>>>) idGame (replicate n entropyGame)
  -- (>>>) denotes sequential composition of open games
```
```python
class MultiAgentEntropyGame:
    """
    Coalition formation through information sharing.

    GF(3) roles:
      +1 (Generator):   proposes high-entropy actions
       0 (Coordinator): evaluates joint information gain
      -1 (Validator):   prunes redundant/low-gain actions
    """

    def nash_equilibrium(self, agents: List[Agent], scene: Scene) -> Strategy:
        """
        At equilibrium: no agent can unilaterally increase
        joint information gain by changing their action.

        H(I_{t+1} | I^t, u*) ≥ H(I_{t+1} | I^t, u) for all deviations u
        """
        strategies = {}
        for agent in agents:
            trit = agent.gf3_role
            if trit == +1:    # Generator
                strategies[agent] = self.max_entropy_proposal(agent)
            elif trit == 0:   # Coordinator
                strategies[agent] = self.evaluate_joint_gain(agents)
            else:             # Validator
                strategies[agent] = self.prune_redundant(strategies)
        return strategies

    def shapley_information_value(self, agent: Agent, coalition: Set) -> float:
        """
        Agent's marginal contribution to the coalition's information gain.

        φ_i = Σ_{S ⊆ N\\{i}} (|S|! (n−|S|−1)! / n!) [v(S ∪ {i}) − v(S)]

        where v(S) = H(I | actions of S) − H(I | ξ)
        """
        return self._shapley_sum(agent, coalition, self._info_gain_value)
```
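For concreteness, a self-contained sketch of the Shapley sum by direct enumeration (the `shapley_value` helper and its `v` signature are illustrative; `_shapley_sum` above is assumed to compute the same quantity, presumably more efficiently):

```python
from itertools import combinations
from math import factorial

def shapley_value(i, players, v) -> float:
    """Exact Shapley value of player i via O(2^n) enumeration.
    `v` maps a frozenset of players to that coalition's information gain."""
    others = [p for p in players if p != i]
    n = len(players)
    total = 0.0
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            S = frozenset(subset)
            weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
            total += weight * (v(S | {i}) - v(S))
    return total
```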
## Overview
The entropy-sequencer arranges interaction sequences to maximize learning efficiency. Instead of chronological replay, it reorders interactions so that each step yields maximal new information, enabling 3x faster pattern learning.
**NEW (Langevin Integration)**: Temperature-aware sequencing that respects Langevin dynamics and Fokker-Planck convergence analysis. Temperature from Langevin analysis directly controls the noise scale in sequence optimization.
## Capabilities
### 1. arrange-by-max-entropy
Reorder interactions to maximize information content.
```python
from entropy_sequencer import MaxEntropyArranger
arranger = MaxEntropyArranger(seed=0xf061ebbc2ca74d78)
optimal_sequence = arranger.arrange(
interactions=all_interactions,
strategy="greedy_information_gain",
lookahead=5
)
# Returns a sequence where each step maximizes new information
```

### 2. arrange-temperature-aware

Reorder interactions while respecting Langevin temperature dynamics.

```python
# Temperature from the Langevin analysis sets the noise scale
optimal_sequence = arranger.arrange_temperature_aware(
interactions=all_interactions,
temperature=0.01, # From Langevin analysis
maximize_gradient_alignment=True, # Color-gradient correlation
fokker_planck_mixing_time=500 # Estimated convergence time
)
# Temperature directly controls exploration vs. exploitation:
#   - Low T    (0.001): sharp basin exploration
#   - Medium T (0.01):  balanced exploration
#   - High T   (0.1):   broad exploration
```
### 3. information-gain

Compute the information gain of a sequence ordering.

```python
def information_gain(sequence: List[Interaction]) -> float:
    """
    I(S) = Σ_i H(X_i | X_1, ..., X_{i-1})

    where H is conditional entropy: how surprising each
    interaction is given what came before.
    """
    total_gain = 0.0
    context = []
    for interaction in sequence:
        surprise = conditional_entropy(interaction, context)
        total_gain += surprise
        context.append(interaction)
    return total_gain
```
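`conditional_entropy` is left abstract above; here is a toy stand-in under stated assumptions (a discrete `.topic` feature per interaction and an empirical context distribution — a real estimator would use a learned predictive model):

```python
import math
from collections import Counter

def conditional_entropy(interaction, context, featurize=lambda x: x.topic) -> float:
    """Toy surprise estimate: -log2 of the empirical probability of this
    interaction's feature given the features seen so far in `context`.
    The `.topic` attribute is an assumed feature, for illustration only."""
    if not context:
        return 1.0  # maximal surprise against an empty context
    counts = Counter(featurize(c) for c in context)
    p = counts.get(featurize(interaction), 0) / len(context)
    # Unseen features get a finite surprise that grows with context size
    return -math.log2(p) if p > 0 else math.log2(len(context) + 1)
```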
### 4. permutation-search

Search promising permutations efficiently.

```python
# Don't enumerate all n! permutations - use heuristics
search = PermutationSearch(
    strategy="beam",  # beam search
    beam_width=100,
    scoring_fn=information_gain,
    seed=0xf061ebbc2ca74d78
)
best_ordering = search.find_best(
    interactions=interactions,
    max_iterations=1000
)
```
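A minimal sketch of what the beam strategy might do internally (the `beam_search_ordering` helper is hypothetical, not the `PermutationSearch` implementation):

```python
def beam_search_ordering(interactions, score, beam_width=100):
    """Keep the `beam_width` best partial orderings; extend each by one
    unused interaction per step; return the best complete ordering.
    `score` maps a (partial) sequence to its cumulative information gain."""
    beams = [([], frozenset(range(len(interactions))))]
    for _ in range(len(interactions)):
        candidates = [
            (seq + [interactions[idx]], remaining - {idx})
            for seq, remaining in beams
            for idx in remaining
        ]
        candidates.sort(key=lambda c: score(c[0]), reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]
```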
### 5. calculate-predictability

Measure how predictable a sequence is (lower = more entropic).

```python
predictability = calculate_predictability(sequence)
# Returns:
#   - autocorrelation:       how much each step predicts the next
#   - topic_clustering:      are similar topics grouped? (high = predictable)
#   - temporal_monotonicity: is it chronological? (high = predictable)
#   - overall_score:         combined predictability in [0, 1]
```
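A sketch of two of these metrics (the `featurize` mapping, the 50/50 weighting, and the omission of `topic_clustering` are simplifying assumptions):

```python
import numpy as np

def calculate_predictability(sequence, featurize=lambda x: float(hash(x) % 97)) -> dict:
    """Illustrative predictability metrics: `featurize` maps interactions to
    scalars so that lag-1 autocorrelation is well-defined."""
    xs = np.array([featurize(s) for s in sequence])
    if len(xs) < 3 or xs[:-1].std() == 0 or xs[1:].std() == 0:
        return {"overall_score": 1.0}  # degenerate: trivially predictable
    autocorrelation = float(np.corrcoef(xs[:-1], xs[1:])[0, 1])
    monotonicity = float(np.mean(np.diff(xs) > 0))  # chronological-order proxy
    overall = float(np.clip(0.5 * abs(autocorrelation) + 0.5 * monotonicity, 0, 1))
    return {
        "autocorrelation": autocorrelation,
        "temporal_monotonicity": monotonicity,
        "overall_score": overall,
    }
```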
## Example Orderings

Chronological baseline:

```
Post 1 → Post 2 → Post 3 → Post 4 → Post 5
Predictability: 0.85 (high - chronological)
```

Entropy-optimized:

```
Post 5 → Post 1 → Post 3 → Post 2 → Post 4
Predictability: 0.23 (low - each step surprising)
Information Gain: 3.2x baseline
```

Platform interleaving:

```
GitHub → Bluesky → Web → GitHub → Bluesky
Predictability: 0.45 (medium - forced context switches)
```

Social-graph traversal:

```
User1 mentions → User2 replies → User3 quotes
Predictability: 0.55 (follows social graph)
```
## DuckDB Schema

```sql
-- Store entropy-optimized sequences
CREATE TABLE optimized_sequences (
    sequence_id VARCHAR PRIMARY KEY,
    original_order VARCHAR[],
    optimized_order VARCHAR[],
    information_gain FLOAT,
    predictability_score FLOAT,
    strategy VARCHAR,
    seed BIGINT,
    created_at TIMESTAMP
);

-- Query: get the best sequences for training
SELECT * FROM optimized_sequences
WHERE information_gain > 2.0
ORDER BY information_gain DESC
LIMIT 100;
```
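A usage sketch from Python (the database path and row values are assumptions; note that the canonical seed 0xf061ebbc2ca74d78 exceeds the signed BIGINT range, so a smaller seed is used here — a production schema might prefer UBIGINT or hex text):

```python
import duckdb

con = duckdb.connect("interactions.duckdb")  # path is an assumption
con.execute("""
    INSERT INTO optimized_sequences VALUES
    ('seq-001', ['p1', 'p2', 'p3'], ['p3', 'p1', 'p2'],
     3.2, 0.23, 'beam', 42, current_timestamp)
""")
best = con.execute("""
    SELECT sequence_id, information_gain FROM optimized_sequences
    WHERE information_gain > 2.0
    ORDER BY information_gain DESC LIMIT 100
""").fetchall()
```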
## GF(3) Triad

| Trit | Skill | Role |
|---|---|---|
| -1 | three-match | Reduces/validates sequence constraints |
| 0 | entropy-sequencer | Coordinates optimal ordering |
| +1 | triad-interleave | Generates interleaved streams |

Conservation: (-1) + (0) + (+1) = 0 ✓
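The conservation law is directly checkable; a one-line sketch (the `triad_balanced` helper is illustrative):

```python
def triad_balanced(trits: list[int]) -> bool:
    """A triad is coherent when its trits sum to 0 mod 3."""
    return sum(trits) % 3 == 0

assert triad_balanced([-1, 0, +1])  # three-match, entropy-sequencer, triad-interleave
```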
## Algorithm: Greedy Max-Entropy

```python
def greedy_max_entropy(interactions: List, seed: int) -> List:
    """O(n²) greedy algorithm for entropy maximization."""
    rng = SplitMix64(seed)
    remaining = set(range(len(interactions)))
    sequence = []
    context = []
    while remaining:
        best_idx = None
        best_gain = -float('inf')
        for idx in remaining:
            gain = conditional_entropy(interactions[idx], context)
            if gain > best_gain:
                best_gain = gain
                best_idx = idx
        sequence.append(interactions[best_idx])
        context.append(interactions[best_idx])
        remaining.remove(best_idx)
    return sequence
```
## Configuration

```yaml
# entropy-sequencer.yaml
search:
  strategy: beam        # greedy, beam, genetic, simulated_annealing
  beam_width: 100
  max_iterations: 1000
scoring:
  entropy_weight: 1.0
  diversity_weight: 0.3
  topic_switch_bonus: 0.2
reproducibility:
  seed: 0xf061ebbc2ca74d78
  deterministic: true
```
## Workflow

```bash
# 1. Load interactions
just entropy-load interactions.duckdb

# 2. Optimize sequence
just entropy-optimize --strategy beam --lookahead 5

# 3. Compare to baseline
just entropy-compare --baseline chronological

# 4. Export for training
just entropy-export optimized_sequence.json
```
Connection to context-sensitive embeddings (ELMo) and the Polysemy effect system:

| NLP (ELMo) | Effect Systems (Polysemy) |
|---|---|
| "bow" → embedding(context) | Embed → handler(context) |
| biLM forward/backward | Effect stack (outer → inner) |
| Polyseme disambiguation | Effect interpretation |
**DGA Detection Application** (Koh & Rhodes, arXiv:1811.08705):

```python
# The semantic signature IS the entropy signature
def dga_entropy_signature(domain: str) -> float:
    """
    DGA domains have anomalous H(I_{t+1} | I^t, u) because:
      - Legitimate: words are contextually valid → low conditional entropy
      - DGA: pseudorandom concatenation → high conditional entropy

    ELMo embedding = φ^(domain) = nuisance-invariant representation
    The classifier learns p(DGA | φ^) via semantic entropy.
    """
    words = wordninja.split(domain)
    embeddings = elmo.embed(words)
    return conditional_entropy_chain(embeddings)
```
**Effect Chaining = biLM Context Propagation**:

```haskell
-- Forward:  effect₁ ; effect₂ ; effect₃   → accumulated context
-- Backward: handler₃ ∘ handler₂ ∘ handler₁ ← interpretation stack

-- Order matters (non-commutative):
--   runState (runError m) ≠ runError (runState m)
```
From Higgins et al.'s symmetry-based disentanglement:

```
G = G_h × G_v × G_c      (horizontal × vertical × hue)
        ↓ abelianize
G/[G,G] → Z₃             (GF(3) quotient)
```
The trit is the minimal disentangled factor — the irreducible quantum of "something happened" after factoring out all structured nuisances.
The entropy flow has temporal direction (from bidirectional-lens-logic):
```
Kind = (covariant, contravariant)

H(I_{t+1} | I^t, u) : Covariant (+1)     — forward prediction
H(I_{t+1} | ξ, u)   : Contravariant (-1) — backward from scene
I(ξ; I_{t+1})       : Invariant (⊗)      — linear combination
Unit                : Bivariant (0)      — coordinator
```
Tensor product determines information flow composition:
```idris
Tensor : Ty (covx, conx) -> Ty (covy, cony)
      -> Ty (covx && covy, conx && cony)

-- When forward entropy (+1) tensors with backward entropy (-1):
-- Result is INVARIANT (linear) — must consume exactly once
```
The two NotIntro rules explain why +1 and -1 generators/validators have different operational semantics even when balanced:
```idris
NotIntroCov : {a : Ty (True, con)} -> Term (a :: as) Unit -> Term as (Not a)
NotIntroCon : {a : Ty (cov, True)} -> Term (a :: as) Unit -> Term as (Not a)

-- Both valid for bivariant types, but DIFFERENT RESULTS!
```
## Related Skills

- `triad-interleave` - Generates base interleaved streams
- `agent-o-rama` (Layer 4) - Consumes optimized sequences
- `gay-mcp` - Deterministic seeding
- `three-match` - Constraint validation
- `open-games` - Play/coplay strategy structure
- `polysimy-effect-chains` - Effect interpretation verification
- `cybernetic-immune` - Reafference for self/non-self via entropy
- `bidirectional-lens-logic` - 4-kind lattice foundation

This skill connects to the K-Dense-AI/claude-scientific-skills ecosystem:
- general: 734 citations in bib.duckdb

This skill maps to Cat# = Comod(P) as a bicomodule in the equipment structure:
```
Trit:     0 (ERGODIC)
Home:     Prof
Poly Op:  ◁
Kan Role: Adj
Color:    #26D826
```
The skill participates in triads satisfying:
```
(-1) + (0) + (+1) ≡ 0 (mod 3)
```
This ensures compositional coherence in the Cat# equipment structure.