36 brain mechanisms. 7 development phases. One working agent that remembers, feels, and thinks.
Your AI Forgets You Mid-Conversation. Here’s the Architecture That Doesn’t. — Medium · LinkedIn
Most AI agents are stateless — each conversation starts from scratch. The Neuro-Cognitive Agent (NCA) models how the human brain actually works: parallel subconscious processing, emotionally-weighted memory, and a conscious mind that reasons over all of it.
Every message is parsed for intent, emotional intensity, and entities — before any reasoning begins. Like the thalamus filtering sensory input, NCA’s perception layer decides what matters.
Six specialized “subminds” run in parallel on Haiku 4.5 — metacognition, deductive reasoning, emotional coherence, graph context, anticipatory retrieval, and neocortical consolidation. A basal ganglia gate decides which ones fire.
Memories aren’t just stored — they decay over time, get boosted when recalled, and are retrieved through six different brain wave modes depending on context. A knowledge graph links entities across conversations.
Over time, individual memories consolidate into narrative arcs, which merge into a person schema — a persistent mental model of who you are, what you care about, and how you think. Just like a close friend would.
Every turn follows an 8-step cognitive loop — modeled after the flow of information through the human brain. Here’s what happens when you say something.
Haiku parses the message for intent, emotional intensity (NRC lexicon), entities, and urgency. Decides the retrieval strategy.
Collect subconscious cues from the previous cycle — metacognition confidence, emotional reads, deductive insights. These inform the gate.
The basal ganglia gate evaluates perception signals and selects which subminds fire. A greeting activates only metacognition. A complex synthesis question activates all six.
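A minimal sketch of how such a gate could work, not NCA's actual code. The entity, question, and complexity weights and the 0.15 arousal relief are taken from the mechanism table's examples; the emotion weight, the base threshold of 0.6, and every name here are assumptions. The real gate evaluates subminds individually, while this sketch uses a single shared threshold for brevity.

```python
from dataclasses import dataclass

# Activation contributed by each perception signal. The first three values
# appear in the mechanism table; the emotion weight is an assumption.
SIGNAL_WEIGHTS = {"entity": 0.30, "question": 0.25, "complexity": 0.20, "emotion": 0.20}

BASE_THRESHOLD = 0.6    # assumed; the source only says "threshold"
AROUSAL_CUTOFF = 0.5    # arousal above this lowers the threshold
AROUSAL_RELIEF = 0.15   # threshold reduction under high arousal
GATED_SUBMINDS = ["deductive", "emotional", "graph", "anticipatory", "consolidation"]


@dataclass
class Perception:
    signals: set[str]     # signals detected by the perception layer this turn
    arousal: float = 0.0  # 0..1, driven by emotion or novelty


def gate(p: Perception) -> list[str]:
    """Return the subminds that fire this turn."""
    fired = ["metacognition"]  # always fires: it bypasses the gate
    activation = sum(SIGNAL_WEIGHTS[s] for s in p.signals)
    threshold = BASE_THRESHOLD
    if p.arousal > AROUSAL_CUTOFF:
        threshold -= AROUSAL_RELIEF
    if activation > threshold:
        fired.extend(GATED_SUBMINDS)
    return fired
```

With no signals (a greeting), only metacognition fires. With entity, question, and complexity signals together, activation reaches about 0.75 and all six subminds fire.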
Brain wave mode selected by time gap and intent. Vector search finds relevant memories, then CA3 spreading activation pulls in connected memories through entity links. NTAS scores decay over time.
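The decay and recall boost can be sketched as follows. The source does not give the NTAS formula, so this assumes simple exponential decay; the half-life is a hypothetical value picked so that a salience of 0.6 falls to roughly 0.41 after 90 days, matching the example in the mechanism table. The 1.15× boost is from the table, and capping at 1.0 is an assumption.

```python
HALF_LIFE_DAYS = 164.0  # assumed: makes 0.6 decay to about 0.41 in 90 days
RECALL_BOOST = 1.15     # reconsolidation boost from the mechanism table


def decayed_salience(salience: float, age_days: float) -> float:
    """Exponentially decay a memory's salience with its age in days."""
    return salience * 0.5 ** (age_days / HALF_LIFE_DAYS)


def reconsolidate(salience: float) -> float:
    """Boost salience when a memory is recalled, capped at 1.0 (cap assumed)."""
    return min(1.0, salience * RECALL_BOOST)
```

Under these assumptions, `decayed_salience(0.6, 90)` comes out near 0.41, and a recalled memory at 0.6 is boosted to about 0.69.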
Memories, subconscious cues, entity facts, person schema, and pacing guidance are assembled into a working memory context block with proportional token budgets. The density dial adjusts based on schema presence.
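One way to picture the proportional budgets, as a sketch rather than the project's implementation. The 60/25/15 split and the rule that the lowest-scoring memories are pruned first come from the mechanism table; the `Memory` type, the function names, and the idea that token counts are supplied by the caller are assumptions.

```python
from dataclasses import dataclass

# Share of the context budget per section (values from the mechanism table).
BUDGET_SHARES = {"memories": 0.60, "cues": 0.25, "schema": 0.15}


@dataclass
class Memory:
    text: str
    score: float  # retrieval score; higher is more relevant
    tokens: int   # token count, assumed to be computed elsewhere


def allocate(total_tokens: int) -> dict[str, int]:
    """Split the total token budget proportionally across sections."""
    return {name: round(total_tokens * share) for name, share in BUDGET_SHARES.items()}


def prune_memories(memories: list[Memory], budget: int) -> list[Memory]:
    """Keep the highest-scoring memories until the next one would overflow."""
    kept, used = [], 0
    for memory in sorted(memories, key=lambda m: m.score, reverse=True):
        if used + memory.tokens > budget:
            break  # everything after this point scores lower and is dropped
        kept.append(memory)
        used += memory.tokens
    return kept
```

For a 150k-token budget, `allocate(150_000)` gives 90,000 tokens to memories, 37,500 to cues, and 22,500 to the schema. Cues are never passed to `prune_memories`, which mirrors the table's note that cues are protected.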
A forward model (Haiku) predicts whether the assembled context is good enough. If confidence is below threshold, it triggers re-retrieval or memory trimming before Opus fires.
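The control flow of this check might look like the sketch below. The 0.6 threshold is from the mechanism table; the function names, the `broaden` flag, and the single-retry limit are assumptions, and the model call is passed in as a plain callable so the example stays runnable without an API. Only the re-retrieval branch is shown, not memory trimming.

```python
from typing import Callable

CONFIDENCE_THRESHOLD = 0.6  # from the mechanism table


def prepare_context(
    query: str,
    retrieve: Callable[[str, bool], list[str]],          # (query, broaden) -> memories
    predict_confidence: Callable[[list[str]], float],    # forward model stand-in
    max_retries: int = 1,                                # assumed retry limit
) -> list[str]:
    """Retrieve memories, re-retrieving with a broader query if confidence is low."""
    memories = retrieve(query, False)
    for _ in range(max_retries):
        if predict_confidence(memories) >= CONFIDENCE_THRESHOLD:
            break  # context looks good enough for the reasoning model
        memories = retrieve(query, True)
    return memories
```

Bounding the retries keeps a persistently low-confidence turn from looping: after the last attempt the context is passed on as it is.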
Opus 4.6 receives the full context block and produces structured cognition: thoughts, reasoning, plans, self-criticism, emotional read, and the response shown to the user.
The exchange is encoded as a new memory with temporal, emotional, and entity bindings. Climbing fiber signals calibrate the gate for next time. Prediction errors accumulate toward schema updates.
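The accumulation of prediction errors toward a schema update can be sketched with an exponential moving average (EMA). The rule that three or more consecutive over-threshold turns trigger a rebuild is from the mechanism table; the smoothing factor, the threshold value, the class name, and how the per-turn error is computed are all assumptions.

```python
class SchemaStalenessTracker:
    """Track an EMA of schema prediction error and flag when a rebuild is due."""

    def __init__(self, alpha: float = 0.5, threshold: float = 0.5, patience: int = 3):
        self.alpha = alpha          # EMA smoothing factor (assumed)
        self.threshold = threshold  # staleness threshold (assumed)
        self.patience = patience    # consecutive stale turns required (from the table)
        self.ema = 0.0
        self.streak = 0

    def observe(self, error: float) -> bool:
        """Record one turn's error in [0, 1]; return True if the schema should be rebuilt."""
        self.ema = self.alpha * error + (1 - self.alpha) * self.ema
        self.streak = self.streak + 1 if self.ema > self.threshold else 0
        if self.streak >= self.patience:
            self.streak = 0  # start counting again after triggering a rebuild
            return True
        return False
```

Because the EMA smooths the signal, a single surprising turn does not mark the schema as stale; the error has to stay high for several turns in a row.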
Every numbered marker maps a real brain mechanism to its NCA implementation.
| # | Brain Region / Mechanism | How It Affects Humans | Neuroscience Function | NCA Layer | When It Fires | Example | Phase |
|---|---|---|---|---|---|---|---|
| Sensory Gating & Perception | |||||||
| 1 | Thalamus | Filters sensory overload so you focus on what matters | Sensory gating & filtering | Perception | Every turn — first step of cognitive loop | “Tell me about Portugal” → intent: question, entities: [Portugal], urgency: 0.7 | 1 |
| 2 | Thalamic reticular nucleus | Ignore background noise while reading | Attentional gating of working memory | Conscious | Context block exceeds token budget | 12 memories + 5 cues exceed budget → lowest-scoring memories pruned; cues protected | 2 |
| Emotional Processing | |||||||
| 3 | Amygdala | You flinch before you think — gut-level alarm | Emotional salience detection | Perception | Every turn — NRC lexicon scan | “I’m terrified about the diagnosis” → intensity 0.82 (fear + anticipation dominant) | 3 |
| 4 | Amygdala → Spotlight | In crisis you recall vivid emotions, forget peripherals | Emotional hijack of retrieval | Memory | emotional_intensity > 0.7 | Intense grief → Spotlight mode: 5 emotionally-similar memories, no spreading activation | 3 |
| 5 | Amygdala → Anticipatory suppression | When flooded emotionally you stop planning ahead | Emotional focus inhibits prediction | Subconscious | retrieval_hint = “spotlight” | Emotional crisis → anticipatory submind gets −0.5 suppression, stops pre-fetching | 3 |
| 6 | NRC Word-Emotion Lexicon | “Home” feels different than “domicile” | Plutchik 8 emotions + 2 sentiments | Subconscious | Emotional coherence submind fires (intensity > 0.3) | “I’m so proud of Lily” → joy 0.7, trust 0.5 → two-channel blend with Haiku analysis | 3 |
| Memory Encoding & Retrieval | |||||||
| 7 | Hippocampus (DG/CA3) | A smell triggers a childhood memory | Pattern completion retrieval | Memory | Every turn — after perception | “What was that restaurant?” → vector search finds Cantinho do Avillez from 80 memories | 1 |
| 8 | CA3 recurrent collaterals | Friend → their partner → the restaurant you all visited | Spreading activation (2-hop) | Memory | Entities found in initial retrieval (capped at 20) | “Elena” in memory → hop 1: “Portugal trip” → hop 2: “Maria’s wedding” | 2 |
| 9 | Hippocampal unified encoding | You remember what, when, where, and how it felt | Bind temporal + contextual + experiential + associative | Memory | After every agent response | Career anxiety talk → stored with timestamp, entities [Marcus, Dev], valence −0.3 | 1 |
| 10 | Hippocampal replay | Sleep replays the day, strengthening what matters | Theme extraction via replay to neocortex | Subconscious | Consolidation submind fires (depth ≥ 3, info load accumulates) | 5 turns about career → “Theme: tension between IC identity and leadership aspiration” | 3 |
| 11 | NTAS temporal decay | Forget last Tuesday’s lunch, remember your wedding | Salience decay + reconsolidation boost | Memory | During retrieval scoring | 3-month-old memory (salience 0.6) decays to 0.41; re-accessed memory gets 1.15× boost | 1 |
| 12 | Cortical oscillations | Alert focus retrieves differently than drowsy free-association | Brain wave state-dependent retrieval | Memory | Brain wave mode selected by time gap + intent | Quick follow-up → Gamma (7 memories); “trace the arc” → Delta (12 memories, wide spread) | 1–3 |
| 13 | Entorhinal cortex | You know a phone number without reliving the day you learned it | Entity property binding | Memory | Perceived entities have graph nodes — fires during retrieval | “How’s Elena?” → graph returns: lives in Lisbon, works at CERN, married to Miguel | 2 |
| 14 | Entorhinal → Context | Hearing “Elena” instantly surfaces her key attributes | Entity facts surfacing | Conscious | Entity facts exist in graph — injected into context block | Elena mentioned → context block gets “Elena: physicist, Lisbon, partner: Miguel” as working memory | 2 |
| Cognitive Control & Gating | |||||||
| 15 | Basal ganglia (GPi/SNr) | Brain doesn’t analyze grammar when someone says “hello” | Tonic inhibition of subminds | Subconscious | Every turn — gate evaluates all subminds before firing | “Hey Koda” → only metacognition fires; deductive, graph, anticipatory all inhibited | 4 |
| 16 | Striatal MSNs | Several clues accumulate into a sudden realization | Multi-input integration → graded disinhibition | Subconscious | Multiple signals combine — entities + complexity + emotion accumulate activation | “Compare Elena’s career to mine” → entity (0.3) + question (0.25) + complexity (0.2) = 0.75 > threshold | 4 |
| 17 | VTA → D1 dopamine | Curiosity and surprise make you more mentally flexible | Arousal lowers firing threshold | Subconscious | Emotional intensity > 0.5 or novel entities detected | Surprise question about forgotten topic → arousal 0.7 lowers thresholds by 0.15, more subminds activate | 4 |
| 18 | VTA → mPFC dopamine | “Connect the dots” activates narrative-building circuits | Circuit-specific consolidation facilitation | Subconscious | Synthesis request or high-complexity with person schema present | “How has my thinking on career changed?” → consolidation submind gets +0.3 boost, fires earlier | 5 |
| Consolidation & Schema Formation | |||||||
| 19 | Systems consolidation | Over weeks, isolated experiences become “my year abroad” | Hippocampal → neocortical cascade | Memory | Session ends or narrative arc threshold reached | 8-turn career chat → session summary → arc: “Growing tension between IC work and leadership” → schema update | 5 |
| 20 | mPFC schema formation | You carry a mental model of close friends | Person mental model (McClelland 1995) | Memory | Enough narrative arcs accumulate (≥3) or bootstrap from rich initial history | After 5 sessions → person schema: “values autonomy, anxious about leadership, close to Elena and Lily” | 5 |
| 21 | mPFC → Context block | Your mental model of someone shapes every interaction | Schema injection into working memory | Conscious | Person schema exists — injected into every context block | Schema “values deep work, anxious about promotion” appears in Opus context → shapes tone and advice | 5 |
| 22 | vmPFC schema-driven cognition | You reason from your model of a friend, not by replaying every talk | Reason from mental model, not from scratch | Conscious | Schema present + many memories retrieved → density dial shifts to schema-lean | Schema covers career context → density: “lean on schema, prioritize recent episodes” → fewer memories needed | 5 |
| Executive & Higher-Order Cognition | |||||||
| 23 | Prefrontal cortex (dlPFC) | “Wait, that doesn’t follow” — you catch logical gaps | Logical structure analysis | Subconscious | Gate activation > threshold for deductive submind (question or high complexity) | “Why did the project fail?” → deductive submind cue: “causal chain: timeline pressure + scope creep + no escalation” | 1 |
| 24 | Prefrontal metacognition | You know when you’re confused — thinking about thinking | Reasoning quality monitoring | Subconscious | Always — bypasses basal ganglia gate entirely | Low-confidence question → metacognition cue: “complexity: high, confidence: 0.4, suggest clarification” | 1 |
| 25 | Prefrontal cortex (executive) | The inner voice that deliberates and forms a considered response | Deliberate conscious reasoning | Conscious | Every turn — after all subconscious cues and memories assembled | Opus receives context block with 8 memories + 3 cues + schema → produces structured cognition with speech | 1 |
| 26 | Working memory capacity | Juggle ~4 things; more than that and you drop details | Cowan’s ~4-item limit | Conscious | Context block assembly — proportional token budgets enforced | 150k token budget: memories get 60%, cues 25%, schema 15% → overflow items pruned by relevance | 3 |
| 27 | Predictive coding | Walking into a kitchen you pre-activate “stove, fridge, sink” | Proactive pre-fetch of associations | Subconscious | Conversation depth ≥ 2, not in Spotlight mode | 3rd turn about Portugal trip → anticipatory cue pre-fetches: “likely topics: food, Elena, Lisbon weather” | 2 |
| 28 | Medial temporal lobe network | You navigate who knows whom, who’s connected to what | Entity relationship traversal | Subconscious | Entities detected in perception (entity count > 0) | “Tell me about Elena and Miguel” → graph traversal: Elena—PARTNER→Miguel, Elena—WORKS_AT→CERN | 1 |
| 29 | Angular gyrus semantic priming | “How’s the restaurant search?” — you know they mean Belcanto | Implicit entity detection | Memory | No explicit entities in input, but recent context contains relevant ones | “Any updates?” after discussing Lisbon → resolves implicit entities: [Lisbon, Belcanto, Elena] | 5.2 |
| Homeostatic Regulation | |||||||
| 30 | Adenosine sleep pressure | The longer you’re awake and learning, the more you need to sleep | Information-load consolidation trigger | Subconscious | Cumulative info load exceeds threshold during conversation | 10 dense turns → info_load 8.5 → consolidation submind gets +0.4 boost, triggers mid-session summary | 6a |
| Cerebellar Prediction & Calibration | |||||||
| 31 | Cerebellar forward model | You know the cup is heavier than expected before you think about it | Haiku evaluates context quality before Opus fires; re-retrieves or trims memories if confidence < 0.6 | Subconscious | Before Opus fires — Haiku evaluates assembled context quality | Context has 3 memories but low relevance → confidence 0.4 < 0.6 threshold → re-retrieves with broader query | 7.0 |
| 32 | Climbing fiber error learning | Missing three free throws in a row changes how you shoot the fourth | Error-driven tonic calibration — gate learns which subminds matter per user | Subconscious | After each turn — compares predicted vs actual response quality | Emotional submind predicted “high need” but user wanted facts → error signal adjusts tonic level −0.05 | 7.1 |
| 33 | Prediction error accumulation | “I keep misjudging her” triggers revising your mental model | EMA error tracking — 3+ consecutive mismatches trigger schema reconsolidation | Memory | EMA error > staleness threshold for 3+ consecutive turns | Schema says “anxious about leadership” but user is now excited → 3 mismatches → triggers schema rebuild | 7.2 |
| 34 | Cerebellar temporal sequencing | You know when to pause, when to be brief, when to elaborate | Response pacing & fluency prediction | Conscious | Every turn — analyzes message length ratio, emotion trajectory, topic continuity | User sends 5-word follow-up after long message → brief-warm register, ~50 words target | 7.3 |
| Planned | |||||||
| 35 | dlPFC relational processing | Comparisons and conditionals demand more cognitive effort | Deductive relational complexity modulation | Subconscious | Planned — compare/contrast or multi-conditional detected | “How does X compare to Y given Z?” → deductive submind gets extended context window | 6 |
| 36 | Dentate gyrus pattern separation | You distinguish similar but different memories (today’s parking spot vs yesterday’s) | Scaling benchmark validation | Memory | Planned — near-duplicate memories detected at retrieval time | Two lunch meetings at same café → separator distinguishes Feb vs March event by context | 6 |
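The retrieval modes named in the table (rows 4 and 12) can be sketched as a small selection function. The memory counts and the 0.7 emotional-intensity trigger are from the table; the intent label `"synthesis"`, the assumption that Gamma uses spreading activation, and the fallback to Gamma are hypothetical. The other three modes and the time-gap rule are not described in enough detail to include.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalMode:
    name: str
    max_memories: int
    spreading_activation: bool


SPOTLIGHT = RetrievalMode("spotlight", 5, False)  # emotional focus, no spreading
GAMMA = RetrievalMode("gamma", 7, True)           # quick follow-ups (spreading assumed)
DELTA = RetrievalMode("delta", 12, True)          # wide spread for "trace the arc"


def select_mode(emotional_intensity: float, intent: str) -> RetrievalMode:
    """Pick a retrieval mode; emotional hijack takes priority over intent."""
    if emotional_intensity > 0.7:
        return SPOTLIGHT
    if intent == "synthesis":  # hypothetical intent label
        return DELTA
    return GAMMA  # fallback for this sketch only
```

Checking emotional intensity first reflects the table's description of the amygdala "hijacking" retrieval regardless of what was asked.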
NCA is tested against a raw Opus 4.6 baseline in a 12-question blind A/B evaluation. An independent Opus judge scores both responses across six dimensions, without knowing which is NCA. Results below are averaged across 4 benchmark runs (48 total questions).
Scale: 1–10. Delta shows NCA advantage.
Specific fact recall from months-old conversations — restaurant names, employee counts, book titles. This is the architecture’s core strength: NTAS retrieval + entity binding finds what context-stuffing can’t.
Tracking how opinions and feelings changed over time. Undefeated. The consolidation pipeline (session summaries → narrative arcs → schema) gives NCA a structural advantage no baseline can match.
Connecting threads across conversations — infrastructure plans, blog arcs, trip logistics. CA3 spreading activation and graph traversal surface connections the baseline misses.
NCA’s weakest areas. On purely emotional questions, the architecture sometimes over-retrieves or over-structures. On large-scale personal growth questions, the baseline’s simpler approach can surface more detail. Active area of improvement.
NCA’s brain-inspired architecture creates capabilities that a stateless LLM simply cannot match. Each use case below shows which mechanisms power it, and a side-by-side scenario comparing NCA to a raw Opus baseline with the same conversation history pasted into context.
A companion that genuinely remembers your life — names, dates, evolving preferences, and emotional context — across months of conversation. Not a chatbot that starts fresh every session.
Tracks emotional patterns over time, recognizes growth, and responds with calibrated warmth. Knows when to listen, when to challenge, and when to celebrate progress — because it remembers the full arc.
Tracks projects, people, decisions, and open threads across weeks. Synthesizes status from scattered conversations. Connects dots that a human assistant might miss because the context was spread across too many touchpoints.
Remembers every paper discussed, every hypothesis explored, and every dead end hit. Connects ideas across sessions and fields — surfacing the citation from three weeks ago that’s suddenly relevant to today’s experiment.
Maintains deep context on each account — stakeholders, pain points, usage patterns, past issues, and relationship dynamics. Turns every interaction into a continuation, not a cold start.
Each phase added new brain-inspired mechanisms and was validated by benchmark testing. The score for each phase gives NCA's wins and losses against the baseline in blind evaluation.
Perception, hippocampal encoding & retrieval, NTAS temporal decay, 6 brain wave modes, conscious reasoning (Opus), subconscious workers (Haiku). The basic cognitive loop.
Entorhinal entity binding, CA3 spreading activation (2-hop), token budgets, anticipatory submind. Memory retrieval became associative instead of just vector-search.
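Two-hop spreading activation over entity links can be sketched as a bounded breadth-first search. The two-hop limit and the cap of 20 are from the mechanism table; the cap is read here as a limit on seed entities, which is an assumption, as are the adjacency-dict graph format and the function name.

```python
from collections import deque


def spread(
    graph: dict[str, set[str]],
    seeds: list[str],
    max_hops: int = 2,
    max_seeds: int = 20,
) -> dict[str, int]:
    """Return nodes reachable within max_hops of the seeds, mapped to hop distance."""
    # Deduplicate seeds while keeping their order, then apply the cap.
    hops = {seed: 0 for seed in list(dict.fromkeys(seeds))[:max_seeds]}
    queue = deque(hops)
    while queue:
        node = queue.popleft()
        if hops[node] >= max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return {node: hop for node, hop in hops.items() if hop > 0}
```

Using the table's example, a graph linking "Elena" to "Portugal trip" and "Portugal trip" to "Maria's wedding" yields hop 1 and hop 2 for those two nodes, and anything a third hop away is left out.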
Amygdala emotional gating, Spotlight retrieval mode, NRC lexicon, neocortical consolidation submind, density dial. The first major benchmark jump.
Basal ganglia gate with graded inhibition, striatal MSN integration, VTA dopaminergic modulation. Not every question needs every submind.
mPFC person schema, VTA→mPFC gate boost, vmPFC density, anti-confabulation guard, angular gyrus implicit entity detection. The agent builds a mental model of you.
Information-load tracking triggers consolidation when conversation density accumulates. The agent knows when it’s time to summarize and consolidate.
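A rough sketch of the accumulator behind this trigger. The +0.4 consolidation boost is from the mechanism table; the threshold of 8.0, the class name, and the per-turn load values are assumptions, since the source does not say how a turn's load is measured.

```python
class InfoLoadTracker:
    """Accumulate per-turn information load and boost consolidation when it is high."""

    def __init__(self, threshold: float = 8.0, boost: float = 0.4):
        self.threshold = threshold  # assumed; the source only says "threshold"
        self.boost = boost          # gate boost for the consolidation submind
        self.load = 0.0

    def add_turn(self, turn_load: float) -> float:
        """Add one turn's load; return the gate boost to apply this turn."""
        self.load += turn_load
        return self.boost if self.load > self.threshold else 0.0

    def reset(self) -> None:
        """Clear the accumulated load, for example after a mid-session summary."""
        self.load = 0.0
```

With a hypothetical load of 0.85 per dense turn, the boost first applies on the tenth turn, when the cumulative load reaches about 8.5, in line with the table's example.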
Forward model context prediction, climbing fiber error learning, schema staleness detection, response pacing. The agent predicts, evaluates, and self-corrects. NCA leads in 5 of 6 dimensions.
dlPFC relational complexity, dentate gyrus pattern separation, memory scaling validation at 5000+ memories.
Four layers, modeled after the human brain’s information flow. ~4,500 lines of Python across 30+ files.
Haiku 4.5 extracts intent, emotional intensity, entities, and urgency from every message. Decides retrieval strategy before anything else fires.
Metacognition, deductive reasoning, emotional coherence, graph context, anticipatory retrieval, neocortical consolidation. Each runs on Haiku 4.5 in a separate process. The gate selectively activates based on input signals.
NTAS temporal scoring, 6 brain wave retrieval modes, CA3 spreading activation, entity property binding, multi-scale consolidation (session → arc → schema).
Receives assembled context block with memories, cues, schema, and pacing guidance. Produces structured cognition: thoughts, reasoning, self-criticism, and speech.