AI | Antoine Weill--Duflos

cortexmd: a long-term memory and code-navigation brain for AI agents

Wed, 03 Jun 2026 00:00:00 +0000

cortexmd is a long-term memory and code-navigation brain for AI agents, exposed over the Model Context Protocol. It started as a private project on my homelab called obsidian-mcp, a server that let Claude read, search, and write notes in my Obsidian vault. I built it for myself, then cleaned it up to share.

It does two things.

The first is memory. Agents forget everything between sessions. cortexmd gives them somewhere to put what they learn: memories auto-categorised into kinds like observation, decision, insight, and plan, with a heat lifecycle where reading a memory warms it and inactivity cools it down. Recall is hybrid, fusing full-text and semantic search, boosted by temperature and links. At the start of a session the agent does a wakeup that surfaces the hottest, most relevant memories, so it picks up where it left off.

The second is code navigation. A Rust indexer walks a repo, parses it with tree-sitter, and builds a SQLite symbol database recording each symbol’s name, kind, signature, docstring, file range, and call graph. That index is exposed as cheap MCP tools: symbol search, file outline, callers and callees, change-impact, call-chain, dead code, import cycles, and copy-paste duplicates. The design goal is that an agent navigates code by querying the index, at roughly 60 tokens per result, instead of reading whole files. There is an opt-in shell hook that rewrites things like grep and cat on an indexed repo into the equivalent code-nav call.

The piece that made it shippable is the brain-vault model. cortexmd owns a separate brain vault that is the only thing it ever writes to. Your own vaults are attached as read-only sources, indexed for search and code-nav, never modified, with a default-deny allowlist so private subtrees stay out. Data flows one way, so there is no shared mutable file and no merge race.

  SOURCE_VAULTS[]  (read-only, opt-in, allowlisted)
  ┌───────────┐  ┌───────────┐  ┌───────────┐
  │  notes/   │  │  code/    │  │  docs/    │
  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
        │  index (one-way, read)      │
        └──────────────┼──────────────┘
                       ▼
              ┌──────────────────┐
              │     cortexmd     │   <- sole writer
              │   (MCP server)   │
              └────────┬─────────┘
                       │ writes
                       ▼
              ┌──────────────────┐
              │   BRAIN_VAULT    │   memories · journal · diaries
              │ (own dir, not    │   tasks · KG notes · code-repos.json
              │  your vault)     │
              └──────────────────┘

It runs in two modes: a local-stdio mode with no Docker, no auth, and no network, recommended for one person; and a self-hosted HTTP mode with auth for multi-client setups. The repo is a polyglot monorepo: a TypeScript MCP server and a single Rust binary, kept honest by a shared contract and a CI parity check.

cortexmd is pre-alpha and MIT licensed. APIs and config names are still in flux.

The full story is in a four-part blog series. Start with Giving an AI Agent a Second Brain.

Open-Sourcing the Brain: the Brain-Vault Model

Sun, 07 Jun 2026 00:00:00 +0000

In part 3 I described the code-navigation side of this thing: a Rust indexer that walks a repo, builds a symbol database, and lets an agent query the structure of the code for roughly the cost of a single grep instead of reading whole files. That, plus the memory engine from part 2, was the private tool I had been running on my homelab for a couple of months. It worked. I used it every day.

But it had a limitation built into its foundation, and that limitation is the reason this post exists.

The problem: it only worked for me

The private version, the one still called obsidian-mcp at the time, was shaped entirely around my setup. It read my personal Obsidian vault, the one I keep synced across my machines and treat as the source of truth for everything I do. Its conventions, its paths, the way it discovered and indexed notes, all of it assumed my environment, my layout, my habits. Day to day that was invisible. It was a genuinely good tool, and it kept getting better the more I leaned on it.

The trouble was that it was good for me in a way that made it impossible to hand to anyone else. You could not just point it at your own notes and have it work. It expected my vault, mounted the way I mount it, synced the way I sync it. It read from a space tuned to exactly one person, and that person was me. As a personal tool that was completely fine. As something to open-source it was a dead end, because the first thing any other user would hit is that the whole design quietly assumed they were me.

So the question that drove the redesign was not how to fix a bug. It was simpler and more demanding: what would it take for someone who is not me to run this over their own notes, safely, without inheriting my setup? Answering that honestly meant separating two things the private version had tangled together, the notes I read and the data the tool writes.

The redesign: the brain-vault model

The fix that made cortexmd shareable is almost embarrassingly simple once you have been burned. cortexmd owns its own separate brain vault, and that brain vault is the only thing it is ever allowed to write to. Memories, the journal, agent diaries, tasks, the knowledge-graph notes, the list of indexed code repos: all of it lives in the brain vault, and cortexmd is the sole writer.

Your own vaults, the ones you edit by hand in Obsidian, are attached as read-only source vaults. cortexmd indexes them for search and code-navigation, and it never modifies them. Not a heat update, not a tag, not a single byte. Attaching a source vault is opt-in, with a default-deny allowlist, so you can keep private subtrees out of the index entirely and only expose the parts you want the agent to see.
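To make the default-deny idea concrete, here is a minimal sketch of an allowlist check. The function name, the allowlist entries, and the prefix-matching rule are all assumptions for illustration; cortexmd's real config names and matching rules may differ.

```python
from pathlib import PurePosixPath


def is_indexable(rel_path: str, allowlist: list[str]) -> bool:
    """Default-deny: a vault-relative path is indexed only if it sits under an allowed prefix."""
    path = PurePosixPath(rel_path)
    # Refuse anything that could climb out of the vault root.
    if path.is_absolute() or ".." in path.parts:
        return False
    for allowed in allowlist:
        # Match whole path components, so "projects-old" does not match "projects".
        if path.is_relative_to(PurePosixPath(allowed)):
            return True
    return False  # nothing matched, so the path stays out of the index


# Hypothetical allowlist: only these two subtrees are exposed to the agent.
ALLOW = ["projects", "reference/public"]
```

With an empty allowlist nothing is indexed at all, which is the point of default-deny: exposing a subtree is a decision you make, not one you have to remember to undo.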

Data flows one way. Source vaults in, brain vault out, and the two never overlap. You attach whatever vault is yours, the tool reads it and writes nothing back to it, and the brain it builds lives somewhere else entirely. That is what makes it general: there is no longer any assumption that the vault is mine, or mounted my way, or synced my way. It is also what makes it safe, because a tool that never writes to your notes cannot clobber them, and there is no shared file for a sync agent to fork. The coupling that kept the private version stuck to my machine is simply gone.

  SOURCE_VAULTS[]  (read-only, opt-in, allowlisted)
  ┌───────────┐  ┌───────────┐  ┌───────────┐
  │  notes/   │  │  code/    │  │  docs/    │
  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
        │  index (one-way, read)      │
        └──────────────┼──────────────┘
                       ▼
              ┌──────────────────┐
              │     cortexmd     │   <- sole writer
              │   (MCP server)   │
              └────────┬─────────┘
                       │ writes
                       ▼
              ┌──────────────────┐
              │   BRAIN_VAULT    │   memories · journal · diaries
              │ (own dir, not    │   tasks · KG notes · code-repos.json
              │  your vault)     │
              └──────────────────┘

The default brain vault is a dedicated data directory, something like ~/.local/share/cortexmd/brain, never your real Obsidian vault. You can point it somewhere else, but the default keeps the tool’s writes and your notes in two clearly separate places from the first run.
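The sole-writer rule can be enforced at a single choke point. The sketch below is one way to do that, not cortexmd's actual code: the BrainVault class and its write_note method are hypothetical names, and the only behaviour taken from the text is that every write must land inside the brain vault directory.

```python
from pathlib import Path


class BrainVault:
    """Hypothetical write gate: every write is resolved and checked against the brain root."""

    def __init__(self, root: Path) -> None:
        self.root = root.resolve()

    def _target(self, rel_path: str) -> Path:
        target = (self.root / rel_path).resolve()
        # Resolving first means "../" tricks and absolute paths are caught here.
        if not target.is_relative_to(self.root):
            raise PermissionError(f"refusing to write outside the brain vault: {target}")
        return target

    def write_note(self, rel_path: str, text: str) -> Path:
        target = self._target(rel_path)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        return target
```

If all writes go through one method like this, the read-only promise for source vaults does not depend on every caller behaving well.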

Two ways to run it

Once the write story was sound, the deployment story needed to match. cortexmd ships with two modes, and they are co-equal rather than a real one and a toy one.

The recommended default for a single person is local-stdio. It runs on your own machine, talks MCP over stdio to whatever client you use, and reads your vaults straight from disk. No sync. No Docker. No auth. No network at all. For one person on one machine this is everything you need, and it is the mode I would steer almost anyone to first. The whole point of the redesign was that a single user should be able to get the full brain with none of the operational weight the private version had wrapped around itself.

The second mode is self-hosted HTTP, and it is explicitly the advanced path. Here cortexmd runs as an Express server with proper auth (API key or OAuth2), and source vaults are pulled in read-only over a transport. That transport is an interface I called the IVault seam, with implementations for local disk, git-pull, WebDAV, and S3. This is the mode for multi-client or genuinely remote setups, where several MCP clients share one brain or the source data lives somewhere other than the server’s own disk. It is more moving parts, and you only reach for it when you actually need it.

The important thing is that both modes share the same read-only-source, sole-writer-brain model. The HTTP mode keeps the same guarantee: your sources stay read-only and the brain is the only write target. It just changes how the read-only sources reach the indexer.
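A transport seam of this kind can be sketched as an interface that simply has no write method. The real IVault interface lives in the TypeScript server and its method names are not given here, so ReadOnlyVault, list_files, and read_text below are assumptions, and InMemoryVault is only a stand-in for a remote transport such as git-pull, WebDAV, or S3.

```python
from pathlib import Path
from typing import Iterator, Protocol


class ReadOnlyVault(Protocol):
    """Hypothetical shape of the seam: two read methods and nothing that writes."""

    def list_files(self) -> Iterator[str]: ...
    def read_text(self, rel_path: str) -> str: ...


class LocalDiskVault:
    def __init__(self, root: Path) -> None:
        self.root = root

    def list_files(self) -> Iterator[str]:
        for path in sorted(self.root.rglob("*.md")):
            yield path.relative_to(self.root).as_posix()

    def read_text(self, rel_path: str) -> str:
        return (self.root / rel_path).read_text(encoding="utf-8")


class InMemoryVault:
    """Stand-in for a remote transport in this sketch."""

    def __init__(self, files: dict[str, str]) -> None:
        self.files = dict(files)

    def list_files(self) -> Iterator[str]:
        return iter(sorted(self.files))

    def read_text(self, rel_path: str) -> str:
        return self.files[rel_path]


def word_counts(vault: ReadOnlyVault) -> dict[str, int]:
    """A toy indexer: it only ever calls the two read methods."""
    return {path: len(vault.read_text(path).split()) for path in vault.list_files()}
```

The indexer never learns which transport it is talking to, which is why both modes can share one guarantee.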

A polyglot monorepo held together by a contract

The other thing open-sourcing forced me to clean up was the seam between the two languages this project is made of, because it really is two projects wearing one coat.

packages/server is the TypeScript MCP server: Node 22, Express, the memory engine, the recall logic, the tool definitions that show up to clients under the mcp__cortexmd__ namespace. crates/cli is a single Rust binary, cortexmd-cli, that is the tree-sitter indexer plus the CLI client, the session hooks, and the status-line HUD. Two toolchains, glued together by CI.

The hard part of a split like this is the place where they have to agree exactly. The symbol IDs that the Rust indexer produces have to be byte-identical to the ones the TypeScript side expects, or the whole code-navigation layer quietly points at nothing. So there is a contract/ directory that holds the shared wire format, the symbol-ID specification, and a set of golden fixtures. A CI parity check runs the same inputs through both sides and fails the build if the Rust producer and the TypeScript consumer ever disagree about what an ID should be. The contract is the referee, and it keeps both languages honest without either one having to trust the other.
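The parity check can be pictured as a small table of golden fixtures run through both implementations. The sketch below is an illustration under loud assumptions: the ID format (path, name, kind joined with :: and #) is invented for this example and is not the project's symbol-ID specification, and both functions are Python stand-ins for the Rust producer and the TypeScript consumer.

```python
def producer_symbol_id(path: str, name: str, kind: str) -> str:
    """Stand-in for the Rust side. The ID format here is hypothetical."""
    normalised = path.replace("\\", "/")
    return f"{normalised}::{name}#{kind}"


def consumer_symbol_id(path: str, name: str, kind: str) -> str:
    """Stand-in for the TypeScript side, written differently on purpose."""
    return "::".join([path.replace("\\", "/"), f"{name}#{kind}"])


# Golden fixtures: inputs plus the exact ID both sides must produce.
GOLDEN = [
    {"path": "src/recall.ts", "name": "fuse", "kind": "function",
     "id": "src/recall.ts::fuse#function"},
    {"path": "crates\\cli\\src\\main.rs", "name": "main", "kind": "function",
     "id": "crates/cli/src/main.rs::main#function"},
]


def parity_failures(fixtures, implementations):
    """Return (implementation, expected, got) for every byte-level mismatch."""
    failures = []
    for fixture in fixtures:
        for label, make_id in implementations.items():
            got = make_id(fixture["path"], fixture["name"], fixture["kind"])
            if got.encode("utf-8") != fixture["id"].encode("utf-8"):
                failures.append((label, fixture["id"], got))
    return failures
```

A build step would fail whenever parity_failures returns a non-empty list. The second fixture shows the kind of drift this catches: an implementation that forgets to normalise path separators passes the first fixture and fails the second.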

The rename, and what it is now

When I pulled all of this together to share it, the old name no longer fit. obsidian-mcp described what it started as: a bridge to one app’s vault. What it had become was a memory and code-navigation brain that happened to use Obsidian-style markdown as one storage format among others. So it became cortexmd, and that is the name it ships under.

A note on honesty: this is pre-alpha. It is public and MIT licensed at github.com/Leicas/cortexmd, and the config names and some of the APIs are still in flux. I built it for myself first, ran it on my own homelab over my own private vault of personal and work notes, realised it was wired too tightly to my own setup to share, generalised it, and then cleaned it up enough to put it where other people can use it. It is not a finished product and I am not pretending it is one.

What I do feel good about is the shape of it. Your notes stay yours, on your disk, read-only, with the private parts excluded by default. The brain the agent builds lives in its own place and never reaches back into your files. In the default mode nothing leaves your machine: no cloud, no account, no network. That is the local-first, own-your-data version of the idea I actually wanted all along, and it took pulling it loose from my own setup to get there.

If any of this is useful to you, the project page has the overview and the links: cortexmd. And if you want to read it from the start, part 1 is where the series begins.

Series

This is part 4 of a four-part series on cortexmd.

Part 1: Giving an AI Agent a Second Brain
Part 2: The Memory Engine: Heat, Decay, and Dreams
Part 3: The Token Killer: Navigating Code Without Reading It
Part 4: Open-Sourcing the Brain: the Brain-Vault Model (you are here)

The Token Killer: Navigating Code Without Reading It

Sat, 06 Jun 2026 00:00:00 +0000

In part two I wrote about the half of cortexmd that fights forgetting: the memory engine, with its heat and decay and its nightly dream. This post is about the other half, the half that fights waste.

Here is the problem. When you ask an AI agent to work on a real codebase, the default move is to read files. The agent opens a file, the whole thing lands in its context, and now you are paying for every line of it. Most of those lines are noise for the task at hand. You wanted to know what one function does and who calls it, and instead you bought a thousand-line file plus imports plus three helper modules it pulled in to be safe. Do that a few times and the context window is full of code the agent will never use, the signal is buried, and the bill is real.

The fix is to stop reading code and start querying it.

A repo is a graph, not a pile of text

The insight is old and boring and correct: source code is not really a flat pile of text. It is a graph of symbols. Functions, methods, types, and the edges between them, who calls whom. An IDE knows this. “Go to definition” and “find all references” do not read your files top to bottom every time you click. They consult an index. cortexmd gives an agent the same thing.

The indexer is a Rust binary (cortexmd-cli, the same binary that ships the CLI client and the session hooks I will get to). It walks a repo, parses each file with tree-sitter, and writes the result into a SQLite symbol database. For every symbol it records the name, the kind (function, method, type, and so on), the signature, the docstring if there is one, the file range, and, crucially, the call graph: callers and callees. tree-sitter is the right tool here because it is fast, it is incremental, and it speaks a lot of languages, so the same indexing pass works across a polyglot repo instead of needing one bespoke parser per toolchain.

Once that database exists, the agent never has to open a file just to orient itself.
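A minimal sketch of what such a symbol database could look like, using Python's built-in sqlite3 module. The table and column names are assumptions chosen to mirror the fields listed above (name, kind, signature, docstring, file range, call edges); the real schema is written by the Rust indexer and may be organised differently.

```python
import sqlite3

SCHEMA = """
CREATE TABLE symbols (
    id         TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    kind       TEXT NOT NULL,      -- function, method, type, ...
    signature  TEXT,
    docstring  TEXT,
    file       TEXT NOT NULL,
    start_line INTEGER NOT NULL,
    end_line   INTEGER NOT NULL
);
CREATE TABLE calls (
    caller_id TEXT NOT NULL REFERENCES symbols(id),
    callee_id TEXT NOT NULL REFERENCES symbols(id),
    PRIMARY KEY (caller_id, callee_id)
);
CREATE INDEX calls_by_callee ON calls(callee_id);
"""


def open_index() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def add_symbol(db, sid, name, kind, signature, docstring, file, start, end):
    db.execute("INSERT INTO symbols VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
               (sid, name, kind, signature, docstring, file, start, end))


def callers_of(db, symbol_id):
    """Names of the symbols that call the given symbol."""
    rows = db.execute(
        "SELECT s.name FROM calls c JOIN symbols s ON s.id = c.caller_id "
        "WHERE c.callee_id = ? ORDER BY s.name", (symbol_id,))
    return [row[0] for row in rows]


def outline(db, file):
    """The shape of a file: names, kinds, signatures, and start lines, no bodies."""
    rows = db.execute(
        "SELECT name, kind, signature, start_line FROM symbols "
        "WHERE file = ? ORDER BY start_line", (file,))
    return rows.fetchall()


# Hypothetical sample data, not taken from the real index.
db = open_index()
add_symbol(db, "s1", "fuse", "function", "fuse(lexical, semantic)",
           "Merge two rankings.", "src/recall.ts", 10, 42)
add_symbol(db, "s2", "recall", "function", "recall(query)", None, "src/recall.ts", 44, 80)
add_symbol(db, "s3", "wakeup", "function", "wakeup()", None, "src/session.ts", 5, 30)
db.executemany("INSERT INTO calls VALUES (?, ?)", [("s2", "s1"), ("s3", "s2")])
```

Both queries return a handful of short rows, which is what keeps the per-result cost small compared with loading src/recall.ts itself.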

The code-nav tools

The index is exposed to MCP clients as a set of cheap tools. Each one answers a specific question that an agent actually asks while working:

symbol search: find symbols by name, signature, or docstring text. The entry point into everything else.
file outline: the shape of a file (its symbols and their signatures) without the bodies. You get the table of contents instead of the book.
get one symbol: pull the body of exactly one symbol when you have decided you need it, and nothing else.
callers and callees: walk the call graph in either direction. Who calls this, what does this call.
change impact: the transitive answer to “if I change this, who breaks?” This is the one I reach for most before touching anything load-bearing.
call chain: the path from one symbol to another, so you can see how A actually reaches Z.
find dead code: symbols that nothing else in the index resolves to, so no caller or reference points at them.
find import cycles: where the module graph loops back on itself.
find semantic duplicates: copy-paste detection, the near-identical bodies that drifted apart.

The code-nav savings view, with cortexmd indexing its own repository (the project dogfooding its indexer). The figures here come from the project’s seeded demo, not a formal benchmark.

The pattern across all of them is the same. The agent narrows before it reads. Search to find the symbol, outline to see the neighbourhood, callers and change-impact to understand the blast radius, and only then, if it truly needs the body, get one symbol. Most tasks never need a full file at all.
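Several of these tools are plain graph walks once the call edges exist. The sketch below shows change impact, call chain, and dead code as breadth-first searches over a small hand-written call graph. The graph, the function names, and the notion of a fixed set of entry points are assumptions for illustration, not the indexer's actual algorithms.

```python
from collections import deque

# Hypothetical call graph: caller -> callees.
CALLS = {
    "main": ["load_config", "run"],
    "run": ["recall", "store"],
    "recall": ["fuse"],
    "store": [],
    "load_config": [],
    "fuse": [],
    "legacy_export": ["store"],
}


def callers_index(calls):
    """Invert the graph: callee -> list of direct callers."""
    reverse = {name: [] for name in calls}
    for caller, callees in calls.items():
        for callee in callees:
            reverse.setdefault(callee, []).append(caller)
    return reverse


def change_impact(calls, symbol):
    """Every symbol that reaches `symbol` through one or more calls."""
    reverse = callers_index(calls)
    seen, queue = set(), deque([symbol])
    while queue:
        for caller in reverse.get(queue.popleft(), []):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return sorted(seen)


def call_chain(calls, start, goal):
    """Shortest call path from start to goal, or None if there is none."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for callee in calls.get(node, []):
            if callee not in parent:
                parent[callee] = node
                queue.append(callee)
    return None


def dead_code(calls, entry_points):
    """Symbols with no callers that are not declared entry points."""
    reverse = callers_index(calls)
    return sorted(name for name, callers in reverse.items()
                  if not callers and name not in entry_points)
```

On this graph, change_impact(CALLS, "fuse") returns main, recall, and run, while dead_code(CALLS, {"main"}) flags only legacy_export. Each answer is a short list of names, never a file body.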

Roughly 60 tokens per result

Here is the design goal that drove the whole thing. A code-nav lookup is meant to cost roughly 60 tokens per result. Reading a whole file costs thousands. So querying the index is meant to be many times cheaper than reading, for the same useful answer.

I want to be honest about what that number is and is not. It is a target I designed toward, not a benchmark I am quoting at you. The exact cost depends on the symbol, the language, how much docstring there is. But the shape of it is the entire point: a result is a compact record (name, kind, signature, a range, some edges), not a slab of source. When the unit of work is a 60-token fact instead of a 2,000-token file, an agent can ask twenty questions for the price of one read, and the context window stays full of answers instead of haystack.

It also reads better for the model. A clean list of callers is easier to reason over than the same information smeared across five files the agent had to load to reconstruct it.
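The arithmetic behind that comparison is short enough to write down. This sketch uses only the two figures from the text, the 60-token design target and the 2,000-token example file. Neither is a measurement, and the function names are made up for the example.

```python
TOKENS_PER_RESULT = 60    # design target from the text, not a benchmark
TOKENS_PER_FILE = 2_000   # the example file size used in the text


def lookup_cost(results: int) -> int:
    """Approximate token cost of a given number of code-nav results."""
    return results * TOKENS_PER_RESULT


def results_per_file_read(file_tokens: int = TOKENS_PER_FILE) -> int:
    """How many results fit in the budget of reading one file."""
    return file_tokens // TOKENS_PER_RESULT
```

Under those assumptions twenty results cost about 1,200 tokens, comfortably less than one 2,000-token read, and the break-even point sits at roughly 33 results per file.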

Catching the old habit

There is a catch with giving an agent better tools: it has to remember to use them. The muscle memory of “investigate the code” is grep, cat, head, tail. Those habits are deep, and an agent will happily fall back to them and start hauling files into context the moment you stop watching.

So cortexmd ships an opt-in shell hook. When it is on and you are working in an indexed repo, it rewrites those commands into their cheap code-nav equivalents. A grep for a symbol becomes a symbol search. A cat of a file becomes a file outline. The agent thinks it is doing the old thing, and the index quietly answers instead. It is opt-in on purpose, because rewriting someone’s shell commands is exactly the kind of magic you want to consent to rather than discover, and because the rewrite only makes sense on a repo that is actually indexed.

The nice property is that it meets the agent where its habits already are. You do not have to retrain the reflex, you just intercept it.
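A rewrite hook of this kind can be pictured as a small function that looks at a command and either maps it to a code-nav call or leaves it alone. This is a sketch under assumptions: the tool names symbol_search and file_outline are placeholders, the real hook ships in the Rust binary, and the rules here are far simpler than a real shell integration would need.

```python
import shlex


def rewrite(command: str, repo_indexed: bool):
    """Return (tool, argument) for a code-nav call, or None to run the command unchanged."""
    if not repo_indexed:
        return None  # the rewrite only makes sense on an indexed repo
    if any(ch in command for ch in "|<>;&"):
        return None  # pipelines and redirections are left alone in this sketch
    try:
        argv = shlex.split(command)
    except ValueError:
        return None  # unbalanced quotes: do not guess
    if not argv:
        return None
    positional = [arg for arg in argv[1:] if not arg.startswith("-")]
    if argv[0] == "grep" and positional:
        return ("symbol_search", positional[0])  # the first positional is the pattern
    if argv[0] == "cat" and len(positional) == 1:
        return ("file_outline", positional[0])
    return None
```

For example, rewrite("grep -rn fuse_scores src/", True) maps to a symbol search for fuse_scores, while the same command in a repo that is not indexed passes through untouched.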

Dogfooding on its own source

I did not test this on a toy. cortexmd is a polyglot monorepo (TypeScript on one side, Rust on the other, more on that in part four), and I pointed the indexer at the project’s own source and worked on it through its own code-nav tools. That is the test that matters. When you are changing the indexer while navigating with the indexer, the rough edges find you fast. “Change impact says nothing breaks, so why did that break” is a very motivating sentence to read in your own logs.

Dogfooding is also where the two halves of cortexmd meet. The code index tells the agent what the code is right now. The memory engine from part two tells it why the code is the way it is, the decisions and the dead ends that no symbol database will ever record. Structure plus history. One is queried, one is recalled, and together they are most of what I want a collaborator to have.

This is still pre-alpha, so the exact tool names and config will move around. The idea underneath is stable: navigate code by querying an index, not by reading files, and pay 60 tokens for a fact instead of thousands for a haystack.

In part four I get to the part that turned a private homelab tool into something I could put on the internet: why a tool tuned entirely to my own setup could not be shared as-is, the brain-vault redesign that generalised it, and why I open-sourced it.

The project page is here, and the code is at github.com/Leicas/cortexmd.

Series

Giving an AI Agent a Second Brain
The Memory Engine: Heat, Decay, and Dreams
The Token Killer: Navigating Code Without Reading It (you are here)
Open-Sourcing the Brain: the Brain-Vault Model

The Memory Engine: Heat, Decay, and Dreams

Fri, 05 Jun 2026 00:00:00 +0000

In part one I described two problems that kept biting me while working with AI agents. The first was that they forget everything between sessions. The second was that they burn tokens re-reading code they have already seen. This post is about the first problem, and the part of cortexmd I am most attached to: the memory engine. The overall approach is inspired by mempalace, a memory-palace project for AI agents; what follows is how cortexmd builds its own version.

The naive fix for forgetting is to dump everything into context. Keep a big file of notes, paste it in at the start of every session, and hope the agent reads it. I tried versions of that, and it falls apart fast. The file grows without bound. Old, stale facts sit next to the one thing that actually matters today, with equal weight. You pay for the whole pile on every turn, and the signal you care about gets buried in noise you have long since stopped caring about. Human memory does not work like that, and an agent’s should not either. So the design goal was simple to state and harder to build: the agent should remember the way a person does, where the things you use stay sharp and the things you stop touching fade.

Eight kinds of memory

When an agent stores something, cortexmd does not treat it as an undifferentiated blob of text. Each memory is auto-categorised into one of eight kinds: observation, decision, insight, conversation, fact, preference, plan, and reflection. The distinction matters because these things behave differently over time and want to be retrieved differently. A preference (I always want British spelling, I hate em dashes) is a long-lived fact about how I work, and it should keep surfacing. A conversation snippet is contextual and mostly useful soon after it happened. A decision is something you want to be able to find again months later when you ask yourself why on earth you did that. Tagging the kind up front gives the rest of the system something to reason with, instead of forcing every later step to guess from raw text.

Heat: hot, warm, cold

The core idea is that every memory has a temperature, and temperature decays. A fresh or recently used memory is hot. Leave it untouched and it cools to warm, then after roughly a month of inactivity it drifts to cold, and colder memories are eventually archived rather than kept in the front of the agent’s mind.

The crucial detail is promote-on-access: reading a memory heats it back up. This is the whole trick. You do not have to manually curate what is important. Importance is revealed by use. The memories you and the agent keep reaching for stay hot precisely because you keep reaching for them, and the ones you never touch sink on their own. It is the same instinct as a least-recently-used cache, except the thing being cached is the agent’s sense of what currently matters, and the eviction is graceful: cold and archived, not deleted.

Why bother with all this instead of one flat store? Because temperature gives recall a prior. When the agent goes looking for something, it does not face a flat sea of equally plausible notes. It has a built-in sense of what has been live lately, and that signal costs nothing extra to maintain because it falls out of normal use.
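The lifecycle above fits in a few lines if temperature is derived from the time since last access. In this sketch only the roughly one-month threshold for cold comes from the text. The 7-day and 180-day thresholds, the Memory class, and its method names are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

HOT_DAYS = 7        # assumption: how long a memory stays hot
COLD_DAYS = 30      # "roughly a month of inactivity", from the text
ARCHIVE_DAYS = 180  # assumption: when a cold memory gets archived


@dataclass
class Memory:
    text: str
    kind: str              # one of the eight kinds, e.g. "decision"
    last_access: datetime

    def temperature(self, now: datetime) -> str:
        idle_days = (now - self.last_access).days
        if idle_days < HOT_DAYS:
            return "hot"
        if idle_days < COLD_DAYS:
            return "warm"
        if idle_days < ARCHIVE_DAYS:
            return "cold"
        return "archived"  # kept, just moved out of the way

    def read(self, now: datetime) -> str:
        # Promote-on-access: reading is what resets the clock.
        self.last_access = now
        return self.text
```

Nothing here stores a temperature field. Heat is computed from last_access, so it stays current without any background bookkeeping, which is the sense in which the signal falls out of normal use.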

Consolidation: tidying the cold drawer

Letting memories cool is only half the story. If you simply let cold memories pile up, you end up with a drawer full of near-duplicate scraps: five slightly different notes about the same long-finished task, each a little stale, none worth reading on its own. So cortexmd consolidates. Related cold memories get folded together into summaries, so the gist survives in one coherent place while the redundant fragments stop cluttering things. The detail is not thrown away carelessly, it is compressed into something you would actually want to read later. Cooling decides what is no longer urgent; consolidation decides what to do with it.

Hybrid recall

Storing memory well is pointless if you cannot get it back. Recall in cortexmd is hybrid. It runs a lexical full-text search (the keyword match, good at exact terms and names) and fuses it with a semantic search over embeddings (the meaning match, good when you remember the idea but not the words). Lexical alone misses anything phrased differently from your query. Semantic alone can drift toward things that are vaguely on-topic but not what you meant. Fusing the two covers for the weaknesses of each.

On top of the fused score, the ranking is boosted by three things: temperature (hotter memories rank higher, because recency of use is a signal), importance (some memories are simply weightier), and links (a memory connected to other relevant memories is more likely to be the one you want). The result is a ranking that reflects not just textual similarity but how live and how connected a memory is. That is much closer to how you actually recall things than a plain similarity score.
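One common way to fuse a lexical ranking with a semantic one is reciprocal rank fusion, and the sketch below uses it as a stand-in. The text does not say which fusion method cortexmd uses, and the boost weights, field names, and sample data are all invented for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Score each id by the sum of 1 / (k + rank) over every ranking it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


# Hypothetical boost weights; the real ones are not documented here.
HEAT_BOOST = {"hot": 1.5, "warm": 1.2, "cold": 1.0}


def recall(lexical, semantic, meta):
    """Fuse two rankings, then boost by temperature, importance, and link count."""
    fused = reciprocal_rank_fusion([lexical, semantic])

    def final_score(doc_id):
        m = meta[doc_id]
        boost = (HEAT_BOOST[m["heat"]]
                 * (1 + 0.1 * m["importance"])
                 * (1 + 0.05 * m["links"]))
        return fused[doc_id] * boost

    return sorted(fused, key=final_score, reverse=True)


# Made-up sample: "b" wins on text alone, "a" wins once heat is counted.
LEXICAL = ["a", "b", "c"]
SEMANTIC = ["b", "d", "a"]
META = {
    "a": {"heat": "hot", "importance": 0, "links": 0},
    "b": {"heat": "cold", "importance": 0, "links": 0},
    "c": {"heat": "warm", "importance": 0, "links": 0},
    "d": {"heat": "cold", "importance": 0, "links": 2},
}
```

In this sample the fused text score alone puts b first, but a is hot and b is cold, so the boosted ranking puts a on top. That is the effect described above: liveness shifts the order without replacing textual relevance.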

Waking up

All of this comes together at the start of a session in what I call the wakeup. Instead of beginning every conversation as a blank slate, the agent does a memory wakeup that surfaces the hottest, most relevant memories. It is the difference between a colleague who walks in already knowing where you left off yesterday and one you have to brief from scratch every single morning. The wakeup leans on everything above: the heat model decides what is currently live, hybrid recall decides what is relevant, and the agent starts the session already oriented. This is the moment where the whole engine earns its keep, because it is the moment you feel the agent remembering you.

The smarter-brain round: links and dreams

The pieces above were the heart of the v2.0 memory system. A later round, which I think of as the smarter-brain work, added a few things that make the brain feel less like a database and more like something that thinks while you are away.

The Intelligence tab of the dashboard: vault health, dream insights, theme clusters, and entity and knowledge-graph counts. Demo data from the project’s seeded sample vault.

The first is automatic knowledge-graph links. As data is stored, cortexmd draws links between related notes on its own, instead of waiting for me to wire them up by hand. Manual linking is exactly the kind of bookkeeping that sounds nice and never actually happens, so having the connections form automatically as a side effect of storing things means the link signal in recall keeps getting richer without any effort from me.

The second is the dream. cortexmd runs a consolidation pass on a quiet schedule, which I named the dream because of what it does and when it does it. It reconciles similar notes, with particular attention to older, cooled-down ones, and folds them into project notes. It is the background gardener of the brain: while nothing is happening, it walks through the cooled corners, notices that these three half-finished thoughts are really one thing, and tidies them into a coherent project note. You wake the agent up the next day and the brain is a little more organised than you left it, without you having done anything.

The third is something I borrowed straight from Obsidian: a vault graph view, rendered on a canvas in the web dashboard. Because the knowledge graph is real, you can look at it. Seeing the brain as a constellation of linked notes, with the dense clusters and the lonely orphans laid out in front of you, makes the whole thing feel concrete in a way a list of rows never does.

The vault graph view in the dashboard. Each dot is a note, each line a link. This is the project’s self-contained demo vault, so the note names are seeded sample data, not my own notes.

Click a node and the note opens in the side panel with its links. Same seeded demo data.

Why a heat model wins

Pulling it together: the reason a heat model beats dumping everything into context is that attention is the scarce resource, for an agent exactly as for a person. A flat store treats a note from eight months ago and a decision from this morning as equals, makes you pay for both on every turn, and forces the agent to rediscover what matters each time. The heat model encodes what matters as a property of the data itself, keeps it current for free through ordinary use, compresses what has gone cold instead of hoarding it, and surfaces the live, relevant slice at wakeup. The agent carries less, and what it carries is the right stuff.

That handles forgetting. The other half of the original problem, the agent burning tokens re-reading code it has already seen, needs a completely different mechanism. That is a Rust indexer and a symbol database, and it is the subject of part three: The Token Killer.

cortexmd is pre-alpha and MIT licensed. The code, including the memory engine described here, lives on the project page and on GitHub at github.com/Leicas/cortexmd. Names and config are still in flux, so treat the specifics as a snapshot rather than a contract.

Series

This is part two of a four-part series on cortexmd:

Giving an AI Agent a Second Brain
The Memory Engine: Heat, Decay, and Dreams (you are here)
The Token Killer: Navigating Code Without Reading It
Open-Sourcing the Brain: the Brain-Vault Model

Giving an AI Agent a Second Brain

Thu, 04 Jun 2026 00:00:00 +0000

I work with a coding agent most days now. It is genuinely good. It reads my code, reasons about it, proposes changes, runs the tests, fixes what it broke. And every single time I open a fresh session, it has the memory of a goldfish.

It does not remember the decision we made last week about why a module is structured the way it is. It does not remember that I prefer commas to dashes, or that one corner of the codebase is load-bearing and fragile. It does not remember the conversation where we ruled out an approach for good reasons. All of that context lived in the previous session, and the previous session is gone. So I re-explain. Then I re-explain again the next day.

That is the first problem. The agent forgets.

And it is not only the code. The moment I ask it to help with anything human, the same hole opens up. Ask it to draft an email and it has no idea who the recipient is to me: a close friend, a colleague, or a partner I need to handle with care. So it has no idea what tone to take, because the tone lived in past conversations it can no longer see. It does a poor job of linking one session to the next, so every thread starts cold. And the way I keep my life split makes it worse: personal in one account, work in another, the way most people do. The moment I cross from one to the other, whatever the agent had learned about me is simply gone. Poof. No memory.

Two problems, not one

The second problem is quieter but it shows up on every invoice. To do anything useful, the agent has to understand the code, and the way it understands code is by reading it. So it reads files. Whole files. To answer a small question about one function, it will pull an entire module into context, and often the modules that call that module too. Multiply that across a working session and you are paying, in tokens, to load the same source over and over, most of which is irrelevant to the question at hand.

Both problems come from the same place: the agent has no persistent store of what it has learned, and no cheap way to look things up. It only has the context window in front of it, and the context window is both forgetful and expensive to fill.

I decided to do something about both. Not because I had a product idea, but because it was annoying me on a daily basis and I had a homelab sitting there asking to be useful.

There was also a personal reason the shape of the solution felt obvious. A while ago, after reading a friend’s long write-up of his own personal-knowledge-management journey, I started keeping notes in Obsidian. Building that second brain for myself changed how I thought about the problem. If a vault of linked notes works as external memory for me, it should work as external memory for the agent too. I could let it read mine to get started, as a read-only source, and then let it build its own brain, one that I could actually open, navigate, and understand. Not a black box of embeddings somewhere, but notes, in a vault, that I own.

The homelab origin

For a while now I have run a small MCP server on my homelab. MCP, the Model Context Protocol, is the standard way to give an AI client tools and data it can reach out to. The server I built was called obsidian-mcp, and its first job was simple: give Claude the ability to read, search, and write notes in my Obsidian vault.

It ran in a Docker container behind a reverse proxy, my notes were already there, and suddenly the agent could reach into them. That alone was useful. But it also turned the vault into a natural place to put the answers to my two problems, because a vault is just structured text that an agent can read and write, and that is exactly what both a memory and a code index need to be backed by.

So the server grew two new capabilities, one for each problem.

The first capability is a memory system, inspired by mempalace, a memory-palace project for AI agents. Instead of letting everything evaporate at the end of a session, the agent can store what it learns: an observation, a decision, an insight, a preference I stated out loud. Those memories do not just pile up forever in a flat list. They have a lifecycle. The ones that get used stay warm and easy to surface, the ones nobody touches cool off and eventually get folded into summaries, and at the start of a new session the agent does a wakeup that brings the hottest, most relevant memories back to the surface. The point is continuity. The agent picks up roughly where it left off instead of from zero. That is the subject of part two.
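To make the lifecycle concrete, here is a minimal Python sketch of heat that decays with inactivity and is bumped on access. It is not cortexmd's implementation: the half-life, the boost, the tier thresholds, and every name in it (Memory, touch, wakeup) are assumptions made up for illustration, and the wakeup here ranks by heat only, leaving relevance out.

```python
from dataclasses import dataclass

# Hypothetical tuning values; the post does not give the real ones.
HALF_LIFE_DAYS = 7.0    # heat halves after this many days without access
ACCESS_BOOST = 1.0      # heat added each time a memory is read
HOT_THRESHOLD = 0.5
WARM_THRESHOLD = 0.1


@dataclass
class Memory:
    text: str
    kind: str                   # e.g. "observation", "decision", "insight"
    heat: float = 1.0           # heat as of last_touched
    last_touched: float = 0.0   # time of last access, in days

    def current_heat(self, now: float) -> float:
        """Heat after exponential decay since the last access."""
        elapsed = max(0.0, now - self.last_touched)
        return self.heat * 0.5 ** (elapsed / HALF_LIFE_DAYS)

    def touch(self, now: float) -> None:
        """Promote on access: settle the decay so far, then add a boost."""
        self.heat = self.current_heat(now) + ACCESS_BOOST
        self.last_touched = now

    def tier(self, now: float) -> str:
        """Bucket the decayed heat into hot, warm, or cold."""
        heat = self.current_heat(now)
        if heat >= HOT_THRESHOLD:
            return "hot"
        if heat >= WARM_THRESHOLD:
            return "warm"
        return "cold"


def wakeup(memories: list[Memory], now: float, limit: int = 3) -> list[Memory]:
    """Return the hottest memories first, as a session start might."""
    ranked = sorted(memories, key=lambda m: m.current_heat(now), reverse=True)
    return ranked[:limit]
```

In this sketch a memory that is never read drifts from hot to warm to cold on its own, while one that is read regularly keeps getting pushed back up. Folding cold memories into summaries would be a separate pass over whatever tier() reports as cold.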

The second capability is a code index. Rather than read whole files to understand a repository, the agent queries an index of it. A Rust indexer walks the repo, parses it with tree-sitter, and records the things you actually want to look up: what symbols exist, their signatures, where they live, and, crucially, who calls whom. Then the agent asks targeted questions. What does this function look like? Who calls it? What breaks if I change it? Each answer is a small, cheap lookup instead of a full read that drags the entire file into context. The design goal is blunt: a code-nav lookup should cost roughly sixty tokens per result and be many times cheaper than reading the file it came from. That is the subject of part three.
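The idea of asking an index instead of reading files can be sketched with SQLite from Python. This is an illustration only: the table layout, the column names, the sample rows, and the callers_of helper are all assumptions, not the schema or API that cortexmd's Rust indexer actually produces.

```python
import sqlite3


def build_index() -> sqlite3.Connection:
    """Create a tiny in-memory symbol index with made-up sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE symbols (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            kind TEXT NOT NULL,
            signature TEXT NOT NULL,
            file TEXT NOT NULL,
            start_line INTEGER NOT NULL,
            end_line INTEGER NOT NULL
        );
        CREATE TABLE calls (
            caller_id INTEGER NOT NULL REFERENCES symbols(id),
            callee_id INTEGER NOT NULL REFERENCES symbols(id)
        );
        """
    )
    db.executemany(
        "INSERT INTO symbols VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "parse_config", "function",
             "def parse_config(path: str) -> dict", "app/config.py", 10, 42),
            (2, "main", "function",
             "def main() -> None", "app/cli.py", 5, 30),
            (3, "reload", "function",
             "def reload() -> None", "app/server.py", 88, 97),
        ],
    )
    # Call graph edges: main and reload both call parse_config.
    db.executemany("INSERT INTO calls VALUES (?, ?)", [(2, 1), (3, 1)])
    return db


def callers_of(db: sqlite3.Connection, name: str) -> list[str]:
    """Answer "who calls this?" with one short line per caller."""
    rows = db.execute(
        """
        SELECT s.file, s.start_line, s.signature
        FROM calls c
        JOIN symbols s ON s.id = c.caller_id
        JOIN symbols t ON t.id = c.callee_id
        WHERE t.name = ?
        ORDER BY s.file, s.start_line
        """,
        (name,),
    ).fetchall()
    return [f"{file}:{line} {signature}" for file, line, signature in rows]
```

Each result is a single line with a location and a signature, which is the kind of compact answer the sixty-token goal is aiming at. The agent only opens a file once a line like that tells it exactly where to look.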

From a private tool to cortexmd

For months this was a personal thing. It ran on my hardware, over my own private Obsidian vault, the one that holds both personal and work notes. I am not going to quote any of it here, and the tool itself is deliberately built so that the data stays mine. But the point stands: it was a tool I made for myself, and I used it every day.

Then I hit a different kind of wall, one that came precisely from how well it worked for me. I will tell it properly in part four, but the short version is that the whole thing was tuned to my own setup, my vault, mounted and synced my way, so it was a great personal tool and impossible for anyone else to run. Making it shareable meant a redesign, and that redesign is what finally turned it into something other people could use.

That redesign became cortexmd. It is open source, MIT licensed, and public at github.com/Leicas/cortexmd. It is also pre-alpha: the APIs and the config names are still in flux, and I would not bet a production workflow on it yet. The honest framing is that I built this for myself, then cleaned it up to share. That cleanup is real work, and it is most of part four.

What it became: the cortexmd control panel. This screenshot is from the project’s self-contained demo, so the data is seeded samples, not my own vault.

So that is the shape of the series. There were two problems, an agent that forgets and an agent that burns tokens re-reading code. There are two answers, a memory system and a code index, both born inside a homelab MCP server. And there is the redesign that turned a private tool into something you can run yourself.

What is coming

Part 2, the memory engine. Heat, decay, and dreams. The eight categories a memory can fall into, the hot to warm to cold lifecycle, promote-on-access, consolidation, hybrid recall that fuses full-text and semantic search, the session wakeup, and the auto-linking graph that wires notes together as they are stored.
Part 3, the token killer. The Rust and tree-sitter indexer, the SQLite symbol database, the code-nav tools, the roughly sixty tokens per result idea, the opt-in shell hook that rewrites things like grep and cat on an indexed repo into the cheap equivalent, and what it was like to dogfood all of it on the project’s own source.
Part 4, open-sourcing the brain. Why a tool that only worked for me had to be redesigned to share, the brain-vault model that generalises it, the two deployment modes, the polyglot monorepo held together by a shared contract, the rename, and why I care about owning my own data.

If you want to skip ahead to the code, the project page and the GitHub repo are linked at the end of this post. Otherwise, part two is where the agent starts to remember.

Series

Part 1: Giving an AI Agent a Second Brain (this post)
Part 2: The Memory Engine: Heat, Decay, and Dreams
Part 3: The Token Killer: Navigating Code Without Reading It
Part 4: Open-Sourcing the Brain: the Brain-Vault Model

Project page: cortexmd. Source: github.com/Leicas/cortexmd.