Archiledger combines the Greek arkhē (origin, first principle) with "ledger": a foundational record serving as the source of truth for AI memory.
It is a specialized knowledge graph that serves as a RAG (Retrieval-Augmented Generation) system with vector search. It is exposed as a Model Context Protocol (MCP) server so that LLM-based assistants can store, connect, and recall information in a graph database.
1. Overview
1.1. Why Archiledger?
LLMs are powerful, but they forget everything when a conversation ends:
- Repeating yourself - Telling your assistant the same preferences over and over
- Lost insights - Valuable analysis from one session isn’t available in the next
- No connected thinking - Information lives in silos without relationships
- Manual categorization - You must tag and organize everything yourself
Archiledger solves this with a graph-based memory powered by agentic RAG:
| Problem | Solution |
|---|---|
| Context resets every conversation | Persistent notes that survive restarts |
| Flat, disconnected notes | Typed links between atomic notes (Zettelkasten) |
| Manual categorization burden | AI-powered automatic classification |
| No temporal awareness | ISO-8601 timestamps on every note |
| Keyword search limits | Vector search finds semantically similar notes |
| Hard to explore large graphs | Graph traversal via linked-note and upward-traversal tools |
| Static knowledge graphs | Memory evolution through agent evaluation |
Why Agentic Memory?
Standard memory systems face several challenges when storing and retrieving information:
| Challenge | Example |
|---|---|
| Ambiguity | "How do I handle taxes?" - personal vs. business context unclear |
| Scattered evidence | Remote work policy for contractors spans HR docs and contractor agreements |
| Static categorization | Tags and categories become stale as knowledge evolves |
| No self-correction | System accepts similarity scores without evaluation |
Agentic memory addresses these through an agent-based control loop that analyzes, evaluates, and evolves the knowledge graph.
Agentic Memory: Automatic Classification
The key differentiator is the Agentic Memory module, which uses an AI agent to:
- Analyze content - Extract keywords, context, and tags automatically
- Find relationships - Search for semantically similar existing memories
- Evaluate evolution - Decide whether to link, update, or create new notes
- Self-correct - Assess classification quality and refine when needed
This eliminates the burden of manual categorization while maintaining a rich, connected knowledge graph.
How It Works
The AgenticMemoryAgent implements a multi-step pipeline for content classification:
Step 1: Content Analysis
The agent analyzes incoming content to extract structured metadata:
- Keywords: Salient nouns, verbs, and key concepts ordered by importance
- Context: A single sentence summarizing topic, arguments, and purpose
- Tags: Broad categories for classification (domain, format, type)
Step 2: Evolution Evaluation
The agent uses RAG to find similar existing memories and evaluates relationships:
- Searches for semantically similar notes in the knowledge graph
- Compares new content with neighbors to identify relationships
- Decides whether to create links, update tags, or modify existing notes
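The two-step pipeline above can be sketched as a plain control loop. This is an illustrative stub only: the `Analysis` and `Decision` records and the heuristic method bodies are invented for the example, while the real AgenticMemoryAgent delegates both steps to an LLM and to RAG retrieval.

```java
import java.util.List;

// Illustrative sketch of the analyze -> evaluate pipeline. The stub logic
// below only demonstrates the control flow and data shapes; the real agent
// performs both steps with LLM calls.
public class PipelineSketch {

    record Analysis(List<String> keywords, String context, List<String> tags) {}
    record Decision(boolean shouldEvolve, List<String> suggestedLinks) {}

    // Step 1: extract structured metadata (stubbed; really an LLM call).
    static Analysis analyze(String content) {
        List<String> keywords = List.of(content.toLowerCase().split("\\s+"));
        return new Analysis(keywords, "Summary of: " + content, List.of("uncategorized"));
    }

    // Step 2: compare against similar notes and decide how to evolve the
    // graph (stubbed; really RAG retrieval plus an LLM evaluation).
    static Decision evaluate(Analysis analysis, List<String> similarNoteIds) {
        return new Decision(!similarNoteIds.isEmpty(), similarNoteIds);
    }

    public static void main(String[] args) {
        Analysis a = analyze("Use event-driven architecture");
        Decision d = evaluate(a, List.of("notification-service"));
        System.out.println(a.keywords() + " evolve=" + d.shouldEvolve());
    }
}
```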
Trade-offs
Agentic memory introduces trade-offs compared to manual note management:
Use agentic memory when:
- You want automatic content classification
- Your knowledge graph needs to evolve over time
- Manual tagging doesn’t scale
Use the core module when:
- You need deterministic, predictable behavior
- Latency is critical
- You want full control over classification
Quick Decision Guide
| Requirement | Recommended Approach |
|---|---|
| Pure Java, no LLM | Core Module (Maven) |
| LLM with full manual control | MCP Server |
| AI classification in Java app | Agentic Memory (Embabel) |
| LLM with automatic memory management | Agentic Memory MCP |
| Full control over tags/links | Core Module or MCP Server |
| Automatic knowledge evolution | Agentic Memory (either) |
| Minimal latency, deterministic behavior | Core Module or MCP Server |
1.2. Architecture
| This server implements no authentication and uses an embedded graph database designed for local development only. Not recommended for production. |
Domain Layer
The domain layer contains the core business entities:
- MemoryNote: Represents a single atomic note with content, tags, keywords, and timestamps
- MemoryNoteId: Unique identifier for notes
- NoteLink: Typed relationship between notes
- LinkDefinition: Defines a link with context explaining the relationship
Application Layer
The application layer orchestrates domain logic:
- MemoryNoteService: Main service interface for note operations
- Handles retrieval count tracking and embedding generation
- Coordinates between domain and infrastructure layers
Infrastructure Layer
The infrastructure layer provides technical implementations:
- Persistence: LadybugMemoryNoteRepository - stores notes as nodes, links as relationships
- Vector Search: LadybugEmbeddingsService - uses LadybugDB’s native vector extension with HNSW indexing
- MCP: McpToolAdapter - exposes memory operations as MCP tools
Agentic Memory
The agentic-memory module provides AI-driven memory evolution:
- AgenticMemoryAgent: Analyzes notes and suggests new links based on semantic relationships
- Context-Aware Links: Automatically evaluates whether to add, update, or remove links
- MemoryNoteSearchOperations: Implements RAG interfaces for vector search and result expansion
1.3. Core Concepts
Zettelkasten Method
Archiledger implements the Zettelkasten method for knowledge management:
- Atomic Notes: Each note contains a single idea or piece of information
- Links: Notes are connected through typed relationships
- Tags: Notes can be categorized with multiple tags
- Keywords: Searchable terms for quick lookup
Memory Notes
A MemoryNote is the fundamental unit of storage:
| Property | Description |
|---|---|
|  | Unique identifier (MemoryNoteId) |
|  | The main text content of the note |
|  | Set of tags for categorization |
|  | Set of keywords for search |
|  | ISO-8601 timestamp when created |
|  | ISO-8601 timestamp when last modified |
|  | Number of times the note has been retrieved |
Note Links
Links connect notes with typed relationships:
| Link Type | Use Case |
|---|---|
|  | General relationship between notes |
|  | One note depends on another |
|  | One thing impacts another |
|  | Component/container relationship |
|  | Replaces previous decision/approach |
|  | Conflicts with another note |
|  | Default relationship type |
Each link includes a context field explaining why the relationship exists.
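The note-and-link model above can be illustrated with a small in-memory sketch. The `Note` and `Link` records and the `ZettelSketch` class below are hypothetical stand-ins invented for this example, not the actual MemoryNote/NoteLink domain classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Minimal in-memory sketch of the Zettelkasten model: atomic notes, typed
// links, and a context string on every link explaining the relationship.
public class ZettelSketch {

    record Note(String id, String content, Set<String> tags, Set<String> keywords) {}
    record Link(String fromId, String toId, String type, String context) {}

    final List<Note> notes = new ArrayList<>();
    final List<Link> links = new ArrayList<>();

    void add(Note note) { notes.add(note); }

    void link(String from, String to, String type, String context) {
        links.add(new Link(from, to, type, context));
    }

    // Follow outgoing links of one relation type from a note.
    List<String> linkedIds(String fromId, String type) {
        return links.stream()
                .filter(l -> l.fromId().equals(fromId) && l.type().equals(type))
                .map(Link::toId)
                .toList();
    }

    public static void main(String[] args) {
        ZettelSketch graph = new ZettelSketch();
        graph.add(new Note("adr-1", "Use event-driven architecture",
                Set.of("architecture"), Set.of("event-driven")));
        graph.add(new Note("svc-1", "Notification service design",
                Set.of("design"), Set.of("notification")));
        graph.link("adr-1", "svc-1", "DEPENDS_ON",
                "The decision depends on the notification service design");
        System.out.println(graph.linkedIds("adr-1", "DEPENDS_ON"));
    }
}
```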
Vector Search
Archiledger uses semantic similarity search:
- Embeddings: Generated via Spring AI’s ONNX model (all-MiniLM-L6-v2)
- Storage: LadybugDB’s native vector extension
- Indexing: HNSW (Hierarchical Navigable Small World) for fast approximate nearest neighbor matching
- Temperature Scaling: Adjustable similarity scoring
- Threshold Filtering: Filter results by minimum similarity score
2. Getting Started
2.1. Installation
Building from Source
Clone the repository and build:
git clone https://github.com/thecookiezen/archiledger.git
cd archiledger
mvn clean package
This builds all modules:
- core/target/archiledger-core-*.jar - Core library
- mcp/target/archiledger-server-*.jar - Low-level MCP server
- agentic-memory/target/agentic-memory-*.jar - Agentic memory library
- agentic-memory-mcp/target/agentic-memory-mcp-*.jar - Agentic memory MCP server
2.2. Quick Start
This guide will get you started with Archiledger in 5 minutes.
Using the Core Module (Java API)
Create a MemoryNoteService and start creating notes:
import com.thecookiezen.archiledger.application.service.MemoryNoteService;
import com.thecookiezen.archiledger.application.service.MemoryNoteServiceImpl;
import com.thecookiezen.archiledger.domain.model.MemoryNote;
import com.thecookiezen.archiledger.domain.model.MemoryNoteId;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Create the service (requires repository configuration)
MemoryNoteService service = new MemoryNoteServiceImpl(repository);

// Create a note
MemoryNote note = new MemoryNote(
    new MemoryNoteId("my-first-note"),
    "This is my first memory note",
    Set.of("example", "getting-started"),
    Set.of("memory", "note")
);
MemoryNote saved = service.createNote(note);

// Retrieve the note
Optional<MemoryNote> retrieved = service.getNote(saved.getId());

// Search for similar notes
List<SimilarityResult<MemoryNote>> results = service.similaritySearch("memory note");
Using the MCP Server
Start the MCP server:
# Transient (in-memory)
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
# Persistent
java -Dladybugdb.data-path=./archiledger.lbdb \
-jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
The server runs on port 8080 with the MCP endpoint at localhost:8080/mcp.
By default, the server uses a local ONNX embedding model (all-MiniLM-L6-v2). See Embedding Model Configuration to use OpenAI, Ollama, or custom HuggingFace models.
2.3. Running with Docker
Low-Level MCP Server
Transient (Data lost when container stops):
docker run -p 8080:8080 registry.hub.docker.com/thecookiezen/archiledger:latest
Persistent (Data saved to host filesystem):
docker run -p 8080:8080 \
-v /path/to/local/data:/data \
registry.hub.docker.com/thecookiezen/archiledger:latest
Custom data path:
docker run -p 8080:8080 \
-e LADYBUGDB_DATA_PATH=/custom/data/archiledger.lbdb \
-v /path/to/local/data:/custom/data \
registry.hub.docker.com/thecookiezen/archiledger:latest
Agentic Memory MCP Server
Requires LLM configuration for AI-powered features:
docker run -p 8080:8080 \
-e OPENAI_CUSTOM_BASE_URL=https://api.example.com \
-e OPENAI_CUSTOM_MODELS=model-name \
-e OPENAI_CUSTOM_API_KEY=your_api_key \
registry.hub.docker.com/thecookiezen/archiledger-agentic-memory:latest
With persistent storage:
docker run -p 8080:8080 \
-v /path/to/local/data:/data \
-e OPENAI_CUSTOM_BASE_URL=https://api.example.com \
-e OPENAI_CUSTOM_MODELS=model-name \
-e OPENAI_CUSTOM_API_KEY=your_api_key \
registry.hub.docker.com/thecookiezen/archiledger-agentic-memory:latest
Environment Variables
| Variable | Default | Description |
|---|---|---|
| LADYBUGDB_DATA_PATH |  | File path where LadybugDB stores data |
|  |  | Directory for LadybugDB extension cache |
| OPENAI_CUSTOM_BASE_URL | - | Base URL for the OpenAI-compatible API |
| OPENAI_CUSTOM_MODELS | - | Model name to use |
| OPENAI_CUSTOM_API_KEY | - | API key for authentication |
| OPENAI_CUSTOM_COMPLETIONS_PATH |  | Custom completions endpoint path |
The /data volume must be writable by UID 1000 (spring user).
Docker Tips
- Persistent Data: Always mount a volume (-v) to preserve your knowledge graph
- Container Lifecycle: Run with -d (detached mode)
- Port Conflicts: Map to a different port (e.g., -p 9090:8080) and update the URL
- Named Containers: Use --name archiledger for easy management
- Debug Logs: docker logs archiledger
3. Modules
3.1. Agentic Memory Module
The agentic memory module provides AI-powered memory management with automatic content classification, tagging, and knowledge graph evolution.
Maven Dependency
<dependency>
<groupId>com.thecookiezen</groupId>
<artifactId>agentic-memory</artifactId>
<version>0.0.6</version>
</dependency>
Dependencies
The agentic memory module is built on the Embabel framework.
Running the Agentic Memory MCP Server
# Transient
java -jar agentic-memory-mcp/target/agentic-memory-mcp-0.0.6.jar
# Persistent
java -Dladybugdb.data-path=./archiledger.lbdb \
-jar agentic-memory-mcp/target/agentic-memory-mcp-0.0.6.jar
| Requires LLM configuration for AI-powered features. |
LLM Configuration
Set the following environment variables:
export OPENAI_CUSTOM_BASE_URL=https://api.example.com
export OPENAI_CUSTOM_MODELS=model-name
export OPENAI_CUSTOM_API_KEY=your_api_key
export OPENAI_CUSTOM_COMPLETIONS_PATH=/v4/chat/completion
Debug Configuration
Enable debug mode to see the agent’s internal prompts and LLM responses when writing memories:
agentic-memory.debug=true
Or using environment variable:
export AGENTIC_MEMORY_DEBUG=true
java -jar agentic-memory-mcp/target/agentic-memory-mcp-0.0.6.jar
When enabled, the agentic_memory_write tool will log the prompts sent to the LLM and the raw responses received. This is useful for troubleshooting agent behaviour or understanding how content is being classified.
| Property | Default | Description |
|---|---|---|
| agentic-memory.debug | false | Enable verbose logging of LLM prompts and responses during memory writes |
Agentic Memory MCP Tools
| Tool | Description |
|---|---|
|  | Perform semantic similarity search across memory notes |
|  | Given a note ID, expand to find connected/linked notes |
|  | Traverse upward in the knowledge graph to find parent/related notes |
| agentic_memory_write | Store content with automatic AI classification, tagging, and link generation |
The agent automatically:
- Analyzes content for keywords, context, and tags
- Searches for similar existing memories
- Evaluates potential relationships
- Creates typed links with explanatory context
- Stores the classified note
Using in Java Applications
import com.thecookiezen.archiledger.agenticmemory.AgenticMemoryAgent;
import com.thecookiezen.archiledger.agenticmemory.domain.UpsertMemoryRequest;

@Autowired
private AgenticMemoryAgent agent;

public void storeMemory(String content) {
    UpsertMemoryRequest request = new UpsertMemoryRequest(content);
    var result = agent.storeMemory(request);
    System.out.println("Created note: " + result.id());
    System.out.println("Tags: " + result.tags());
    System.out.println("Links: " + result.links().size());
}
Domain Models
NoteAnalysis
Result of content analysis:
public record NoteAnalysis(
List<String> keywords, // Salient concepts ordered by importance
String context, // Summary: topic, arguments, purpose
List<String> tags // Classification categories
) {}
EvolutionDecision
Result of evolution evaluation:
public record EvolutionDecision(
boolean shouldEvolve, // Whether to modify the graph
List<SuggestedLink> suggestedLinks, // Links to create
List<String> updatedTags, // Updated tags for new note
List<NeighborUpdate> neighborUpdates // Updates to existing notes
) {}
Relation Types
The agent uses these relationship types when creating links:
| Type | Use When |
|---|---|
|  | General semantic relationship |
|  | New note builds upon existing knowledge |
|  | New note conflicts with existing knowledge |
|  | New note is an example of an existing concept |
|  | New note replaces outdated information |
3.2. Core Module
The core module provides direct, programmatic access to memory operations without requiring an LLM.
Maven Dependency
<dependency>
<groupId>com.thecookiezen</groupId>
<artifactId>archiledger-core</artifactId>
<version>0.0.6</version>
</dependency>
Creating Notes
MemoryNote note = new MemoryNote(
new MemoryNoteId("architecture-decision"),
"Use event-driven architecture for the notification system",
Set.of("architecture", "decision"),
Set.of("event-driven", "notification", "system")
);
MemoryNote saved = service.createNote(note);
Creating Links
LinkDefinition link = new LinkDefinition(
new MemoryNoteId("architecture-decision"),
new MemoryNoteId("notification-service"),
"DEPENDS_ON",
"The architecture decision depends on the notification service design"
);
service.addLink(link);
Similarity Search
// Basic search
List<SimilarityResult<MemoryNote>> results = service.similaritySearch("notification system");
// Advanced search with parameters
List<SimilarityResult<MemoryNote>> results = service.similaritySearch(
"notification system", // query
10, // topK
0.5, // threshold
0.7 // temperature
);
Graph Traversal
// Get all linked notes
List<MemoryNote> linked = service.getLinkedNotes(new MemoryNoteId("architecture-decision"));
// Get linked notes with specific relation type
List<MemoryNote> dependencies = service.getLinkedNotes(
new MemoryNoteId("architecture-decision"),
"DEPENDS_ON",
10
);
// Traverse upward in the graph
List<MemoryNote> parents = service.getNotesUpward(
new MemoryNoteId("notification-service"),
3, // maxHops
20 // limit
);
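To make the maxHops and limit parameters concrete, here is a breadth-first sketch of what an upward traversal does. The `upward` method and the `parents` adjacency map are invented for illustration; the real getNotesUpward runs as a graph query inside LadybugDB.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Illustrative BFS showing the semantics of maxHops (how far to climb)
// and limit (how many notes to return at most).
public class TraversalSketch {

    // parents: note id -> ids of notes that link to it ("upward" edges)
    static List<String> upward(Map<String, List<String>> parents,
                               String startId, int maxHops, int limit) {
        List<String> result = new ArrayList<>();
        Set<String> seen = new HashSet<>(Set.of(startId));
        Queue<String> frontier = new ArrayDeque<>(List.of(startId));
        for (int hop = 0; hop < maxHops && !frontier.isEmpty(); hop++) {
            Queue<String> next = new ArrayDeque<>();
            for (String id : frontier) {
                for (String parent : parents.getOrDefault(id, List.of())) {
                    if (seen.add(parent)) {
                        result.add(parent);
                        if (result.size() >= limit) return result;
                        next.add(parent);
                    }
                }
            }
            frontier = next;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> parents = Map.of(
                "notification-service", List.of("architecture-decision"),
                "architecture-decision", List.of("system-overview"));
        System.out.println(upward(parents, "notification-service", 3, 20));
    }
}
```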
3.3. MCP Module
The MCP module exposes all core operations as MCP tools for LLM-based assistants.
Running the Server
# Transient (in-memory)
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
# Persistent
java -Dladybugdb.data-path=./archiledger.lbdb \
-jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
The server uses streamable HTTP transport on port 8080.
Tool Categories
Note Management
| Tool | Description |
|---|---|
|  | Create one or more memory notes with content, keywords, tags, and optional links |
|  | Retrieve a specific note by ID (increments retrieval counter) |
|  | Find all notes with a given tag |
|  | Delete notes by their IDs, including associated links and embeddings |
Link Management
| Tool | Description |
|---|---|
|  | Add typed links between notes with context |
|  | Remove typed links between notes |
Graph Exploration
| Tool | Description |
|---|---|
|  | Read the entire knowledge graph (all notes and links) |
|  | Find all notes directly connected to a given note |
|  | List all unique tags currently used across notes |
|  | Semantic similarity search with temperature scaling and threshold filtering |
4. Configuration
4.1. Server Properties
Configure the MCP server through Spring Boot properties:
# MCP Server Configuration
spring.ai.mcp.server.name=archiledger-server
spring.ai.mcp.server.version=1.0.0
spring.ai.mcp.server.protocol=STREAMABLE
server.port=8080
Vector Storage
| Property | Default | Description |
|---|---|---|
|  |  | LadybugDB extension cache directory |
Embeddings are stored using LadybugDB’s native vector extension with HNSW indexing for fast approximate nearest neighbor matching.
See Embedding Model Configuration for customizing the embedding model.
4.2. Embedding Model Configuration
Archiledger uses vector embeddings for semantic similarity search. By default, it uses a local ONNX model that requires no external API calls.
| When changing embedding models, set ladybugdb.embedding.dimensions (or the LADYBUGDB_EMBEDDING_DIMENSIONS environment variable) to match the new model’s output dimensions. |
Default: Local ONNX Model (No Configuration Required)
The default embedding model is all-MiniLM-L6-v2 from HuggingFace, running locally via ONNX runtime:
-
Dimensions: 384
-
Model size: ~80MB (downloaded on first use)
-
No API key required
-
No external dependencies
# No configuration needed - just run:
java -jar archiledger-server-1.0.0-SNAPSHOT.jar
Option 1: Custom HuggingFace ONNX Models
Use any ONNX-compatible embedding model from HuggingFace. The model is downloaded and cached locally.
| Model | Dimensions | Size | Quality |
|---|---|---|---|
| all-MiniLM-L6-v2 (default) | 384 | ~80MB | Good balance of speed and quality |
| bge-small-en-v1.5 | 384 | ~120MB | Better quality for English text |
| bge-base-en-v1.5 | 768 | ~400MB | High quality for English text |
| all-mpnet-base-v2 | 768 | ~400MB | Excellent quality, slower |
| multilingual-e5-small | 384 | ~450MB | Multilingual support |
| nomic-embed-text-v1 | 768 | ~270MB | Long context support |
Example: Using bge-small-en-v1.5
java \
-Dspring.ai.embedding.transformer.onnx.modelUri=https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model.onnx \
-Dspring.ai.embedding.transformer.tokenizer.uri=https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/tokenizer.json \
-Dladybugdb.embedding.dimensions=384 \
-jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
Or with environment variables:
export SPRING_AI_EMBEDDING_TRANSFORMER_ONNX_MODELURI=https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model.onnx
export SPRING_AI_EMBEDDING_TRANSFORMER_TOKENIZER_URI=https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/tokenizer.json
export LADYBUGDB_EMBEDDING_DIMENSIONS=384
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
Option 2: OpenAI-Compatible APIs
Use embedding models from OpenAI, ZhiPu AI, Mistral, or any OpenAI-compatible API provider.
Example: Using OpenAI text-embedding-3-small
export SPRING_AI_OPENAI_BASE_URL=https://api.openai.com
export SPRING_AI_OPENAI_API_KEY=sk-your-api-key
export SPRING_AI_OPENAI_EMBEDDING_OPTIONS_MODEL=text-embedding-3-small
export LADYBUGDB_EMBEDDING_DIMENSIONS=1536
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
Option 3: Ollama Local Models
Use local embedding models served by Ollama via its OpenAI-compatible endpoint.
- Install Ollama: ollama.ai
- Pull an embedding model: ollama pull nomic-embed-text
| Model | Dimensions | Notes |
|---|---|---|
| nomic-embed-text | 768 | Popular, good quality |
| mxbai-embed-large | 1024 | High quality |
| all-minilm | 384 | Fast, lightweight |
| snowflake-arctic-embed | 1024 | Excellent quality |
Example: Using Ollama with nomic-embed-text
# First, ensure Ollama is running and model is pulled
ollama pull nomic-embed-text
# Then start Archiledger with Ollama embedding
export SPRING_AI_OPENAI_BASE_URL=http://localhost:11434
export SPRING_AI_OPENAI_EMBEDDING_OPTIONS_MODEL=nomic-embed-text
export LADYBUGDB_EMBEDDING_DIMENSIONS=768
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
Docker Configuration
When running in Docker, pass environment variables with -e:
OpenAI via Docker:
docker run -p 8080:8080 \
-v ./ladybugdb-data:/data/ladybugdb \
-e SPRING_AI_OPENAI_BASE_URL=https://api.openai.com \
-e SPRING_AI_OPENAI_API_KEY=sk-your-api-key \
-e SPRING_AI_OPENAI_EMBEDDING_OPTIONS_MODEL=text-embedding-3-small \
-e LADYBUGDB_EMBEDDING_DIMENSIONS=1536 \
registry.hub.docker.com/thecookiezen/archiledger:latest
Environment Variables Reference
| Variable | Default | Description |
|---|---|---|
| SPRING_AI_EMBEDDING_TRANSFORMER_ONNX_MODELURI | (Spring AI default) | HuggingFace ONNX model URL |
| SPRING_AI_EMBEDDING_TRANSFORMER_TOKENIZER_URI | (Spring AI default) | HuggingFace tokenizer JSON URL |
| SPRING_AI_OPENAI_BASE_URL | - | OpenAI-compatible API base URL |
| SPRING_AI_OPENAI_API_KEY | - | API key for authentication |
| SPRING_AI_OPENAI_EMBEDDING_OPTIONS_MODEL | - | Embedding model name |
| LADYBUGDB_EMBEDDING_DIMENSIONS | 384 | Vector dimensions (must match model) |
Switching Models on Existing Data
| Changing embedding models on an existing database will cause search inconsistencies: the stored embeddings won’t match the new model’s embeddings. Recommended approach: |
4.3. CORS Configuration
Configure Cross-Origin Resource Sharing (CORS) for browser-based clients.
Configuration Properties
| Property | Default | Description |
|---|---|---|
|  |  | Enable CORS support |
|  |  | Set |
|  |  | Explicit list of permitted origins |
|  |  | Regex patterns for dynamic origin matching |
|  |  | Add |
|  |  | Preflight cache duration in seconds |
5. Integrations
5.1. MCP Client Configuration
Connect to Archiledger via the streamable HTTP endpoint: localhost:8080/mcp
Gemini CLI
Add to settings.json:
{
"mcpServers": {
"archiledger": {
"httpUrl": "http://localhost:8080/mcp"
}
}
}
6. Reference
6.1. MCP Tools Reference
Low-Level MCP Tools
Note Management
| Tool | Description | Parameters |
|---|---|---|
|  | Create one or more memory notes |  |
|  | Retrieve a specific note by ID |  |
|  | Find all notes with a given tag |  |
|  | Delete notes by their IDs |  |
Link Management
| Tool | Description | Parameters |
|---|---|---|
|  | Add typed links between notes |  |
|  | Remove typed links between notes |  |
Graph Exploration
| Tool | Description | Parameters |
|---|---|---|
|  | Read the entire knowledge graph | none |
|  | Find notes connected to a given note |  |
|  | List all unique tags | none |
|  | Semantic similarity search |  |
Agentic Memory MCP Tools
| Tool | Description | Parameters |
|---|---|---|
|  | Semantic similarity search |  |
|  | Expand from a note to find connected notes |  |
|  | Traverse upward in the graph |  |
| agentic_memory_write | Store content with automatic classification |  |