Archiledger combines the Greek arkhē (origin, first principle) with "ledger": a foundational record serving as the source of truth for AI memory.
It is a specialized knowledge graph that serves as a RAG (Retrieval-Augmented Generation) system with vector search. It is exposed as a Model Context Protocol (MCP) server so that LLM-based assistants can store, connect, and recall information in a graph database.
1. Overview
1.1. Why Archiledger?
LLMs are powerful, but they forget everything when a conversation ends:
- Repeating yourself - Telling your assistant the same preferences over and over
- Lost insights - Valuable analysis from one session isn’t available in the next
- No connected thinking - Information lives in silos without relationships
- Manual categorization - You must tag and organize everything yourself
Archiledger solves this with a graph-based memory powered by agentic RAG:
| Problem | Solution |
|---|---|
| Context resets every conversation | Persistent notes that survive restarts |
| Flat, disconnected notes | Typed links between atomic notes (Zettelkasten) |
| Manual categorization burden | AI-powered automatic classification |
| No temporal awareness | ISO-8601 timestamps on every note |
| Keyword search limits | Vector search finds semantically similar notes |
| Hard to explore large graphs | Graph traversal via linked-note and upward-traversal tools |
| Static knowledge graphs | Memory evolution through agent evaluation |
Why Agentic Memory?
Standard memory systems face several challenges when storing and retrieving information:
| Challenge | Example |
|---|---|
| Ambiguity | "How do I handle taxes?" - personal vs. business context unclear |
| Scattered evidence | Remote work policy for contractors spans HR docs and contractor agreements |
| Static categorization | Tags and categories become stale as knowledge evolves |
| No self-correction | System accepts similarity scores without evaluation |
Agentic memory addresses these through an agent-based control loop that analyzes, evaluates, and evolves the knowledge graph.
Agentic Memory: Automatic Classification
The key differentiator is the Agentic Memory module, which uses an AI agent to:
- Analyze content - Extract keywords, context, and tags automatically
- Find relationships - Search for semantically similar existing memories
- Evaluate evolution - Decide whether to link, update, or create new notes
- Self-correct - Assess classification quality and refine when needed
This eliminates the burden of manual categorization while maintaining a rich, connected knowledge graph.
How It Works
The AgenticMemoryAgent implements a multi-step pipeline for content classification:
Step 1: Content Analysis
The agent analyzes incoming content to extract structured metadata:
- Keywords: Salient nouns, verbs, and key concepts ordered by importance
- Context: A single sentence summarizing topic, arguments, and purpose
- Tags: Broad categories for classification (domain, format, type)
Step 2: Evolution Evaluation
The agent uses RAG to find similar existing memories and evaluates relationships:
- Searches for semantically similar notes in the knowledge graph
- Compares new content with neighbors to identify relationships
- Decides whether to create links, update tags, or modify existing notes
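The two-step pipeline above can be sketched as a plain control loop. This is an illustrative stub only: the `Analysis` and `Decision` records and the heuristic method bodies are invented for the example, while the real AgenticMemoryAgent delegates both steps to an LLM and to RAG retrieval.

```java
import java.util.List;

// Illustrative sketch of the analyze -> evaluate pipeline. The stub logic
// below only demonstrates the control flow and data shapes; the real agent
// performs both steps with LLM calls.
public class PipelineSketch {

    record Analysis(List<String> keywords, String context, List<String> tags) {}
    record Decision(boolean shouldEvolve, List<String> suggestedLinks) {}

    // Step 1: extract structured metadata (stubbed; really an LLM call).
    static Analysis analyze(String content) {
        List<String> keywords = List.of(content.toLowerCase().split("\\s+"));
        return new Analysis(keywords, "Summary of: " + content, List.of("uncategorized"));
    }

    // Step 2: compare against similar notes and decide how to evolve the
    // graph (stubbed; really RAG retrieval plus an LLM evaluation).
    static Decision evaluate(Analysis analysis, List<String> similarNoteIds) {
        return new Decision(!similarNoteIds.isEmpty(), similarNoteIds);
    }

    public static void main(String[] args) {
        Analysis a = analyze("Use event-driven architecture");
        Decision d = evaluate(a, List.of("notification-service"));
        System.out.println(a.keywords() + " evolve=" + d.shouldEvolve());
    }
}
```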
Trade-offs
Agentic memory introduces trade-offs compared to manual note management:
Use agentic memory when:
- You want automatic content classification
- Your knowledge graph needs to evolve over time
- Manual tagging doesn’t scale
Use the core module when:
- You need deterministic, predictable behavior
- Latency is critical
- You want full control over classification
Quick Decision Guide
| Requirement | Recommended Approach |
|---|---|
| Pure Java, no LLM | Core Module (Maven) |
| LLM with full manual control | MCP Server |
| AI classification in Java app | Agentic Memory (Embabel) |
| LLM with automatic memory management | Agentic Memory MCP |
| Full control over tags/links | Core Module or MCP Server |
| Automatic knowledge evolution | Agentic Memory (either) |
| Minimal latency, deterministic behavior | Core Module or MCP Server |
1.2. Architecture
| This server implements no authentication and uses an embedded graph database designed for local development only. Not recommended for production. |
Domain Layer
The domain layer contains the core business entities:
- MemoryNote: Represents a single atomic note with content, tags, keywords, and timestamps
- MemoryNoteId: Unique identifier for notes
- NoteLink: Typed relationship between notes
- LinkDefinition: Defines a link with context explaining the relationship
Application Layer
The application layer orchestrates domain logic:
- MemoryNoteService: Main service interface for note operations
- Handles retrieval count tracking and embedding generation
- Coordinates between domain and infrastructure layers
Infrastructure Layer
The infrastructure layer provides technical implementations:
- Persistence: LadybugMemoryNoteRepository - stores notes as nodes, links as relationships
- Vector Search: LadybugEmbeddingsService - uses LadybugDB’s native vector extension with HNSW indexing
- MCP: McpToolAdapter - exposes memory operations as MCP tools
Agentic Memory
The agentic-memory module provides AI-driven memory evolution:
- AgenticMemoryAgent: Analyzes notes and suggests new links based on semantic relationships
- Context-Aware Links: Automatically evaluates whether to add, update, or remove links
- MemoryNoteSearchOperations: Implements RAG interfaces for vector search and result expansion
1.3. Core Concepts
Zettelkasten Method
Archiledger implements the Zettelkasten method for knowledge management:
- Atomic Notes: Each note contains a single idea or piece of information
- Links: Notes are connected through typed relationships
- Tags: Notes can be categorized with multiple tags
- Keywords: Searchable terms for quick lookup
Memory Notes
A MemoryNote is the fundamental unit of storage:
| Property | Description |
|---|---|
|  | Unique identifier (MemoryNoteId) |
|  | The main text content of the note |
|  | Set of tags for categorization |
|  | Set of keywords for search |
|  | ISO-8601 timestamp when created |
|  | ISO-8601 timestamp when last modified |
|  | Number of times the note has been retrieved |
Note Links
Links connect notes with typed relationships:
| Link Type | Use Case |
|---|---|
|  | General relationship between notes |
|  | One note depends on another |
|  | One thing impacts another |
|  | Component/container relationship |
|  | Replaces previous decision/approach |
|  | Conflicts with another note |
|  | Default relationship type |
Each link includes a context field explaining why the relationship exists.
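The note-and-link model above can be illustrated with a small in-memory sketch. The `Note` and `Link` records and the `ZettelSketch` class below are hypothetical stand-ins invented for this example, not the actual MemoryNote/NoteLink domain classes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Minimal in-memory sketch of the Zettelkasten model: atomic notes, typed
// links, and a context string on every link explaining the relationship.
public class ZettelSketch {

    record Note(String id, String content, Set<String> tags, Set<String> keywords) {}
    record Link(String fromId, String toId, String type, String context) {}

    final List<Note> notes = new ArrayList<>();
    final List<Link> links = new ArrayList<>();

    void add(Note note) { notes.add(note); }

    void link(String from, String to, String type, String context) {
        links.add(new Link(from, to, type, context));
    }

    // Follow outgoing links of one relation type from a note.
    List<String> linkedIds(String fromId, String type) {
        return links.stream()
                .filter(l -> l.fromId().equals(fromId) && l.type().equals(type))
                .map(Link::toId)
                .toList();
    }

    public static void main(String[] args) {
        ZettelSketch graph = new ZettelSketch();
        graph.add(new Note("adr-1", "Use event-driven architecture",
                Set.of("architecture"), Set.of("event-driven")));
        graph.add(new Note("svc-1", "Notification service design",
                Set.of("design"), Set.of("notification")));
        graph.link("adr-1", "svc-1", "DEPENDS_ON",
                "The decision depends on the notification service design");
        System.out.println(graph.linkedIds("adr-1", "DEPENDS_ON"));
    }
}
```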
Vector Search
Archiledger uses semantic similarity search:
- Embeddings: Generated via Spring AI’s ONNX model (all-MiniLM-L6-v2)
- Storage: LadybugDB’s native vector extension
- Indexing: HNSW (Hierarchical Navigable Small World) for fast approximate nearest neighbor matching
- Temperature Scaling: Adjustable similarity scoring
- Threshold Filtering: Filter results by minimum similarity score
2. Getting Started
2.1. Installation
Building from Source
Clone the repository and build:
git clone https://github.com/thecookiezen/archiledger.git
cd archiledger
mvn clean package
This builds all modules:
- core/target/archiledger-core-*.jar - Core library
- mcp/target/archiledger-server-*.jar - Low-level MCP server
- agentic-memory/target/agentic-memory-*.jar - Agentic memory library
- agentic-memory-mcp/target/agentic-memory-mcp-*.jar - Agentic memory MCP server
2.2. Quick Start
This guide will get you started with Archiledger in 5 minutes.
Using the Core Module (Java API)
Create a MemoryNoteService and start creating notes:
import com.thecookiezen.archiledger.application.service.MemoryNoteService;
import com.thecookiezen.archiledger.application.service.MemoryNoteServiceImpl;
import com.thecookiezen.archiledger.domain.model.MemoryNote;
import com.thecookiezen.archiledger.domain.model.MemoryNoteId;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Create the service (requires repository configuration)
MemoryNoteService service = new MemoryNoteServiceImpl(repository);

// Create a note
MemoryNote note = new MemoryNote(
    new MemoryNoteId("my-first-note"),
    "This is my first memory note",
    Set.of("example", "getting-started"),
    Set.of("memory", "note")
);
MemoryNote saved = service.createNote(note);

// Retrieve the note
Optional<MemoryNote> retrieved = service.getNote(saved.getId());

// Search for similar notes
List<SimilarityResult<MemoryNote>> results = service.similaritySearch("memory note");
Using the MCP Server
Start the MCP server:
# Transient (in-memory)
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
# Persistent
java -Dladybugdb.data-path=./archiledger.lbdb \
-jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
The server runs on port 8080 with the MCP endpoint at localhost:8080/mcp.
By default, the server uses a local ONNX embedding model (all-MiniLM-L6-v2). See Embedding Model Configuration to use OpenAI, Ollama, or custom HuggingFace models.
2.3. Running with Docker
Low-Level MCP Server
Transient (Data lost when container stops):
docker run -p 8080:8080 registry.hub.docker.com/thecookiezen/archiledger:latest
Persistent (Data saved to host filesystem):
docker run -p 8080:8080 \
-v /path/to/local/data:/data \
registry.hub.docker.com/thecookiezen/archiledger:latest
Custom data path:
docker run -p 8080:8080 \
-e LADYBUGDB_DATA_PATH=/custom/data/archiledger.lbdb \
-v /path/to/local/data:/custom/data \
registry.hub.docker.com/thecookiezen/archiledger:latest
Agentic Memory MCP Server
Requires LLM configuration for AI-powered features:
docker run -p 8080:8080 \
-e OPENAI_CUSTOM_BASE_URL=https://api.example.com \
-e OPENAI_CUSTOM_MODELS=model-name \
-e OPENAI_CUSTOM_API_KEY=your_api_key \
registry.hub.docker.com/thecookiezen/archiledger-agentic-memory:latest
With persistent storage:
docker run -p 8080:8080 \
-v /path/to/local/data:/data \
-e OPENAI_CUSTOM_BASE_URL=https://api.example.com \
-e OPENAI_CUSTOM_MODELS=model-name \
-e OPENAI_CUSTOM_API_KEY=your_api_key \
registry.hub.docker.com/thecookiezen/archiledger-agentic-memory:latest
Environment Variables
| Variable | Default | Description |
|---|---|---|
| LADYBUGDB_DATA_PATH |  | File path where LadybugDB stores data |
|  |  | Directory for LadybugDB extension cache |
| OPENAI_CUSTOM_BASE_URL | - | Base URL for the OpenAI-compatible API |
| OPENAI_CUSTOM_MODELS | - | Model name to use |
| OPENAI_CUSTOM_API_KEY | - | API key for authentication |
| OPENAI_CUSTOM_COMPLETIONS_PATH |  | Custom completions endpoint path |
The /data volume must be writable by UID 1000 (spring user).
Docker Tips
- Persistent Data: Always mount a volume (-v) to preserve your knowledge graph
- Container Lifecycle: Run with -d (detached mode)
- Port Conflicts: Map to a different port (e.g., -p 9090:8080) and update the URL
- Named Containers: Use --name archiledger for easy management
- Debug Logs: docker logs archiledger
3. Modules
3.1. Agentic Memory Module
The agentic memory module provides AI-powered memory management with automatic content classification, tagging, and knowledge graph evolution.
Maven Dependency
<dependency>
<groupId>com.thecookiezen</groupId>
<artifactId>agentic-memory</artifactId>
<version>0.0.6</version>
</dependency>
Dependencies
The agentic memory module is built on the Embabel framework.
Running the Agentic Memory MCP Server
# Transient
java -jar agentic-memory-mcp/target/agentic-memory-mcp-0.0.6.jar
# Persistent
java -Dladybugdb.data-path=./archiledger.lbdb \
-jar agentic-memory-mcp/target/agentic-memory-mcp-0.0.6.jar
| Requires LLM configuration for AI-powered features. |
LLM Configuration
Set the following environment variables:
export OPENAI_CUSTOM_BASE_URL=https://api.example.com
export OPENAI_CUSTOM_MODELS=model-name
export OPENAI_CUSTOM_API_KEY=your_api_key
export OPENAI_CUSTOM_COMPLETIONS_PATH=/v4/chat/completion
Debug Configuration
Enable debug mode to see the agent’s internal prompts and LLM responses when writing memories:
agentic-memory.debug=true
Or using environment variable:
export AGENTIC_MEMORY_DEBUG=true
java -jar agentic-memory-mcp/target/agentic-memory-mcp-0.0.6.jar
When enabled, the agentic_memory_write tool will log the prompts sent to the LLM and the raw responses received. This is useful for troubleshooting agent behaviour or understanding how content is being classified.
| Property | Default | Description |
|---|---|---|
| agentic-memory.debug | false | Enable verbose logging of LLM prompts and responses during memory writes |
Agentic Memory MCP Tools
| Tool | Description |
|---|---|
|  | Perform semantic similarity search across memory notes |
|  | Given a note ID, expand to find connected/linked notes |
|  | Traverse upward in the knowledge graph to find parent/related notes |
| agentic_memory_write | Store content with automatic AI classification, tagging, and link generation |
The agent automatically:
- Analyzes content for keywords, context, and tags
- Searches for similar existing memories
- Evaluates potential relationships
- Creates typed links with explanatory context
- Stores the classified note
Using in Java Applications
import com.thecookiezen.archiledger.agenticmemory.AgenticMemoryAgent;
import com.thecookiezen.archiledger.agenticmemory.domain.UpsertMemoryRequest;

@Autowired
private AgenticMemoryAgent agent;

public void storeMemory(String content) {
    UpsertMemoryRequest request = new UpsertMemoryRequest(content);
    var result = agent.storeMemory(request);
    System.out.println("Created note: " + result.id());
    System.out.println("Tags: " + result.tags());
    System.out.println("Links: " + result.links().size());
}
Domain Models
NoteAnalysis
Result of content analysis:
public record NoteAnalysis(
List<String> keywords, // Salient concepts ordered by importance
String context, // Summary: topic, arguments, purpose
List<String> tags // Classification categories
) {}
EvolutionDecision
Result of evolution evaluation:
public record EvolutionDecision(
boolean shouldEvolve, // Whether to modify the graph
List<SuggestedLink> suggestedLinks, // Links to create
List<String> updatedTags, // Updated tags for new note
List<NeighborUpdate> neighborUpdates // Updates to existing notes
) {}
Relation Types
The agent uses these relationship types when creating links:
| Type | Use When |
|---|---|
|  | General semantic relationship |
|  | New note builds upon existing knowledge |
|  | New note conflicts with existing knowledge |
|  | New note is an example of an existing concept |
|  | New note replaces outdated information |
3.2. Core Module
The core module provides direct, programmatic access to memory operations without requiring an LLM.
Maven Dependency
<dependency>
<groupId>com.thecookiezen</groupId>
<artifactId>archiledger-core</artifactId>
<version>0.0.6</version>
</dependency>
Creating Notes
MemoryNote note = new MemoryNote(
new MemoryNoteId("architecture-decision"),
"Use event-driven architecture for the notification system",
Set.of("architecture", "decision"),
Set.of("event-driven", "notification", "system")
);
MemoryNote saved = service.createNote(note);
Creating Links
LinkDefinition link = new LinkDefinition(
new MemoryNoteId("architecture-decision"),
new MemoryNoteId("notification-service"),
"DEPENDS_ON",
"The architecture decision depends on the notification service design"
);
service.addLink(link);
Similarity Search
// Basic search
List<SimilarityResult<MemoryNote>> results = service.similaritySearch("notification system");
// Advanced search with parameters
List<SimilarityResult<MemoryNote>> results = service.similaritySearch(
"notification system", // query
10, // topK
0.5, // threshold
0.7 // temperature
);
Graph Traversal
// Get all linked notes
List<MemoryNote> linked = service.getLinkedNotes(new MemoryNoteId("architecture-decision"));
// Get linked notes with specific relation type
List<MemoryNote> dependencies = service.getLinkedNotes(
new MemoryNoteId("architecture-decision"),
"DEPENDS_ON",
10
);
// Traverse upward in the graph
List<MemoryNote> parents = service.getNotesUpward(
new MemoryNoteId("notification-service"),
3, // maxHops
20 // limit
);
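To make the maxHops and limit parameters concrete, here is a breadth-first sketch of what an upward traversal does. The `upward` method and the `parents` adjacency map are invented for illustration; the real getNotesUpward runs as a graph query inside LadybugDB.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Illustrative BFS showing the semantics of maxHops (how far to climb)
// and limit (how many notes to return at most).
public class TraversalSketch {

    // parents: note id -> ids of notes that link to it ("upward" edges)
    static List<String> upward(Map<String, List<String>> parents,
                               String startId, int maxHops, int limit) {
        List<String> result = new ArrayList<>();
        Set<String> seen = new HashSet<>(Set.of(startId));
        Queue<String> frontier = new ArrayDeque<>(List.of(startId));
        for (int hop = 0; hop < maxHops && !frontier.isEmpty(); hop++) {
            Queue<String> next = new ArrayDeque<>();
            for (String id : frontier) {
                for (String parent : parents.getOrDefault(id, List.of())) {
                    if (seen.add(parent)) {
                        result.add(parent);
                        if (result.size() >= limit) return result;
                        next.add(parent);
                    }
                }
            }
            frontier = next;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> parents = Map.of(
                "notification-service", List.of("architecture-decision"),
                "architecture-decision", List.of("system-overview"));
        System.out.println(upward(parents, "notification-service", 3, 20));
    }
}
```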
3.3. MCP Module
The MCP module exposes all core operations as MCP tools for LLM-based assistants.
Running the Server
# Transient (in-memory)
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
# Persistent
java -Dladybugdb.data-path=./archiledger.lbdb \
-jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
The server uses streamable HTTP transport on port 8080.
Tool Categories
Note Management
| Tool | Description |
|---|---|
|  | Create one or more memory notes with content, keywords, tags, and optional links |
|  | Retrieve a specific note by ID (increments retrieval counter) |
|  | Find all notes with a given tag |
|  | Delete notes by their IDs, including associated links and embeddings |
Link Management
| Tool | Description |
|---|---|
|  | Add typed links between notes with context |
|  | Remove typed links between notes |
Graph Exploration
| Tool | Description |
|---|---|
|  | Read the entire knowledge graph (all notes and links) |
|  | Find all notes directly connected to a given note |
|  | List all unique tags currently used across notes |
|  | Semantic similarity search with temperature scaling and threshold filtering |
4. Configuration
4.1. Server Properties
Configure the MCP server through Spring Boot properties:
# MCP Server Configuration
spring.ai.mcp.server.name=archiledger-server
spring.ai.mcp.server.version=1.0.0
spring.ai.mcp.server.protocol=STREAMABLE
server.port=8080
Vector Storage
| Property | Default | Description |
|---|---|---|
|  |  | LadybugDB extension cache directory |
Embeddings are stored using LadybugDB’s native vector extension with HNSW indexing for fast approximate nearest neighbor matching.
See Embedding Model Configuration for customizing the embedding model.
4.2. Embedding Model Configuration
Archiledger uses vector embeddings for semantic similarity search. By default, it uses a local ONNX model that requires no external API calls.
| When changing embedding models, set ladybugdb.embedding.dimensions (or the LADYBUGDB_EMBEDDING_DIMENSIONS environment variable) to match the new model’s output dimensions. |
Default: Local ONNX Model (No Configuration Required)
The default embedding model is all-MiniLM-L6-v2 from HuggingFace, running locally via ONNX runtime:
-
Dimensions: 384
-
Model size: ~80MB (downloaded on first use)
-
No API key required
-
No external dependencies
# No configuration needed - just run:
java -jar archiledger-server-1.0.0-SNAPSHOT.jar
Option 1: Custom HuggingFace ONNX Models
Use any ONNX-compatible embedding model from HuggingFace. The model is downloaded and cached locally.
| Model | Dimensions | Size | Quality |
|---|---|---|---|
| all-MiniLM-L6-v2 (default) | 384 | ~80MB | Good balance of speed and quality |
| bge-small-en-v1.5 | 384 | ~120MB | Better quality for English text |
| bge-base-en-v1.5 | 768 | ~400MB | High quality for English text |
| all-mpnet-base-v2 | 768 | ~400MB | Excellent quality, slower |
| multilingual-e5-small | 384 | ~450MB | Multilingual support |
| nomic-embed-text-v1 | 768 | ~270MB | Long context support |
Example: Using bge-small-en-v1.5
java \
-Dspring.ai.embedding.transformer.onnx.modelUri=https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model.onnx \
-Dspring.ai.embedding.transformer.tokenizer.uri=https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/tokenizer.json \
-Dladybugdb.embedding.dimensions=384 \
-jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
Or with environment variables:
export SPRING_AI_EMBEDDING_TRANSFORMER_ONNX_MODELURI=https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/onnx/model.onnx
export SPRING_AI_EMBEDDING_TRANSFORMER_TOKENIZER_URI=https://huggingface.co/BAAI/bge-small-en-v1.5/resolve/main/tokenizer.json
export LADYBUGDB_EMBEDDING_DIMENSIONS=384
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
Option 2: OpenAI-Compatible APIs
Use embedding models from OpenAI, ZhiPu AI, Mistral, or any OpenAI-compatible API provider.
Example: Using OpenAI text-embedding-3-small
export SPRING_AI_OPENAI_BASE_URL=https://api.openai.com
export SPRING_AI_OPENAI_API_KEY=sk-your-api-key
export SPRING_AI_OPENAI_EMBEDDING_OPTIONS_MODEL=text-embedding-3-small
export LADYBUGDB_EMBEDDING_DIMENSIONS=1536
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
Option 3: Ollama Local Models
Use local embedding models served by Ollama via its OpenAI-compatible endpoint.
- Install Ollama: ollama.ai
- Pull an embedding model: ollama pull nomic-embed-text
| Model | Dimensions | Notes |
|---|---|---|
| nomic-embed-text | 768 | Popular, good quality |
| mxbai-embed-large | 1024 | High quality |
| all-minilm | 384 | Fast, lightweight |
| snowflake-arctic-embed | 1024 | Excellent quality |
Example: Using Ollama with nomic-embed-text
# First, ensure Ollama is running and model is pulled
ollama pull nomic-embed-text
# Then start Archiledger with Ollama embedding
export SPRING_AI_OPENAI_BASE_URL=http://localhost:11434
export SPRING_AI_OPENAI_EMBEDDING_OPTIONS_MODEL=nomic-embed-text
export LADYBUGDB_EMBEDDING_DIMENSIONS=768
java -jar mcp/target/archiledger-server-1.0.0-SNAPSHOT.jar
Docker Configuration
When running in Docker, pass environment variables with -e:
OpenAI via Docker:
docker run -p 8080:8080 \
-v ./ladybugdb-data:/data/ladybugdb \
-e SPRING_AI_OPENAI_BASE_URL=https://api.openai.com \
-e SPRING_AI_OPENAI_API_KEY=sk-your-api-key \
-e SPRING_AI_OPENAI_EMBEDDING_OPTIONS_MODEL=text-embedding-3-small \
-e LADYBUGDB_EMBEDDING_DIMENSIONS=1536 \
registry.hub.docker.com/thecookiezen/archiledger:latest
Environment Variables Reference
| Variable | Default | Description |
|---|---|---|
| SPRING_AI_EMBEDDING_TRANSFORMER_ONNX_MODELURI | (Spring AI default) | HuggingFace ONNX model URL |
| SPRING_AI_EMBEDDING_TRANSFORMER_TOKENIZER_URI | (Spring AI default) | HuggingFace tokenizer JSON URL |
| SPRING_AI_OPENAI_BASE_URL | - | OpenAI-compatible API base URL |
| SPRING_AI_OPENAI_API_KEY | - | API key for authentication |
| SPRING_AI_OPENAI_EMBEDDING_OPTIONS_MODEL | - | Embedding model name |
| LADYBUGDB_EMBEDDING_DIMENSIONS | 384 | Vector dimensions (must match model) |
Switching Models on Existing Data
| Changing embedding models on an existing database will cause search inconsistencies: the stored embeddings won’t match the new model’s embeddings. Recommended approach: |
4.3. CORS Configuration
Configure Cross-Origin Resource Sharing (CORS) for browser-based clients.
Configuration Properties
| Property | Default | Description |
|---|---|---|
|  |  | Enable CORS support |
|  |  | Set |
|  |  | Explicit list of permitted origins |
|  |  | Regex patterns for dynamic origin matching |
|  |  | Add |
|  |  | Preflight cache duration in seconds |
5. Integrations
5.1. MCP Client Configuration
Connect to Archiledger via the streamable HTTP endpoint: localhost:8080/mcp
Gemini CLI
Add to settings.json:
{
"mcpServers": {
"archiledger": {
"httpUrl": "http://localhost:8080/mcp"
}
}
}
6. Reference
6.1. MCP Tools Reference
Low-Level MCP Tools
Note Management
| Tool | Description | Parameters |
|---|---|---|
|  | Create one or more memory notes |  |
|  | Retrieve a specific note by ID |  |
|  | Find all notes with a given tag |  |
|  | Delete notes by their IDs |  |
Link Management
| Tool | Description | Parameters |
|---|---|---|
|  | Add typed links between notes |  |
|  | Remove typed links between notes |  |
Graph Exploration
| Tool | Description | Parameters |
|---|---|---|
|  | Read the entire knowledge graph | none |
|  | Find notes connected to a given note |  |
|  | List all unique tags | none |
|  | Semantic similarity search |  |
Agentic Memory MCP Tools
| Tool | Description | Parameters |
|---|---|---|
|  | Semantic similarity search |  |
|  | Expand from a note to find connected notes |  |
|  | Traverse upward in the graph |  |
| agentic_memory_write | Store content with automatic classification |  |