Memory API

Moxxy agents have a multi-tier memory system. Most memory operations happen automatically through primitives during agent runs. The API provides endpoints for searching memory and triggering compaction.

How Memory Works

Memory is managed primarily through built-in primitives that the agent invokes during runs:

| Primitive | Description |
| --- | --- |
| memory.store | Store a fact or observation in long-term memory |
| memory.recall | Retrieve relevant memories by semantic similarity |
| memory.stm_read | Read the current short-term memory context |
| memory.stm_write | Write to short-term memory |

These primitives are called by the agent itself as part of its reasoning process. You do not need to invoke them manually via the API.

Search Memory

GET /v1/agents/{id}/memory/search

Performs a combined semantic and keyword search across the agent's memory.

Query Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| q | string | Yes | Search query |
| limit | integer | No | Maximum number of results to return |

Example

bash
curl "http://127.0.0.1:3000/v1/agents/researcher/memory/search?q=transformer+architectures&limit=5"
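The same request can be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names `build_search_url` and `search_memory` are hypothetical, and the base URL is taken from the curl example above. It assumes the server needs no authentication headers, which this page does not specify.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def build_search_url(base_url, agent_id, query, limit=None):
    """Build the memory search URL for an agent (hypothetical helper)."""
    params = {"q": query}
    if limit is not None:
        params["limit"] = limit
    # Escape the agent id so it is safe to place in the URL path.
    path = f"/v1/agents/{quote(agent_id, safe='')}/memory/search"
    # urlencode escapes the query; spaces become "+", as in the curl example.
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


def search_memory(base_url, agent_id, query, limit=None, timeout=10.0):
    """Send the GET request and return the decoded JSON body."""
    url = build_search_url(base_url, agent_id, query, limit)
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `search_memory("http://127.0.0.1:3000", "researcher", "transformer architectures", limit=5)` sends the same request as the curl command. Building the query string with `urlencode` avoids hand-escaping spaces and reserved characters in `q`.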

Response (200)

json
[
  {
    "id": "mem_42",
    "content": "The original Transformer paper (Vaswani et al., 2017) introduced the self-attention mechanism.",
    "similarity": 0.92,
    "created_at": "2025-03-10T10:00:00Z"
  },
  {
    "id": "mem_38",
    "content": "GPT-4 uses a mixture-of-experts architecture based on transformers.",
    "similarity": 0.78,
    "created_at": "2025-03-08T15:30:00Z"
  }
]
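A client will often want typed results rather than raw dictionaries. The sketch below parses a response shaped like the example above and keeps only hits above a similarity threshold. The `MemoryHit` class, the helper names, and the client-side threshold are illustrative assumptions, not part of the Moxxy API; the field names come from the sample response.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MemoryHit:
    """One search result, mirroring the fields in the sample response."""
    id: str
    content: str
    similarity: float
    created_at: datetime


def parse_hits(payload):
    """Turn the JSON array returned by the search endpoint into MemoryHit objects."""
    hits = []
    for item in json.loads(payload):
        # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
        created = datetime.fromisoformat(item["created_at"].replace("Z", "+00:00"))
        hits.append(
            MemoryHit(item["id"], item["content"], float(item["similarity"]), created)
        )
    return hits


def filter_hits(hits, min_similarity):
    """Keep hits at or above the threshold, most similar first (client-side only)."""
    kept = (h for h in hits if h.similarity >= min_similarity)
    return sorted(kept, key=lambda h: h.similarity, reverse=True)
```

With the sample response above, a threshold of 0.8 would keep `mem_42` and drop `mem_38`. A suitable threshold depends on the embedding model in use, so treat any fixed value as a starting point.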

Trigger Memory Compaction

POST /v1/agents/{id}/memory/compact

Triggers a compaction of the agent's memory. Compaction consolidates, deduplicates, and summarizes stored memories to keep the memory store efficient.

This is typically handled automatically, but you can trigger it manually if needed.

Example

bash
curl -X POST "http://127.0.0.1:3000/v1/agents/researcher/memory/compact"

Response (200)

json
{
  "agent_id": "researcher",
  "status": "compaction_started",
  "memories_before": 250
}
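Compaction can also be triggered from a script, for example from a scheduled maintenance job. This is a minimal sketch using the Python standard library; `build_compact_request` and `trigger_compaction` are hypothetical helper names, and the sketch assumes, like the curl example, that the endpoint takes no request body and no authentication headers.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_compact_request(base_url, agent_id):
    """Build a body-less POST request for the compaction endpoint."""
    url = f"{base_url.rstrip('/')}/v1/agents/{quote(agent_id, safe='')}/memory/compact"
    return Request(url, method="POST")


def trigger_compaction(base_url, agent_id, timeout=10.0):
    """Send the request and return the decoded JSON body."""
    with urlopen(build_compact_request(base_url, agent_id), timeout=timeout) as resp:
        return json.load(resp)
```

`trigger_compaction("http://127.0.0.1:3000", "researcher")` mirrors the curl command. Judging by the sample response, the `status` value `compaction_started` suggests the call returns before compaction finishes, so a caller should not assume the memory store is already compacted when the function returns.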

Memory Architecture

Moxxy uses a tiered memory architecture:

Short-Term Memory (STM): The current conversation context, managed via the memory.stm_read and memory.stm_write primitives. It is cleared when the session is reset via POST /v1/agents/{name}/reset.

Long-Term Memory (LTM): Persistent factual memory stored with vector embeddings for semantic retrieval. The agent stores facts via memory.store and retrieves them via memory.recall. You can also search it through the GET /v1/agents/{id}/memory/search endpoint.

Compaction: Over time, long-term memory accumulates redundant or overlapping entries. Compaction consolidates these into fewer, summarized entries. It runs automatically, or you can trigger it manually via POST /v1/agents/{id}/memory/compact.
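To make the relationship between the tiers concrete, here is a toy in-memory model. It is purely illustrative and is not how Moxxy implements memory: the class `ToyAgentMemory` is hypothetical, the real long-term store uses vector embeddings, and the real compaction also summarizes entries, whereas this sketch only drops duplicates that match after case and whitespace normalization. It also assumes that a session reset leaves long-term memory untouched, as the word "persistent" above suggests.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgentMemory:
    """Illustrative stand-in for the tiers described above; not Moxxy's code."""
    stm: list = field(default_factory=list)  # short-term: current session context
    ltm: list = field(default_factory=list)  # long-term: persistent facts

    def stm_write(self, text):
        self.stm.append(text)

    def store(self, fact):
        self.ltm.append(fact)

    def reset_session(self):
        # Assumption: a reset clears short-term memory only.
        self.stm.clear()

    def compact(self):
        """Drop later entries that duplicate an earlier one; return the count removed."""
        seen = set()
        kept = []
        for fact in self.ltm:
            key = " ".join(fact.lower().split())  # normalize case and whitespace
            if key not in seen:
                seen.add(key)
                kept.append(fact)
        removed = len(self.ltm) - len(kept)
        self.ltm = kept
        return removed
```

In this model, storing the same fact twice with different capitalization and then calling `compact()` leaves a single entry, while `reset_session()` empties `stm` and leaves `ltm` as it was.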
