Memory API
Moxxy agents have a multi-tier memory system. Most memory operations happen automatically through primitives during agent runs. The API provides endpoints for searching memory and triggering compaction.
How Memory Works
Memory is managed primarily through built-in primitives that the agent invokes during runs:
| Primitive | Description |
|---|---|
| memory.store | Store a fact or observation in long-term memory |
| memory.recall | Retrieve relevant memories by semantic similarity |
| memory.stm_read | Read the current short-term memory context |
| memory.stm_write | Write to short-term memory |
These primitives are called by the agent itself as part of its reasoning process. You do not need to invoke them manually via the API.
Search Memory
GET /v1/agents/{id}/memory/search
Performs combined semantic and keyword search across the agent's memory.
Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| q | string | Yes | Search query |
| limit | integer | No | Max results to return |
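The parameters above map onto an ordinary URL query string. The following is a minimal Python sketch of how a client might assemble the search URL. The helper name `build_search_url` is hypothetical, and the base URL is an assumption taken from the curl example in this document, not a documented default.

```python
from urllib.parse import quote, urlencode

# Assumption: host and port copied from the curl example in this document.
BASE_URL = "http://127.0.0.1:3000"


def build_search_url(agent_id, query, limit=None, base_url=BASE_URL):
    """Return the memory search URL for an agent (hypothetical helper)."""
    params = {"q": query}  # q is required
    if limit is not None:
        params["limit"] = limit  # limit is optional, so only send it when set
    # Percent-encode the agent id so it is safe to place in the path.
    path = f"/v1/agents/{quote(agent_id, safe='')}/memory/search"
    # urlencode escapes reserved characters and encodes spaces as "+".
    return f"{base_url}{path}?{urlencode(params)}"


print(build_search_url("researcher", "transformer architectures", limit=5))
```

Building the query string with `urlencode` rather than string concatenation keeps queries containing spaces, `&`, or `=` from corrupting the request.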
Example
curl "http://127.0.0.1:3000/v1/agents/researcher/memory/search?q=transformer+architectures&limit=5"
Response (200)
[
  {
    "id": "mem_42",
    "content": "The original Transformer paper (Vaswani et al., 2017) introduced the self-attention mechanism.",
    "similarity": 0.92,
    "created_at": "2025-03-10T10:00:00Z"
  },
  {
    "id": "mem_38",
    "content": "GPT-4 uses a mixture-of-experts architecture based on transformers.",
    "similarity": 0.78,
    "created_at": "2025-03-08T15:30:00Z"
  }
]
Trigger Memory Compaction
POST /v1/agents/{id}/memory/compact
Triggers a compaction of the agent's memory. Compaction consolidates, deduplicates, and summarizes stored memories to keep the memory store efficient.
This is typically handled automatically, but you can trigger it manually if needed.
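For scripted use, a manual trigger could be wrapped as in the Python sketch below. It uses only the standard library. The helper names are hypothetical, and the base URL and the absence of authentication headers are assumptions drawn from the curl example in this document, not confirmed behavior.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: host and port copied from the curl example in this document.
BASE_URL = "http://127.0.0.1:3000"


def build_compact_request(agent_id, base_url=BASE_URL):
    """Build (but do not send) the POST request that triggers compaction."""
    url = f"{base_url}/v1/agents/{quote(agent_id, safe='')}/memory/compact"
    return urllib.request.Request(url, method="POST")


def trigger_compaction(agent_id, base_url=BASE_URL, timeout=10.0):
    """Send the request and return the decoded JSON body.

    Assumes the server is reachable and requires no authentication headers.
    """
    request = build_compact_request(agent_id, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request without touching the network.
request = build_compact_request("researcher")
print(request.get_method(), request.full_url)
```

Separating request construction from sending makes the wrapper easy to inspect or unit-test without a running server.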
Example
curl -X POST "http://127.0.0.1:3000/v1/agents/researcher/memory/compact"
Response (200)
{
  "agent_id": "researcher",
  "status": "compaction_started",
  "memories_before": 250
}
Memory Architecture
Moxxy uses a tiered memory architecture:
Short-Term Memory (STM): The current conversation context. Managed via the memory.stm_read and memory.stm_write primitives. STM is cleared when the session is reset via POST /v1/agents/{name}/reset.
Long-Term Memory (LTM): Persistent factual memory stored with vector embeddings for semantic retrieval. The agent stores facts via memory.store and retrieves them via memory.recall. Searchable through the /memory/search API endpoint.
Compaction: Over time, long-term memory accumulates redundant or overlapping entries. Compaction consolidates these into cleaner, more useful summaries. Triggered automatically or manually via /memory/compact.
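Because long-term memory is exposed through /memory/search, a client may want to post-process what it gets back. The Python sketch below parses a response shaped like the search example in this document and keeps only entries above a similarity cutoff. The helper names and the 0.8 threshold are hypothetical choices for illustration, not values defined by Moxxy.

```python
import json

# Sample body copied from the search example in this document.
SAMPLE_RESPONSE = """
[
  {"id": "mem_42",
   "content": "The original Transformer paper (Vaswani et al., 2017) introduced the self-attention mechanism.",
   "similarity": 0.92,
   "created_at": "2025-03-10T10:00:00Z"},
  {"id": "mem_38",
   "content": "GPT-4 uses a mixture-of-experts architecture based on transformers.",
   "similarity": 0.78,
   "created_at": "2025-03-08T15:30:00Z"}
]
"""


def similarity_of(entry):
    """Return the entry's similarity score, treating a missing field as 0.0."""
    return float(entry.get("similarity", 0.0))


def filter_by_similarity(results, min_similarity):
    """Keep entries scoring at least min_similarity, highest score first."""
    kept = [entry for entry in results if similarity_of(entry) >= min_similarity]
    return sorted(kept, key=similarity_of, reverse=True)


results = json.loads(SAMPLE_RESPONSE)
strong = filter_by_similarity(results, 0.8)  # 0.8 is an arbitrary cutoff
print([entry["id"] for entry in strong])  # prints ['mem_42']
```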