Verified by the sovseal team

recall_memory

How semantic recall works locally with sub-25ms p99 latency and zero network round-trips.

recall_memory performs a vector similarity search locally against the on-device LanceDB database. Because the vector database and the embedding model run within the agent's host process, semantic reads never require network requests.

Parameters

Parameter | Type | Required | Description
query | string | | The natural-language query to embed and search.
topK | number | | The maximum number of results to return. Defaults to 3.
minScore | number | | Minimum composite-score threshold. Results below it are filtered out (see Reinforcement-Aware Ranking below).
filters | object | | SQL-like conditions evaluated locally against decrypted record metadata (e.g., matching tags, specific lineages, or categories).

Cold-Start Elimination

The local embedding model (Transformers.js executing all-MiniLM-L6-v2) requires a one-time loading phase into the ONNX runtime. Without warming, the first search call would block on that load for ~1.2s. To prevent this:

  • Eager Warming: The MCP server and Node SDK trigger model loading immediately at startup (await memory.ready()) rather than waiting for the first query.
  • Background Loading: In conversational environments, the model loads in a non-blocking background thread while the client registers, guaranteeing that by the time the agent performs its first tool call, the embedding engine is warm.
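
The warming behavior described above can be sketched as a load promise that is started at construction time and shared by every caller. This is a minimal illustration under stated assumptions, not sovseal's implementation: WarmEmbedder and loadModel are hypothetical names, and the loader stands in for the real Transformers.js model load.

```typescript
// Hypothetical sketch: start the model load at startup and memoize the promise,
// so every caller awaits the same in-flight (or already finished) load.
type Embedder = (text: string) => Promise<number[]>;

class WarmEmbedder {
  private readonly loading: Promise<Embedder>;

  // `loadModel` is an assumed stand-in for the real model loader.
  constructor(loadModel: () => Promise<Embedder>) {
    // Kick off loading immediately instead of on the first query.
    this.loading = loadModel();
    // Prevent an unhandled-rejection warning if loading fails before anyone
    // awaits it; callers of ready() and embed() still receive the error.
    this.loading.catch(() => {});
  }

  // Resolves once the model is loaded (similar in spirit to memory.ready()).
  async ready(): Promise<void> {
    await this.loading;
  }

  // Safe to call before ready(): it waits on the same load promise.
  async embed(text: string): Promise<number[]> {
    const embedder = await this.loading;
    return embedder(text);
  }
}
```

Because the promise is created once and reused, calling embed() early simply waits for the in-progress load rather than triggering a second one.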

LRU Cache Tuning & Observability

To achieve sub-10ms query times, the SDK maintains a 256-entry LRU (Least Recently Used) cache that maps query strings to their computed 384-dimensional embedding vectors.

  • Repeated or structurally similar queries bypass the ONNX execution path completely, reducing retrieval latency to <1ms.
  • Observability: You can monitor cache performance by listening to the client's telemetry events:
    memory.on("cache", (event) => {
      console.log(`Cache ${event.type}: ${event.query}`); // "hit" or "miss"
    });
  • Configuration: Cache size can be configured in the client initialization options:
    const memory = new sovseal({
      cacheSize: 512, // increase for high-frequency agents
    });
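
To make the caching behavior concrete, here is a minimal LRU sketch keyed by query string that reports hits and misses through a callback. It is an illustration only: EmbeddingCache, its constructor arguments, and the synchronous embed function are assumptions, and the SDK's internal cache may differ.

```typescript
// Minimal LRU sketch. A Map iterates in insertion order, so its first key is
// always the least recently used entry.
type CacheEvent = { type: "hit" | "miss"; query: string };

class EmbeddingCache {
  private readonly entries = new Map<string, number[]>();
  private readonly capacity: number;
  private readonly embed: (query: string) => number[]; // assumed embedding function
  private readonly onEvent: (event: CacheEvent) => void;

  constructor(
    capacity: number,
    embed: (query: string) => number[],
    onEvent: (event: CacheEvent) => void = () => {},
  ) {
    this.capacity = capacity;
    this.embed = embed;
    this.onEvent = onEvent;
  }

  get(query: string): number[] {
    const cached = this.entries.get(query);
    if (cached !== undefined) {
      // Re-insert to mark the entry as most recently used.
      this.entries.delete(query);
      this.entries.set(query, cached);
      this.onEvent({ type: "hit", query });
      return cached;
    }
    this.onEvent({ type: "miss", query });
    const vector = this.embed(query); // the expensive path, skipped on a hit
    this.entries.set(query, vector);
    if (this.entries.size > this.capacity) {
      // Evict the least recently used entry (the oldest key in the Map).
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    return vector;
  }
}
```

In this sketch only an identical query string produces a hit, which is why a larger cache helps agents that cycle through many distinct queries.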

Filter Syntax

Unlike typical vector databases that require server-side filtering, sovseal decrypts metadata client-side before applying filters. This prevents the server from learning the structure of your queries.

// Search with a category filter and a tag constraint
const results = await memory.recall("user testing preferences", {
  topK: 5,
  filters: {
    AND: [
      { category: "development" },
      { tags: { contains: "testing" } }
    ]
  }
});
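
As a rough illustration of what client-side filtering means, the sketch below evaluates a filter object against already-decrypted metadata. It models only the operators that appear in the example above (AND, field equality, and contains); the Filter and Condition types and the matches function are hypothetical and not part of the sovseal API.

```typescript
// Hypothetical filter shapes, modeled only on the example above.
type Condition = string | number | boolean | { contains: string };

interface Filter {
  AND?: Filter[];
  [field: string]: Condition | Filter[] | undefined;
}

type Metadata = Record<string, unknown>;

// Returns true when the decrypted metadata satisfies every condition.
function matches(meta: Metadata, filter: Filter): boolean {
  return Object.entries(filter).every(([key, cond]) => {
    if (cond === undefined) return true;
    if (Array.isArray(cond)) {
      // Only AND is modeled here: every sub-filter must match.
      return key === "AND" && cond.every((sub) => matches(meta, sub));
    }
    const actual = meta[key];
    if (typeof cond === "object") {
      // { contains: x }: the field must be an array that includes x.
      return Array.isArray(actual) && actual.includes(cond.contains);
    }
    // Plain value: strict equality against the metadata field.
    return actual === cond;
  });
}

// Usage: filter locally held records without sending the filter anywhere.
const records: Metadata[] = [
  { category: "development", tags: ["testing", "ci"] },
  { category: "development", tags: ["docs"] },
  { category: "personal", tags: ["testing"] },
];
const kept = records.filter((r) =>
  matches(r, { AND: [{ category: "development" }, { tags: { contains: "testing" } }] }),
);
```

Only the first record satisfies both conditions, so kept holds a single entry.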

Reinforcement-Aware Ranking (0.3.5)

Raw vector distance is only the first pass. recall_memory over-fetches 8 × topK candidates by vector distance, then re-ranks them by a composite score before returning the top topK:

score = similarity × decay × reinforcement

Factor | Definition
similarity | max(0, 1 − distance / 2): cosine-equivalent of the L2 distance to the query vector.
decay | exp(−λ_type · days_since(last_reinforced)): exponential temporal decay. Half-lives are per memory type: episodic 14d, semantic 90d, procedural 180d. Override with SOVSEAL_DECAY_EPISODIC / SOVSEAL_DECAY_SEMANTIC / SOVSEAL_DECAY_PROCEDURAL.
reinforcement | 1 + ln(1 + reinforce_count): memories restated more often rank higher.

The practical consequence: a frequently-reinforced older fact can out-rank a fresher, higher-raw-similarity one-off. This is what makes recall behave like memory rather than a plain nearest-neighbor index. Results are returned as { id, text, score } in descending composite-score order.
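
The re-ranking step can be sketched as follows, using the factor definitions above. This is an illustration under assumptions rather than the SDK's code: the Candidate shape, compositeScore, and rerank are hypothetical names, and λ is derived from the half-life as ln(2) / half-life, which is the standard relation for exponential decay.

```typescript
type MemoryType = "episodic" | "semantic" | "procedural";

// Hypothetical candidate shape for one over-fetched vector-search result.
interface Candidate {
  id: string;
  text: string;
  distance: number; // distance reported by the vector search
  type: MemoryType;
  daysSinceReinforced: number;
  reinforceCount: number;
}

// Default half-lives in days, as listed in the factor definitions.
const HALF_LIFE_DAYS: Record<MemoryType, number> = {
  episodic: 14,
  semantic: 90,
  procedural: 180,
};

function compositeScore(c: Candidate): number {
  const similarity = Math.max(0, 1 - c.distance / 2);
  // lambda = ln(2) / half-life, so decay is 0.5 after exactly one half-life.
  const lambda = Math.LN2 / HALF_LIFE_DAYS[c.type];
  const decay = Math.exp(-lambda * c.daysSinceReinforced);
  const reinforcement = 1 + Math.log(1 + c.reinforceCount);
  return similarity * decay * reinforcement;
}

// `candidates` stands for the 8 × topK nearest neighbors from the first pass.
function rerank(candidates: Candidate[], topK: number, minScore = 0) {
  return candidates
    .map((c) => ({ id: c.id, text: c.text, score: compositeScore(c) }))
    .filter((r) => r.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// Usage: an older, often-reinforced fact against a fresh one-off.
const ranked = rerank(
  [
    { id: "fresh", text: "one-off note", distance: 0.2, type: "semantic", daysSinceReinforced: 0, reinforceCount: 0 },
    { id: "old", text: "restated fact", distance: 0.4, type: "semantic", daysSinceReinforced: 30, reinforceCount: 5 },
  ],
  2,
);
```

Here the "old" entry ranks first despite its larger distance, because its reinforcement factor outweighs the decay it has accumulated. Note that the reinforcement factor can push the composite score above 1.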

See Memory Model → Typing, reinforcement & provenance for how type and reinforce_count are set.


Integration Examples

import { sovseal } from "@sovseal/sdk";

const memory = new sovseal({ apiKey: process.env.SOVSEAL_API_KEY });
await memory.ready();

const hits = await memory.recall("prefers vitest over jest", {
  topK: 3,
  minScore: 0.85,
});

console.log(hits);
// Output: [{ payload: { framework: "vitest" }, score: 0.92, path: "user.preferences.testing" }]

// Arguments passed to the tool
{
  "name": "recall_memory",
  "arguments": {
    "query": "prefers vitest over jest",
    "topK": 3
  }
}

// Result
{
  "success": true,
  "data": [
    {
      "path": "user.preferences.testing",
      "payload": {
        "framework": "vitest"
      },
      "score": 0.92
    }
  ],
  "timestamp": 1716301928155
}

# In self-hosted deployments, you can query the REST endpoint for active snapshot status.
# Note: The payload is returned as ciphertext; the local SDK performs the decryption.

curl -X GET "https://your-endpoint.com/v2/agent-state?project_id=sov_proj_123&query_hash=5e883..." \
  -H "Authorization: Bearer my-self-hosted-token"
