Verified by the sovseal team

Architecture Overview

How sovseal keeps recall local, replication asynchronous, and the server permanently blind.

sovseal is built around a single architectural commitment: plaintext never leaves the device, and reads never touch the network. Every other guarantee follows from that.

The Shape of a sovseal Deployment

┌─────────────────────────── Agent Process ───────────────────────────┐
│                                                                     │
│   ┌──────────────┐   store    ┌────────────────────────────┐        │
│   │  Your agent  │ ─────────► │  sovseal SDK / MCP server  │        │
│   │  (LLM tool   │            │  ───────────────────────── │        │
│   │   calls)     │ ◄───────── │  • LanceDB (vectors)       │        │
│   └──────────────┘   recall   │  • Transformers.js (384-d) │        │
│                               │  • AES-256-GCM (per record)│        │
│                               └────────────┬───────────────┘        │
│                                            │ write-behind           │
│                                            │ (ciphertext only)      │
└────────────────────────────────────────────┼────────────────────────┘
                                             ▼
                              ┌────────────────────────────┐
                              │  Replication endpoint      │
                              │  (Platform or self-hosted) │
                              │  ───────────────────────── │
                              │  Sees: ciphertext + SHA-256│
                              │  path hashes. Nothing else.│
                              └────────────────────────────┘

Request Lifecycle

Because vector queries run locally and replication runs asynchronously in the background, writes and reads follow two distinct paths.

store_memory (Write Lifecycle)

  1. Payload Canonicalization: The SDK receives the payload object, orders keys alphabetically, and serializes it into a stable JSON byte array. This prevents serialization differences from changing the cryptographic hash.
  2. Local Embedding: The payload is sent to the local Transformers.js model, generating a 384-dimensional floating-point dense vector.
  3. Encryption Boundary: The SDK encrypts the serialized JSON bytes using AES-256-GCM with a 96-bit random nonce (the IV) generated on-device, yielding the ciphertext and a 128-bit authentication tag.
  4. Lineage Attachment: The SDK queries the local index for the current HEAD snapshot ID, then computes: snapshot_id = sha256(canonicalize(payload) ‖ parent_snapshot_id)
  5. Local Commit: The vector, path hash (sha256(path)), ciphertext, nonce, auth tag, parent pointer, and snapshot ID are committed to the on-device LanceDB index. The database runs fsync to ensure disk durability.
  6. Return: The API call resolves in <5ms, letting your agent continue without blocking on network round-trips.
  7. Write-Behind Replication: A background worker picks up the committed record, adds it to a batch, and sends the ciphertext and its metadata to the replication endpoint.
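
Steps 1, 3, and 4 of the write path can be sketched with Node's built-in crypto module. This is a minimal illustration, not the SDK's actual code: the names canonicalize, sealPayload, and SealedRecord are hypothetical, and encoding the parent snapshot ID as a UTF-8 string (empty for the first record) is an assumption the text above does not confirm.

```typescript
import { createCipheriv, createHash, randomBytes } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Step 1: sort object keys recursively so equal payloads yield identical bytes.
function canonicalize(value: Json): Buffer {
  const sortKeys = (v: Json): Json => {
    if (Array.isArray(v)) return v.map(sortKeys);
    if (v !== null && typeof v === "object") {
      const sorted: { [key: string]: Json } = {};
      for (const k of Object.keys(v).sort()) sorted[k] = sortKeys(v[k] as Json);
      return sorted;
    }
    return v;
  };
  return Buffer.from(JSON.stringify(sortKeys(value)), "utf8");
}

// Hypothetical record shape; field names are illustrative only.
interface SealedRecord {
  ciphertext: Buffer;
  nonce: Buffer; // 96-bit, random per record
  authTag: Buffer; // 128-bit GCM tag
  parentSnapshotId: string;
  snapshotId: string;
}

function sealPayload(payload: Json, key: Buffer, parentSnapshotId: string): SealedRecord {
  const bytes = canonicalize(payload);

  // Step 3: AES-256-GCM with a fresh 12-byte nonce; key must be 32 bytes.
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, nonce);
  const ciphertext = Buffer.concat([cipher.update(bytes), cipher.final()]);
  const authTag = cipher.getAuthTag();

  // Step 4: snapshot_id = sha256(canonicalize(payload) || parent_snapshot_id).
  // Assumption: the parent ID is hashed as its UTF-8 string form.
  const snapshotId = createHash("sha256")
    .update(bytes)
    .update(parentSnapshotId, "utf8")
    .digest("hex");

  return { ciphertext, nonce, authTag, parentSnapshotId, snapshotId };
}
```

Because the hash covers the canonical bytes rather than the ciphertext, two payloads that differ only in key order produce the same snapshot ID under the same parent, while their ciphertexts differ because each record gets its own nonce.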

recall_memory (Read Lifecycle)

  1. Query Embedding: The query string is embedded locally (using the cache if the query matches a recent search).
  2. Vector Lookup: The SDK runs a vector similarity search (L2 distance) across the local LanceDB indexes. This lookup needs no network round trips (0 RTT) and works fully offline.
  3. Decryption & Verification: The local database retrieves the matching record. The SDK decrypts the ciphertext with the local key; decryption fails if the AES-GCM authentication tag does not verify.
  4. VSR Anchor Validation: The SDK re-derives the snapshot ID by hashing the decrypted payload with the parent snapshot ID and matches it against the stored snapshot ID.
  5. Return: Plaintext is returned to the agent in ~6ms (p50).
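
Steps 3 and 4 of the read path can be sketched as follows, again with Node's built-in crypto module. The names StoredRecord, openRecord, and sealForDemo are hypothetical, and the sketch assumes the decrypted bytes are already in canonical form and that the parent snapshot ID is hashed as a UTF-8 string; neither detail is confirmed by the text above.

```typescript
import {
  createCipheriv,
  createDecipheriv,
  createHash,
  randomBytes,
  timingSafeEqual,
} from "node:crypto";

// Hypothetical record shape; field names are illustrative only.
interface StoredRecord {
  ciphertext: Buffer;
  nonce: Buffer;
  authTag: Buffer;
  parentSnapshotId: string;
  snapshotId: string; // hex-encoded SHA-256
}

function snapshotDigest(bytes: Buffer, parentSnapshotId: string): Buffer {
  return createHash("sha256").update(bytes).update(parentSnapshotId, "utf8").digest();
}

// Decrypts a record and checks its snapshot anchor; throws on any mismatch.
function openRecord(record: StoredRecord, key: Buffer): unknown {
  // Step 3: GCM decryption. final() throws if the authentication tag
  // does not match the ciphertext, nonce, and key.
  const decipher = createDecipheriv("aes-256-gcm", key, record.nonce);
  decipher.setAuthTag(record.authTag);
  const bytes = Buffer.concat([decipher.update(record.ciphertext), decipher.final()]);

  // Step 4: re-derive the snapshot ID and compare it with the stored one.
  const expected = snapshotDigest(bytes, record.parentSnapshotId);
  const stored = Buffer.from(record.snapshotId, "hex");
  if (stored.length !== expected.length || !timingSafeEqual(stored, expected)) {
    throw new Error("snapshot ID mismatch");
  }
  return JSON.parse(bytes.toString("utf8"));
}

// Demo-only helper that builds a record to read back.
function sealForDemo(payload: unknown, key: Buffer, parentSnapshotId: string): StoredRecord {
  const bytes = Buffer.from(JSON.stringify(payload), "utf8");
  const nonce = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, nonce);
  const ciphertext = Buffer.concat([cipher.update(bytes), cipher.final()]);
  return {
    ciphertext,
    nonce,
    authTag: cipher.getAuthTag(),
    parentSnapshotId,
    snapshotId: snapshotDigest(bytes, parentSnapshotId).toString("hex"),
  };
}
```

The two checks catch different failures: the GCM tag detects a modified ciphertext or a wrong key, while the snapshot comparison detects a record whose parent pointer or stored ID no longer matches its content.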

Encryption Boundary & Key Custody

Plaintext data and cryptographic keys stay on the device, inside the local trust boundary. Only ciphertext and SHA-256 path hashes cross the network.

┌────────────────────────────────────────────────────────┐
│                   LOCAL TRUST BOUNDARY                 │
│                                                        │
│  [Plaintext Data] ──► [AES-256-GCM] ◄── [Device Key]   │
│                                │                       │
└────────────────────────────────┼───────────────────────┘

                     =========================
                     Public Network Boundary
                     =========================


┌────────────────────────────────────────────────────────┐
│                 UNTRUSTED REMOTE SPACE                 │
│                                                        │
│                [ Opaque Ciphertext ]                   │
│                [  SHA-256 Path Hash ]                  │
│                                                        │
└────────────────────────────────────────────────────────┘

Key Custody

Your master key is held in the OS keychain (macOS Keychain / Windows Credential Manager / Linux libsecret), not a plaintext file.

  • Generated with a CSPRNG on first run; never written to disk in the clear.
  • Purpose-bound subkeys are derived on demand via HKDF-SHA256: k_rest (local at-rest encryption) and k_sync (replication).
  • ~/.sovseal/config.json (0600) holds identity + routing only (project_id, api_key, endpoint) — no key material.
  • Headless fallback: SOVSEAL_KEY_FALLBACK=file stores the master at ~/.sovseal/ (0600); without it, a missing keychain fails closed.
  • The server-blind project identifier is derived on-device: sha256(project_id ‖ ":" ‖ key).
  • WARNING: If you lose the keychain master key (or the fallback key file), your memories cannot be decrypted. There is no password recovery or escrow flow. See Key Management & Custody.
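
The subkey derivation and the blind project identifier can be sketched with Node's hkdfSync and createHash. This is an illustration under stated assumptions: the info labels ("sovseal/k_rest", "sovseal/k_sync"), the empty salt, and the 32-byte output length are hypothetical, and the text above does not say which key feeds the project identifier, so the sketch takes it as a parameter.

```typescript
import { createHash, hkdfSync } from "node:crypto";

// Derives a purpose-bound 32-byte subkey from the master key via HKDF-SHA256.
// Assumption: empty salt, with the purpose label passed as the HKDF info field.
function deriveSubkey(master: Buffer, purpose: string): Buffer {
  // hkdfSync returns an ArrayBuffer, so wrap it for convenience.
  return Buffer.from(hkdfSync("sha256", master, Buffer.alloc(0), purpose, 32));
}

// sha256(project_id || ":" || key), hex-encoded.
// Assumption: project_id is hashed as UTF-8 and the key as raw bytes.
function blindProjectId(projectId: string, key: Buffer): string {
  return createHash("sha256")
    .update(projectId, "utf8")
    .update(":", "utf8")
    .update(key)
    .digest("hex");
}
```

Distinct info labels give independent subkeys from one master secret, so exposing one subkey does not reveal the other or the master. The hashed identifier lets the server group records by project without learning the real project_id, provided the key stays on the device.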

Local Embedding Pipeline

sovseal uses on-device embedding generation to avoid sending plaintext to remote embedding APIs.

  • Engine: Transformers.js running on ONNX Runtime, CPU-bound.
  • Model: Xenova/all-MiniLM-L6-v2 (~30 MB download).
  • Directory: Cached on-device in ~/.sovseal/models/.
  • Warming: On first connect, the MCP server or SDK loads the ONNX model into the runtime, which adds a cold start of about 1.2 s.
  • Caching: Query embeddings are stored in a 256-entry LRU cache, so repeated agent queries skip model inference.
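
The caching behavior can be sketched as a small LRU wrapper around an embedding function. The names LruCache and cachedEmbedder are hypothetical, and the embedder is passed in as a synchronous function to keep the sketch short; real Transformers.js inference is asynchronous, so an actual implementation would cache awaited results or promises instead.

```typescript
// Minimal LRU cache. A Map iterates in insertion order, so the first key
// is always the least recently used entry.
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}

type Embedder = (text: string) => Float32Array;

// Wraps an embedder so repeated query strings skip inference.
// The default capacity of 256 matches the cache size described above.
function cachedEmbedder(embed: Embedder, capacity: number = 256): Embedder {
  const cache = new LruCache<Float32Array>(capacity);
  return (text: string): Float32Array => {
    const hit = cache.get(text);
    if (hit !== undefined) return hit;
    const vector = embed(text);
    cache.set(text, vector);
    return vector;
  };
}
```

Keying on the exact query string means only byte-identical queries hit the cache; whether the SDK normalizes queries first is not stated here.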

Write-Behind Replication & Partition Recovery

Replication runs from a non-blocking queue: store_memory returns as soon as the local commit succeeds, and a background worker ships records to the replication endpoint afterward.

  • Batching: Sync jobs are batched (up to 32 records or every 250ms) to reduce network overhead.
  • Offline Buffering: If your machine goes offline or the replication server returns a 5xx error, the replication worker retries indefinitely with exponential backoff. Records remain in the local LanceDB outbox in the meantime.
  • Backpressure Gate: If the network is down for a prolonged period and the local outbox exceeds maxOutboxBytes (default: 256 MB), subsequent store_memory calls will block until the network reconnects and the queue drains.
  • Crash Safety: In-flight replication jobs are tracked in the database outbox table. If the agent process crashes, untransmitted sync events are reloaded on restart and replicated once the endpoint is reachable.
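
The batching, backoff, and backpressure rules above can be sketched as an in-memory outbox. This is an illustration only: the names Outbox, tryEnqueue, nextBatch, acknowledge, and backoffDelayMs are hypothetical, the 250 ms base and 60 s cap for backoff are assumed values that the text does not specify, and the real outbox lives in a LanceDB table rather than in an array.

```typescript
interface OutboxRecord {
  id: string;
  ciphertext: Uint8Array;
}

class Outbox {
  private readonly queue: OutboxRecord[] = [];
  private bytes = 0;
  private readonly maxOutboxBytes: number;

  // Default mirrors the documented 256 MB limit.
  constructor(maxOutboxBytes: number = 256 * 1024 * 1024) {
    this.maxOutboxBytes = maxOutboxBytes;
  }

  // Backpressure gate: once the outbox exceeds the limit, refuse new
  // records. A caller would wait and retry after the queue drains.
  tryEnqueue(record: OutboxRecord): boolean {
    if (this.bytes > this.maxOutboxBytes) return false;
    this.queue.push(record);
    this.bytes += record.ciphertext.byteLength;
    return true;
  }

  // Returns up to maxRecords without removing them, so a crash or a
  // failed send leaves them queued for the next attempt.
  nextBatch(maxRecords: number = 32): OutboxRecord[] {
    return this.queue.slice(0, maxRecords);
  }

  // Drops the first count records after the server confirms the batch.
  acknowledge(count: number): void {
    for (const sent of this.queue.splice(0, count)) {
      this.bytes -= sent.ciphertext.byteLength;
    }
  }

  get pendingBytes(): number {
    return this.bytes;
  }
}

// Exponential backoff with a ceiling. Base and cap are assumed values.
function backoffDelayMs(attempt: number, baseMs: number = 250, capMs: number = 60000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}
```

Removing records only on acknowledgment is what makes retries safe: a batch that fails with a 5xx error or a dropped connection is simply read again by the next call to nextBatch.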
