Verified by the sovseal team
Transformers.js Local Embedding Subsystem
Details of the default CPU-bound ONNX model used to generate fact embeddings on-device.
Transformers.js serves as the default local execution runtime for generating semantic vector embeddings. It runs the quantized version of the all-MiniLM-L6-v2 model directly on the client CPU.
Model Pinned Integrity Hashes
On startup, sovseal verifies the model files against pinned SHA-256 hashes so that local tampering or file corruption is detected before the model is loaded:
| File Name | Pinned SHA-256 |
|---|---|
| config.json | 7135149f7cffa1a573466c6e4d8423ed73b62fd2332c575bf738a0d033f70df7 |
| tokenizer.json | da0e79933b9ed51798a3ae27893d3c5fa4a201126cef75586296df9b4d2c62a0 |
| tokenizer_config.json | 9261e7d79b44c8195c1cada2b453e55b00aeb81e907a6664974b4d7776172ab3 |
| onnx/model_quantized.onnx | afdb6f1a0e45b715d0bb9b11772f032c399babd23bfc31fed1c170afc848bdb1 |
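The startup check described above can be sketched as follows. This is a minimal illustration using only Node.js built-ins, not sovseal's actual implementation: the names `PINNED_SHA256`, `sha256Hex`, and `findMismatches` are hypothetical, and the model directory layout is assumed to mirror the file names in the table.

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";
import { join } from "node:path";

// Pinned digests copied from the table above (relative path -> SHA-256 hex).
// The constant name is hypothetical.
const PINNED_SHA256: Record<string, string> = {
  "config.json": "7135149f7cffa1a573466c6e4d8423ed73b62fd2332c575bf738a0d033f70df7",
  "tokenizer.json": "da0e79933b9ed51798a3ae27893d3c5fa4a201126cef75586296df9b4d2c62a0",
  "tokenizer_config.json": "9261e7d79b44c8195c1cada2b453e55b00aeb81e907a6664974b4d7776172ab3",
  "onnx/model_quantized.onnx": "afdb6f1a0e45b715d0bb9b11772f032c399babd23bfc31fed1c170afc848bdb1",
};

// Hex-encoded SHA-256 digest of a byte buffer.
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Returns the relative paths whose digest does not match its pin.
// Missing or unreadable files are reported as mismatches too.
function findMismatches(
  modelDir: string,
  pins: Record<string, string> = PINNED_SHA256,
): string[] {
  const mismatched: string[] = [];
  for (const [relPath, expected] of Object.entries(pins)) {
    try {
      const actual = sha256Hex(readFileSync(join(modelDir, relPath)));
      if (actual !== expected) mismatched.push(relPath);
    } catch {
      mismatched.push(relPath);
    }
  }
  return mismatched;
}
```

A caller would typically refuse to load the model when `findMismatches(modelDir)` returns a non-empty list. Treating unreadable files as mismatches keeps the check fail-closed.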
Performance & LRU Caching
Because each embedding requires an ONNX forward pass on the CPU (roughly 4–10 ms per call on modern hardware, per the measurements under CPU Resource Costs), sovseal keeps an LRU cache of query embeddings so that repeated queries skip model execution entirely.
Cache Characteristics
- Scope: Applied only to query/search operations (`recall_memory`). Storage operations (`store_memory`) always bypass the cache to ensure unique fact vectors are written to LanceDB.
- Default Capacity: 256 queries.
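- Memory Overhead: Low. Each cached entry holds one 384-dimensional Float32 vector (384 × 4 bytes = 1,536 bytes), so a full cache at the default capacity of 256 holds about 384 KiB of vector data, plus the cached query strings.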
- Configuration: You can adjust the cache capacity by setting the `SOVSEAL_EMBEDDING_CACHE_SIZE` environment variable (set to `0` to disable the cache entirely).
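The cache behaviour listed above can be sketched with a `Map`, which iterates in insertion order. This is an illustrative sketch, not sovseal's actual code: the class and function names are hypothetical, keying on the raw query string is an assumption, and the fallback to the default for unparsable values is an assumed policy.

```typescript
// Minimal LRU cache keyed by query text (assumed key; sovseal may normalize it).
// A Map iterates in insertion order, so its first key is the least recently used.
class EmbeddingLruCache {
  private readonly entries = new Map<string, Float32Array>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get size(): number {
    return this.entries.size;
  }

  get(query: string): Float32Array | undefined {
    const hit = this.entries.get(query);
    if (hit === undefined) return undefined;
    // Re-insert so this entry becomes the most recently used.
    this.entries.delete(query);
    this.entries.set(query, hit);
    return hit;
  }

  set(query: string, vector: Float32Array): void {
    if (this.capacity <= 0) return; // a capacity of 0 disables caching
    this.entries.delete(query);
    this.entries.set(query, vector);
    if (this.entries.size > this.capacity) {
      // Evict the least recently used entry (first key in iteration order).
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }
}

// Parses the raw SOVSEAL_EMBEDDING_CACHE_SIZE value. Falling back to 256 for
// unset or invalid input is an assumption of this sketch.
function cacheCapacityFromEnv(raw: string | undefined, fallback = 256): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number.parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
}

// Recall path: consult the cache first and run the model only on a miss.
// `embed` stands in for the real Transformers.js forward pass.
async function embedQueryCached(
  query: string,
  cache: EmbeddingLruCache,
  embed: (text: string) => Promise<Float32Array>,
): Promise<Float32Array> {
  const cached = cache.get(query);
  if (cached !== undefined) return cached;
  const vector = await embed(query);
  cache.set(query, vector);
  return vector;
}
```

In a Node.js host the capacity would be read as `cacheCapacityFromEnv(process.env.SOVSEAL_EMBEDDING_CACHE_SIZE)`. The store path would call the embedding function directly and never touch the cache, matching the scope rule above.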
CPU Resource Costs
For typical developer machines and agent host environments:
| Operation Type | Payload Size (Chars) | CPU Execution Time | Memory Footprint |
|---|---|---|---|
| Warmup / JIT Load | - | ~1.2 s (cold first call) | ~30 MB RAM |
| Cache Hit Recall | Any | < 0.1 ms (no forward pass) | - |
| Cache Miss Recall | 100–500 | ~4.2 ms | - |
| Large Store Vector | 5,000 | ~9.5 ms | - |