🧪 Skills

MidOS Memory Cascade

Auto-escalating multi-tier memory search that cascades from in-memory cache through SQLite, grep, and LanceDB vector search to find the best answer with mini...

v1.0.0
❤️ 0
⬇️ 145
👁 1
Share

Description


name: midos-memory-cascade description: Auto-escalating multi-tier memory search that cascades from in-memory cache through SQLite, grep, and LanceDB vector search to find the best answer with minimal latency. metadata: {}

MidOS Memory Cascade

A self-tuning, auto-escalating search engine that tries each memory tier from fastest to slowest, stopping as soon as it finds a high-confidence answer.

What It Does

Instead of the agent deciding which storage layer to query, the cascade tries each tier automatically:

Tier Storage Latency Strategy
T0 In-memory session cache <1ms Exact + fuzzy key match
T1 JSON state files <5ms Filename + key match
T2 SQLite (pipeline_synergy.db) <5ms Structured SQL LIKE
T3 SQLite FTS5 <1ms Full-text keyword on 22K rows
T4 Grep over 46K chunks ~3s Brute-force ripgrep fallback
T5 LanceDB keyword (BM25) slow 670K vector rows, no embeddings
T5b LanceDB semantic 3–30s Embedding similarity, last resort

Question routing: Queries starting with how/what/why/etc. skip keyword tiers and route directly to semantic search.

Self-learning: The cascade records which tier resolves each query. After enough history, evolve() learns shortcuts (skip directly to the winning tier) and marks consistently-empty tiers for skip.

Usage

Python API

from tools.memory.memory_cascade import recall, store

# Search across all tiers
result = recall("adaptive alpha reranking")
# → {"answer": {...}, "tier": "T5:lancedb", "latency_ms": 340, "confidence": 0.87}

# Write to the right storage automatically
store("pattern", content="...", tags=["ml", "reranking"])

CLI

# Search
python memory_cascade.py recall "query here"

# View tier resolution stats
python memory_cascade.py stats

# Run self-evolution (learn shortcuts + tier skips)
python memory_cascade.py evolve

recall() Options

recall(
    query: str,
    min_confidence: float = 0.5,  # stop escalating at this threshold
    max_tier: int = 6             # 0=T0 only, 6=all tiers
)

Returns:

{
  "answer": { "source": "...", "text": "..." },
  "confidence": 0.87,
  "latency_ms": 340.2,
  "tiers_tried": 3,
  "resolved_at": "T5:lancedb",
  "shortcut": null,
  "question_routed": false,
  "escalation": [...]
}

Requirements

  • Python 3.10+ (stdlib only for core cascade logic)
  • Optional: hive_commons for LanceDB tiers (T5/T5b)
  • Optional: tools.memory.memory_router for store() routing

The cascade degrades gracefully — if LanceDB is unavailable, it stops at grep (T4). All stdlib tiers (T0–T4) work with zero dependencies.

Architecture Notes

  • Thread-safe: Session cache uses threading.Lock; stats writes use separate locks
  • Cross-process safe: JSONL writes use OS-level file locking (msvcrt on Windows, fcntl on Unix)
  • Confidence scoring: Term overlap × score × content richness → normalized 0–1
  • Stats persistence: knowledge/SYSTEM/cascade_stats.json accumulates hit rates per tier

Built with MidOS. 1 of 200+ skills. Full ecosystem at midos.dev/pro

Reviews (0)

Sign in to write a review.

No reviews yet. Be the first to review!

Comments (0)

Sign in to join the discussion.

No comments yet. Be the first to share your thoughts!

Compatible Platforms

Pricing

Free

Related Configs