Markdown Docs Full-Text Search

Description


name: md-docs-search
description: Full-text search across structured Markdown documentation archives using SQLite FTS5. Use when you need to search large collections of Markdown articles that are separated by "---" delimiters and contain source URLs (marked with the "*Source:" pattern). Provides fast BM25-ranked search with automatic source URL extraction for citations. Ideal for research, documentation lookups, and knowledge base exploration. Requires indexing the documentation first with docs.py index.

Markdown Documentation Full-Text Search

Fast, indexed full-text search across Markdown documentation archives using SQLite FTS5 with BM25 relevance ranking.

When to Use

  • Searching documentation archives for specific features, capabilities, or information
  • Finding official source URLs to cite in reports
  • Looking up technical specifications or configuration details
  • Research across multiple documentation sources

Document Format Expected

Articles are separated by a --- delimiter line, and each article carries a *Source: URL* line below its title:

# Article Title

*Source: https://docs.example.com/path/to/article.html*

Article content here...

---

# Next Article Title

*Source: https://docs.example.com/another/article.html*

More content...
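
As a rough illustration of how an archive in this format could be split into articles, here is a minimal Python sketch. It is not the parser used by docs.py; the function name `split_articles` and the dictionary keys are assumptions made for this example.

```python
import re

# Matches a "*Source: URL*" line; the closing asterisk is optional.
SOURCE_RE = re.compile(r"^\*Source:\s*(\S+?)\*?\s*$", re.MULTILINE)
# Matches the first "# Title" heading in an article.
TITLE_RE = re.compile(r"^#\s+(.+?)\s*$", re.MULTILINE)


def split_articles(text):
    """Split an archive into dicts with title, source URL, and content."""
    articles = []
    # A delimiter is a line that contains only '---'.
    for chunk in re.split(r"^---\s*$", text, flags=re.MULTILINE):
        chunk = chunk.strip()
        if not chunk:
            continue
        title = TITLE_RE.search(chunk)
        source = SOURCE_RE.search(chunk)
        articles.append({
            "title": title.group(1) if title else "",
            "source": source.group(1) if source else None,
            "content": chunk,
        })
    return articles


SAMPLE = """# Article Title

*Source: https://docs.example.com/path/to/article.html*

Article content here...

---

# Next Article Title

*Source: https://docs.example.com/another/article.html*

More content...
"""

for article in split_articles(SAMPLE):
    print(article["title"], "->", article["source"])
```

Articles without a source line get `source` set to `None` here, so a caller can decide whether to skip or flag them.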

Quick Start

# 1. Index the documentation (one-time or when docs change)
scripts/docs.py index ./docs

# 2. Search
scripts/docs.py search "kubernetes backup" --max 5

# 3. Check index status
scripts/docs.py status

Primary Tool: docs.py

The unified CLI handles all operations:

Indexing

# Index documentation directory
scripts/docs.py index ./docs

# Force full rebuild
scripts/docs.py index ./docs --rebuild

# Custom database location
scripts/docs.py index ./docs --db /path/to/custom.db
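
To make the indexing step more concrete, the following sketch shows one way an FTS5 index could be built with Python's standard sqlite3 module. The table name `articles`, its columns, and the `build_index` helper are assumptions for illustration; the actual schema used by docs.py is not documented here. It also assumes the local SQLite build includes FTS5.

```python
import sqlite3


def build_index(db_path, articles, rebuild=False):
    """Create (or recreate) a hypothetical FTS5 table and load articles."""
    conn = sqlite3.connect(db_path)
    if rebuild:
        # Mirrors the idea of a full rebuild: drop and start over.
        conn.execute("DROP TABLE IF EXISTS articles")
    # 'source' is stored but UNINDEXED, so URLs are not searched.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS articles "
        "USING fts5(title, content, source UNINDEXED)"
    )
    conn.executemany(
        "INSERT INTO articles (title, content, source) VALUES (?, ?, ?)",
        [(a["title"], a["content"], a["source"]) for a in articles],
    )
    conn.commit()
    return conn


demo = build_index(
    ":memory:",
    [
        {"title": "Article Title", "content": "Article content here...",
         "source": "https://docs.example.com/path/to/article.html"},
        {"title": "Next Article Title", "content": "More content...",
         "source": "https://docs.example.com/another/article.html"},
    ],
)
print(demo.execute("SELECT count(*) FROM articles").fetchone()[0])  # 2
```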

Searching

# Basic search
scripts/docs.py search "kubernetes backup"

# Boolean operators
scripts/docs.py search "AWS AND S3 AND snapshot"

# Phrase search
scripts/docs.py search '"exact phrase match"'

# Prefix search
scripts/docs.py search "kube*"

# Exclude terms
scripts/docs.py search "backup NOT restore"

# Title-only search
scripts/docs.py search "kubernetes" --title-only

# Output formats
scripts/docs.py search "kubernetes" --format json
scripts/docs.py search "kubernetes" --format markdown

# More context around matches
scripts/docs.py search "kubernetes" --context 400

# Include full content in JSON
scripts/docs.py search "kubernetes" --format json --full-content
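
For readers curious what a BM25-ranked search with match context can look like underneath, here is a small sketch using SQLite's built-in `bm25()` and `snippet()` auxiliary functions. The schema and the `search` helper are assumptions, not the docs.py implementation. Note that `snippet()` limits context in tokens (at most 64), whereas the --context option above may well count characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: column 0 = title, 1 = content, 2 = source.
conn.execute(
    "CREATE VIRTUAL TABLE articles "
    "USING fts5(title, content, source UNINDEXED)"
)
conn.executemany(
    "INSERT INTO articles (title, content, source) VALUES (?, ?, ?)",
    [
        ("Kubernetes Backup Guide",
         "Schedule a kubernetes backup with policies and retention rules.",
         "https://docs.example.com/k8s/backup.html"),
        ("Restore Guide",
         "Restore virtual machines from a snapshot.",
         "https://docs.example.com/vm/restore.html"),
    ],
)


def search(conn, query, limit=5, context_tokens=16):
    """Return (title, source, snippet, score) rows, best match first."""
    sql = (
        "SELECT title, source, "
        "snippet(articles, 1, '[', ']', '...', ?) AS snip, "
        "bm25(articles) AS score "
        "FROM articles WHERE articles MATCH ? "
        # bm25() returns lower values for better matches.
        "ORDER BY bm25(articles) LIMIT ?"
    )
    return conn.execute(sql, (context_tokens, query, limit)).fetchall()


for title, source, snip, score in search(conn, "kubernetes backup"):
    print(title, "|", source, "|", snip)
```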

FTS5 Query Syntax

Syntax             Meaning
term1 term2        Documents with both terms (standard FTS5 treats whitespace as an implicit AND), ranked by BM25
term1 AND term2    Documents with both terms
term1 OR term2     Documents with either term
"exact phrase"     Exact phrase match
prefix*            Words starting with prefix
term1 NOT term2    Documents with term1 but not term2
title:term         Search only titles
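
The operators above can be tried directly against a throwaway in-memory FTS5 table. This sketch assumes a two-column table named `articles` (a made-up schema for the example) and runs the queries through plain SQLite, where whitespace-separated terms behave as an implicit AND; a wrapper such as docs.py could rewrite queries before they reach SQLite, so its behavior may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE articles USING fts5(title, content)")
conn.executemany(
    "INSERT INTO articles (title, content) VALUES (?, ?)",
    [
        ("Kubernetes Backup", "Back up kubernetes workloads to S3"),
        ("AWS Snapshots", "Create an AWS snapshot and restore it"),
        ("Restore Guide", "Restore a backup after failure"),
    ],
)


def match_titles(query):
    """Return the sorted titles of all rows matching an FTS5 query."""
    rows = conn.execute(
        "SELECT title FROM articles WHERE articles MATCH ?", (query,)
    )
    return sorted(title for (title,) in rows)


QUERIES = [
    "kubernetes backup",        # implicit AND
    "kubernetes AND backup",    # explicit AND
    "kubernetes OR snapshot",   # either term
    '"restore a backup"',       # exact phrase
    "snap*",                    # prefix
    "backup NOT restore",       # exclusion
    "title:restore",            # column filter
]

for query in QUERIES:
    print(f"{query!r} -> {match_titles(query)}")
```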

Getting Specific Articles

# Get article by partial URL or title
scripts/docs.py get "system_requirements" --full

# Find all matching articles
scripts/docs.py get "backup" --all

Status

# Check index statistics
scripts/docs.py status

Workflow for Research Tasks

Discovery Phase

# Check what's indexed
scripts/docs.py status

# Explore topics with broad searches
scripts/docs.py search "<feature>" --max 20

Research Phase

# Narrow down with boolean operators
scripts/docs.py search "<feature> AND <platform>"

# Find specific information
scripts/docs.py search 'limitation OR restriction OR "not supported"'

Citation Phase

Every search result includes the Source: URL — use this in your reports:

According to documentation, [finding]...

Source: https://docs.example.com/path/to/article.html
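
If results are post-processed in a script, the citation pattern above can be generated rather than typed by hand. The sketch below is illustrative only: the helper names are made up, and the JSON shape (a list of objects with a "source" key) is an assumption about the --format json output, which this page does not specify.

```python
import json


def cite(finding, source):
    """Format one finding followed by its source line."""
    return f"According to documentation, {finding}\n\nSource: {source}"


def sources_from_json(payload):
    """Collect unique source URLs, keeping first-seen order.

    Assumes `payload` is a JSON list of objects with a "source" key.
    """
    seen = []
    for item in json.loads(payload):
        url = item.get("source")
        if url and url not in seen:
            seen.append(url)
    return seen


example = json.dumps([
    {"title": "A", "source": "https://docs.example.com/path/to/article.html"},
    {"title": "B", "source": "https://docs.example.com/another/article.html"},
    {"title": "A again", "source": "https://docs.example.com/path/to/article.html"},
])

for url in sources_from_json(example):
    print(cite("[finding]...", url))
```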

Multi-Source Setup

Each agent or project can have its own documentation and index:

~/docs/VendorA/
    ├── docs_part_01.md
    ├── docs.db      # Index lives with docs
    └── ...

~/docs/VendorB/
    ├── docs.md
    ├── docs.db
    └── ...

The docs.py script auto-detects the database location.
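
How the detection works is not described here. One plausible approach, shown purely as an assumption, is to look for a docs.db file in the starting directory and then in each parent directory. The `find_db` helper below is hypothetical and not part of docs.py.

```python
from pathlib import Path


def find_db(start, name="docs.db"):
    """Return the nearest `name` file at or above `start`, or None.

    `start` is expected to be a directory path.
    """
    start = Path(start).resolve()
    for folder in [start, *start.parents]:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    return None


# Example: search upward from the current working directory.
print(find_db(Path.cwd()))
```

With a layout like the one above, calling `find_db` from inside ~/docs/VendorA/ would pick up that vendor's index rather than VendorB's.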

Advanced Scripts

For specialized needs:

  • scripts/fts_search.py — Direct FTS5 search with more options
  • scripts/index_docs.py — Standalone indexing
  • scripts/list_sources.py — List all source URLs
  • scripts/get_article.py — Direct article retrieval
  • scripts/search_docs.py — Regex-based search (no index needed)

Research Patterns

For common search patterns (feature research, architecture, security, etc.), see references/search-patterns.md.

Example Session

# What's available?
scripts/docs.py status
# Output: Files indexed: 37, Articles indexed: 32065

# Find information
scripts/docs.py search "kubernetes backup" --max 5

# Narrow to specific platform
scripts/docs.py search "kubernetes AND AWS" --max 5

# Find limitations
scripts/docs.py search 'limitation OR "not supported"'

# Get full article for citation
scripts/docs.py get "system_requirements" --full

Best Practices

  1. Index once, search many times — FTS5 is fast because it's indexed
  2. Use boolean operators: AND, OR, NOT for precision
  3. Phrase search for exact terms: "exact match" with quotes
  4. Always cite sources — Include Source: URLs in reports
  5. Rebuild periodically — Re-index when documentation updates
  6. Use JSON for analysis — Pipe to jq or other tools for processing
