XPR Web Scraping

Tools for fetching and extracting cleaned text, metadata, and links from single or multiple web pages with format options and link filtering.

v0.2.11

Description


name: web-scraping
description: Web scraping tools for fetching and extracting data from web pages

Web Scraping

You have web scraping tools for fetching and extracting data from web pages:

Single page:

  • scrape_url — fetch a URL and get cleaned text content + metadata (title, description, link count)
    • Use format="text" (default) for most tasks — strips all HTML
    • Use format="markdown" to preserve headings, links, lists, bold/italic
    • Use format="html" only when you need raw HTML
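The format guidance above can be sketched as a small helper. This is only an illustration: the tool name scrape_url and the three format values come from the description, while the argument shape (a `url` field next to `format`) and the helper names `buildScrapeUrlArgs` and `chooseFormat` are assumptions made for the example.

```typescript
// The three format values named in the skill description.
type ScrapeFormat = "text" | "markdown" | "html";

// Hypothetical argument shape; the real tool schema may differ.
interface ScrapeUrlArgs {
  url: string;
  format: ScrapeFormat;
}

// Build arguments, falling back to "text", the documented default.
function buildScrapeUrlArgs(url: string, format: ScrapeFormat = "text"): ScrapeUrlArgs {
  return { url, format };
}

// Pick a format from what the task needs: raw HTML only when explicitly
// required, markdown when structure matters, plain text otherwise.
function chooseFormat(needs: { rawHtml?: boolean; structure?: boolean }): ScrapeFormat {
  if (needs.rawHtml) return "html";
  if (needs.structure) return "markdown";
  return "text";
}
```

For example, `buildScrapeUrlArgs("https://example.com", chooseFormat({ structure: true }))` would request markdown output for a page whose headings and lists should survive extraction.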

Link discovery:

  • extract_links — fetch a page and extract all links with text and type (internal/external)
    • Use the pattern parameter to filter by regex (e.g. "\\.pdf$" for PDF links)
    • Links are deduplicated and resolved to absolute URLs
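To make the link handling concrete, here is a minimal sketch of how resolving, deduplicating, and regex-filtering links could work. It is not the tool's actual implementation: the function `normalizeLinks`, the `LinkInfo` shape, and the choice to classify links as internal by comparing hosts are all assumptions. Note that the pattern `"\\.pdf$"` is written with a doubled backslash because it sits inside a string literal; the regex itself is `\.pdf$`.

```typescript
interface LinkInfo {
  url: string;
  text: string;
  type: "internal" | "external";
}

// Hypothetical helper: resolve hrefs against the page URL, drop duplicates,
// and keep only links whose absolute URL matches the optional pattern.
function normalizeLinks(
  pageUrl: string,
  rawLinks: { href: string; text: string }[],
  pattern?: string,
): LinkInfo[] {
  const base = new URL(pageUrl);
  const filter = pattern ? new RegExp(pattern) : undefined;
  const seen = new Set<string>();
  const result: LinkInfo[] = [];

  for (const { href, text } of rawLinks) {
    let resolved: URL;
    try {
      resolved = new URL(href, base); // relative hrefs become absolute
    } catch {
      continue; // skip hrefs that cannot be parsed
    }
    const url = resolved.href;
    if (seen.has(url)) continue; // deduplicate on the absolute URL
    if (filter && !filter.test(url)) continue;
    seen.add(url);
    // Assumption: "internal" means the same host as the scraped page.
    result.push({ url, text, type: resolved.host === base.host ? "internal" : "external" });
  }
  return result;
}
```

Because the sketch tests the pattern against the full absolute URL, a link such as `report.pdf?download=1` would not match `\.pdf$`; a looser pattern would be needed for URLs with query strings.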

Multi-page research:

  • scrape_multiple — fetch up to 10 URLs in parallel for comparison/research
    • One failure doesn't block others (uses Promise.allSettled)
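The failure isolation described above follows from how Promise.allSettled behaves: it waits for every promise and reports each outcome separately instead of rejecting on the first error. The sketch below illustrates the pattern with a caller-supplied fetcher; `scrapeAll`, `Fetcher`, and `PageResult` are hypothetical names, and rejecting a list of more than 10 URLs with an error is an assumption about how the limit might be enforced.

```typescript
// Caller-supplied function that returns page content or rejects on failure.
type Fetcher = (url: string) => Promise<string>;

interface PageResult {
  url: string;
  ok: boolean;
  content?: string;
  error?: string;
}

const MAX_URLS = 10; // limit stated in the skill description

async function scrapeAll(urls: string[], fetchPage: Fetcher): Promise<PageResult[]> {
  if (urls.length > MAX_URLS) {
    // Assumption: oversize batches are rejected rather than truncated.
    throw new Error(`At most ${MAX_URLS} URLs per call, got ${urls.length}`);
  }
  // All fetches start immediately; allSettled never rejects.
  const settled = await Promise.allSettled(urls.map((u) => fetchPage(u)));
  return settled.map((outcome, i) =>
    outcome.status === "fulfilled"
      ? { url: urls[i], ok: true, content: outcome.value }
      : { url: urls[i], ok: false, error: String(outcome.reason) },
  );
}
```

Results come back in the same order as the input URLs, so a caller can compare pages side by side and simply skip entries where `ok` is false.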

Best practices:

  • Prefer "text" format for content extraction, "markdown" for preserving structure
  • Don't scrape the same domain more than 5 times per minute
  • Combine with store_deliverable to save scraped content as job evidence
  • Content from very large pages is limited to 5 MB
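One way to respect the 5-requests-per-minute guideline is a sliding-window counter keyed by hostname. The sketch below is an assumption about how a caller might pace its own requests, not something the skill provides: `DomainRateLimiter` and `tryAcquire` are hypothetical names, and treating each hostname (including subdomains) as a separate domain is a simplification.

```typescript
// Hypothetical client-side limiter: at most `maxPerWindow` requests per
// hostname within any rolling window of `windowMs` milliseconds.
class DomainRateLimiter {
  private hits = new Map<string, number[]>();
  private maxPerWindow: number;
  private windowMs: number;

  constructor(maxPerWindow = 5, windowMs = 60_000) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Returns true and records the request if the host is under its budget;
  // returns false without recording anything otherwise.
  tryAcquire(url: string, now: number = Date.now()): boolean {
    const host = new URL(url).hostname;
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(host) ?? []).filter((t) => now - t < this.windowMs);
    if (recent.length >= this.maxPerWindow) {
      this.hits.set(host, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(host, recent);
    return true;
  }
}
```

A caller would check `limiter.tryAcquire(url)` before each scrape and wait or skip when it returns false. The `now` parameter exists mainly so the behavior can be tested without real delays.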

Pricing

Free
