defuddle-web-cleaner
Extract and clean readable article content, metadata, and markdown from URLs or HTML for research, note taking, and web scraping.
v1.0.0
Description
name: defuddle-web-cleaner
description: Extract clean article content from web pages using Defuddle. Use when a user provides a URL or HTML and wants the readable article text, a Markdown version, or structured metadata. Helpful for web scraping, research workflows, note taking, Obsidian clipping, and converting web pages to Markdown.
Defuddle Web Cleaner
Extract the main readable content from a web page.
This skill removes unnecessary elements such as:
- navigation bars
- sidebars
- ads
- comments
- footers
- social buttons
The result is clean article content.
Supported Inputs
- URL
- Raw HTML
- Plain text copied from a web page
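The three input kinds can be told apart with a small heuristic. The following is a minimal sketch, not part of the skill itself: the function name `detect_input_type` and the labels `"url"`, `"html"`, and `"text"` are assumptions chosen for illustration.

```python
import re
from urllib.parse import urlparse

# Matches an opening or closing tag such as <p>, </div>, or <!doctype html>.
# A "<" followed by whitespace (as in "1 < 2") is deliberately not treated as a tag.
_TAG_RE = re.compile(r"<(!doctype|/?[a-zA-Z][a-zA-Z0-9-]*)\b[^>]*>", re.IGNORECASE)


def detect_input_type(value: str) -> str:
    """Classify the input as 'url', 'html', or 'text' (hypothetical labels)."""
    stripped = value.strip()
    parsed = urlparse(stripped)
    # Treat the input as a URL only if it is a single whitespace-free token
    # with an http(s) scheme and a host.
    if (
        parsed.scheme in ("http", "https")
        and parsed.netloc
        and not re.search(r"\s", stripped)
    ):
        return "url"
    # Any recognizable tag means the input is handled as raw HTML.
    if _TAG_RE.search(stripped):
        return "html"
    # Everything else is handled as plain page text.
    return "text"
```

The URL check runs first so that a bare link is fetched rather than parsed as text; anything containing markup falls through to the HTML branch.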
Output Format
Default output:
Title
Author
Site
Published date
Markdown article content
Alternative output (JSON):
{ title, author, site, description, published, content, contentMarkdown }
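As an illustration of how the two formats relate, the sketch below renders the default output from a result shaped like the JSON object above. The helper name `format_default` is hypothetical, and skipping empty fields is an assumption rather than documented behavior.

```python
def format_default(result: dict) -> str:
    """Render the default text output from a JSON-style result dict."""
    # Label/key pairs follow the field names listed in the JSON output.
    labels = [
        ("Title", "title"),
        ("Author", "author"),
        ("Site", "site"),
        ("Published", "published"),
    ]
    # Assumption: fields that are missing or empty are left out entirely.
    lines = [f"{label}: {result[key]}" for label, key in labels if result.get(key)]
    lines.append("Markdown:")
    lines.append(result.get("contentMarkdown", ""))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_default({
        "title": "AI is Changing Everything",
        "author": "Jane Smith",
        "site": "Example Blog",
        "contentMarkdown": "Artificial intelligence is transforming industries...",
    }))
```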
Processing Steps
- Detect input type
- Load page HTML
- Run Defuddle parser
- Extract metadata
- Convert to Markdown if requested
- Return clean content
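The steps above can be sketched as a small pipeline. This is an outline under stated assumptions, not the skill's actual implementation: `clean_page`, `load_html`, `parse`, and `to_markdown` are hypothetical names, and the Defuddle parser is represented by whatever callable is passed in as `parse` rather than by a real Defuddle API call.

```python
from typing import Callable, Optional


def clean_page(
    source: str,
    load_html: Callable[[str], str],
    parse: Callable[[str], dict],
    to_markdown: Optional[Callable[[str], str]] = None,
) -> dict:
    """Run the processing steps with injected (hypothetical) helpers."""
    # Step 1: detect input type (simplified here to URL vs. inline content).
    stripped = source.strip()
    is_url = stripped.lower().startswith(("http://", "https://"))

    # Step 2: load the page HTML only when the input is a URL.
    html = load_html(stripped) if is_url else source

    # Steps 3 and 4: run the parser, which is expected to return the cleaned
    # content together with metadata such as title, author, and site.
    result = dict(parse(html))

    # Step 5: convert to Markdown only if a converter was requested.
    if to_markdown is not None:
        result["contentMarkdown"] = to_markdown(result.get("content", ""))

    # Step 6: return the clean content and metadata.
    return result
```

Passing the loader, parser, and converter as arguments keeps the sketch independent of any particular HTTP client or parser binding, and makes each step easy to stub in tests.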
Example
Input:
Output:
Title: AI is Changing Everything
Author: Jane Smith
Site: Example Blog
Markdown:
AI is Changing Everything
Artificial intelligence is transforming industries...
Tips
Use this skill when:
- saving articles to Obsidian
- building research datasets
- cleaning webpages for LLM processing
- summarizing articles