File to Markdown Converter
Convert documents, spreadsheets, images, and structured files into clean, structured Markdown optimized for AI processing. No authentication required.
Description
File to Markdown — Skill
Overview
Convert files into clean, structured, AI-ready Markdown using the markdown.new API powered by Cloudflare Workers AI toMarkdown().
Supports 20+ formats including documents, spreadsheets, images, and structured data.
No authentication required (500 requests/day per IP).
When to Use This Skill
Use this skill whenever you need to:
- Extract text from files for LLM processing
- Convert PDFs or Office files into Markdown
- Normalize data into structured text
- Process uploaded user files
- Scrape webpage content into Markdown
- Convert images into AI-generated descriptions + content
Common AI workflows:
- RAG ingestion pipelines
- Knowledge base creation
- Document summarization
- Dataset extraction
- Spreadsheet analysis
- OCR-like extraction from images
Supported Formats
Documents
.pdf, .docx, .odt
Spreadsheets
.xlsx, .xls, .xlsm, .xlsb, .et, .ods, .numbers
Images
.jpg, .jpeg, .png, .webp, .svg
Text & Structured Data
.txt, .md, .csv, .json, .xml, .html, .htm
Notes:
- Image conversion uses AI object detection + summarization.
- HTML URL conversion uses a web page pipeline.
- Uploaded HTML uses Workers AI conversion.
API Base URL
https://markdown.new
Endpoints
1️⃣ Convert Remote File (Simple GET)
Returns plain Markdown text.
GET /:file-url
Example:
curl -s "https://markdown.new/https://example.com/report.pdf"
2️⃣ Convert Remote File (JSON Response)
Returns metadata + Markdown.
GET /:file-url?format=json
Example:
curl -s "https://markdown.new/https://example.com/report.pdf?format=json"
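The two GET forms above differ only in the trailing `format=json` flag, so an agent can build both from one helper. The following is a minimal sketch; the function name `build_convert_url` and its `as_json` parameter are illustrative and not part of the API. It mirrors the curl examples literally and does not address source URLs that already carry their own query string, which this page does not document.

```python
BASE_URL = "https://markdown.new"


def build_convert_url(file_url: str, as_json: bool = False) -> str:
    """Return the GET URL that asks markdown.new to convert file_url.

    Hypothetical helper: it only concatenates strings, as the curl
    examples do, and performs no network request.
    """
    url = f"{BASE_URL}/{file_url}"
    if as_json:
        # Same shape as the documented example: flag appended to the path.
        url += "?format=json"
    return url


if __name__ == "__main__":
    print(build_convert_url("https://example.com/report.pdf"))
    print(build_convert_url("https://example.com/report.pdf", as_json=True))
```

The resulting string can be passed to any HTTP client; the plain form returns Markdown text and the `as_json` form returns metadata plus Markdown.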
3️⃣ Convert Remote File via POST
Use when you want a structured JSON response.
POST /
Content-Type: application/json
Body:
{
  "url": "https://example.com/report.pdf"
}
Example:
curl -s https://markdown.new/ \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/report.pdf"}'
4️⃣ Upload Local File
Use when the file is not publicly accessible.
POST /convert
Content-Type: multipart/form-data
Example:
curl -s https://markdown.new/convert \
  -F "file=@document.pdf"
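For agents that cannot shell out to curl, the same upload can be prepared with the Python standard library alone. This is a sketch under stated assumptions: the helper name `build_upload_request` is hypothetical, the form field name `file` is taken from the curl example, and the request is only constructed here, not sent.

```python
import mimetypes
import urllib.request
import uuid

UPLOAD_URL = "https://markdown.new/convert"


def build_upload_request(filename: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST for /convert."""
    boundary = uuid.uuid4().hex
    # Guess the MIME type from the extension; fall back to a generic type.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        UPLOAD_URL,
        data=head + data + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_upload_request("document.pdf", b"%PDF-1.4 demo bytes")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the upload. A third-party client such as `requests` can do the same with less code if it is available.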
Response Formats
URL Conversion Response
{
  "success": true,
  "url": "https://example.com/report.pdf",
  "title": "Quarterly Report",
  "content": "# Quarterly Report\n\n...",
  "method": "Workers AI (file)",
  "duration_ms": 1200,
  "tokens": 850
}
Upload Conversion Response
{
  "success": true,
  "data": {
    "title": "Q4 Report",
    "content": "# Q4 Report\n\n...",
    "filename": "report.xlsx",
    "file_type": ".xlsx",
    "tokens": 1250,
    "processing_time_ms": 320
  }
}
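Because the two response shapes place `content` at different depths (top level for URL conversion, under `data` for uploads), a small normalizing helper keeps downstream code uniform. The sketch below assumes only the fields shown in the samples above; the name `extract_markdown` is illustrative, and treating a missing or false `success` as a failure is an assumption, since no error payload is documented here.

```python
from typing import Any, Optional


def extract_markdown(payload: dict[str, Any]) -> Optional[str]:
    """Return the Markdown string from either response shape, or None."""
    if payload.get("success") is not True:
        # Assumption: anything other than "success": true is a failure.
        return None
    # Upload responses nest their fields under "data"; URL responses are flat.
    body = payload.get("data", payload)
    content = body.get("content") if isinstance(body, dict) else None
    if isinstance(content, str) and content:
        return content
    return None


if __name__ == "__main__":
    url_style = {"success": True, "content": "# Quarterly Report\n\n..."}
    upload_style = {"success": True, "data": {"content": "# Q4 Report\n\n..."}}
    print(extract_markdown(url_style))
    print(extract_markdown(upload_style))
```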
Best Practices for AI Agents
Prefer GET for Simple Workflows
Use:
GET /:file-url
When:
- You only need Markdown text
- Speed is important
- No metadata required
Prefer POST for Structured Pipelines
Use POST when:
- Metadata is needed
- Token counts are required
- Monitoring or logging is implemented
- Building automation workflows
File Upload Strategy
Use /convert only if:
- File is local
- File is private
- File requires authentication to access
Otherwise always prefer URL conversion.
Error Handling Strategy
Agents should:
- Check that the response contains "success": true
- Retry once on network failure
- Validate that the content length is greater than 0
- Fall back to an alternate extraction method if needed
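The four checks above can be combined into one wrapper. This is one possible sketch, not a prescribed implementation: `convert_with_checks`, `fetch`, and `fallback` are hypothetical names, the HTTP call is injected as a callable so the logic stays independent of any client library, and network failures are assumed to surface as `OSError` subclasses (as `ConnectionError` and `urllib.error.URLError` do).

```python
from typing import Callable, Optional


def convert_with_checks(
    fetch: Callable[[], dict],
    fallback: Optional[Callable[[], str]] = None,
) -> str:
    """Run fetch() with one retry, validate the payload, else use fallback."""
    last_error = None
    for _attempt in range(2):  # first try plus a single retry
        try:
            payload = fetch()
        except OSError as exc:  # treated as a network failure
            last_error = exc
            continue
        # Accept both the flat and the nested ("data") response shapes.
        body = payload.get("data", payload)
        ok = payload.get("success") is True
        content = body.get("content") if ok and isinstance(body, dict) else None
        if isinstance(content, str) and len(content) > 0:
            return content
        break  # the API answered but gave nothing usable; do not retry
    if fallback is not None:
        return fallback()
    raise RuntimeError("conversion failed and no fallback was provided") from last_error


if __name__ == "__main__":
    print(convert_with_checks(lambda: {"success": True, "content": "# Demo"}))
```

Retrying only on network errors, and not on an empty or unsuccessful response, keeps the wrapper within the "retry once" guidance and avoids spending the daily request quota on repeats that are unlikely to help.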
Rate Limits
- 500 requests/day per IP without an API key
- No signup required
Agents should:
- Cache results when possible
- Avoid duplicate conversions
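One simple way to follow both recommendations is to memoize conversions by source URL for the lifetime of the process. The sketch below is illustrative: `make_cached_converter` is a hypothetical helper, and `convert` stands for whatever function actually calls the API. It assumes a URL's content does not change during the session.

```python
from typing import Callable


def make_cached_converter(convert: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap convert() so each URL is converted at most once per process."""
    cache: dict[str, str] = {}

    def cached(url: str) -> str:
        if url not in cache:
            # Only the first request for a URL reaches the API.
            cache[url] = convert(url)
        return cache[url]

    return cached


if __name__ == "__main__":
    seen = []

    def fake_convert(url: str) -> str:
        seen.append(url)
        return f"# Markdown for {url}"

    convert = make_cached_converter(fake_convert)
    convert("https://example.com/report.pdf")
    convert("https://example.com/report.pdf")
    print(len(seen))
```

A persistent store keyed the same way would extend the saving across runs, at the cost of deciding when cached entries go stale.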
Integration Examples
JavaScript (Node.js)
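// Ask markdown.new to convert a remote file and return JSON.
const res = await fetch("https://markdown.new/", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    url: "https://example.com/file.pdf"
  })
});
if (!res.ok) throw new Error(`HTTP ${res.status}`);
const data = await res.json();
// The converted Markdown is in the "content" field.
console.log(data.content);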
Python
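import requests

# Ask markdown.new to convert a remote file and return JSON.
res = requests.post(
    "https://markdown.new/",
    json={"url": "https://example.com/file.pdf"},
    timeout=30,
)
res.raise_for_status()  # surface HTTP errors early
data = res.json()
# The converted Markdown is in the "content" field.
print(data["content"])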
Agent Decision Tree
If the user provides:
| Input Type | Action |
|---|---|
| Public file URL | Use GET or POST |
| Local file | Use POST /convert |
| Image | Convert then summarize |
| Spreadsheet | Convert then analyze |
| Webpage | Convert URL HTML |
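The routing half of the table (public URL versus local file) can be expressed as a small function. This is a sketch only: `choose_endpoint` is a hypothetical name, it treats any `http` or `https` string as a public URL without checking reachability, and it leaves the follow-up steps for images and spreadsheets (summarize, analyze) to the caller.

```python
from urllib.parse import urlparse


def choose_endpoint(source: str) -> tuple[str, str]:
    """Return (HTTP method, request URL) for a user-supplied source."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # Public file or webpage: use the URL conversion path.
        return ("GET", f"https://markdown.new/{source}")
    # Anything else is treated as a local file to upload.
    return ("POST", "https://markdown.new/convert")


if __name__ == "__main__":
    print(choose_endpoint("https://example.com/report.pdf"))
    print(choose_endpoint("document.pdf"))
```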
Output Expectations
The Markdown should be:
- Clean
- Structured
- AI-friendly
- Minimal noise
- Ready for LLM ingestion
Limitations
- Complex PDF layouts may lose formatting
- Large spreadsheets may be truncated
- Images rely on AI interpretation accuracy
- Token limits may apply
Summary
This skill provides a universal file-to-Markdown conversion layer for AI systems with:
- No authentication
- Simple HTTP interface
- Multi-format support
- Structured output
- Fast processing
Ideal for document ingestion, RAG pipelines, and automation agents.