File to Markdown Converter

Convert documents, spreadsheets, images, and structured files into clean, structured Markdown optimized for AI processing. No authentication is required.

v1.0.0

File to Markdown — Skill

Overview

Convert files into clean, structured, AI-ready Markdown using the markdown.new API, which is powered by Cloudflare Workers AI toMarkdown().

Supports 20+ formats including documents, spreadsheets, images, and structured data.

No authentication required (500 requests/day per IP).


When to Use This Skill

Use this skill whenever you need to:

  • Extract text from files for LLM processing
  • Convert PDFs or Office files into Markdown
  • Normalize data into structured text
  • Process uploaded user files
  • Scrape webpage content into Markdown
  • Convert images into AI-generated descriptions + content

Common AI workflows:

  • RAG ingestion pipelines
  • Knowledge base creation
  • Document summarization
  • Dataset extraction
  • Spreadsheet analysis
  • OCR-like extraction from images

Supported Formats

Documents

  • .pdf
  • .docx
  • .odt

Spreadsheets

  • .xlsx
  • .xls
  • .xlsm
  • .xlsb
  • .et
  • .ods
  • .numbers

Images

  • .jpg
  • .jpeg
  • .png
  • .webp
  • .svg

Text & Structured Data

  • .txt
  • .md
  • .csv
  • .json
  • .xml
  • .html
  • .htm

Notes:

  • Image conversion uses AI object detection + summarization.
  • HTML URL conversion uses a web page pipeline.
  • Uploaded HTML uses Workers AI conversion.

API Base URL

https://markdown.new

Endpoints

1️⃣ Convert Remote File (Simple GET)

Returns plain Markdown text.

GET /:file-url

Example:

curl -s "https://markdown.new/https://example.com/report.pdf"

2️⃣ Convert Remote File (JSON Response)

Returns metadata + Markdown.

GET /:file-url?format=json

Example:

curl -s "https://markdown.new/https://example.com/report.pdf?format=json"
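As a rough Python sketch of the two GET forms above: the target URL is appended to the base URL, and ?format=json is added for the JSON variant. The helper names build_get_url and fetch_markdown, the 30-second timeout, and the UTF-8 decoding are assumptions made for illustration, not part of the API. The source does not say how a target URL that carries its own query string is handled, so this sketch does not attempt that case.

```python
import urllib.request

BASE_URL = "https://markdown.new"


def build_get_url(file_url: str, as_json: bool = False) -> str:
    """Return the GET URL for a remote file (hypothetical helper)."""
    url = f"{BASE_URL}/{file_url}"
    # The JSON variant only differs by the format=json query parameter.
    return f"{url}?format=json" if as_json else url


def fetch_markdown(file_url: str, timeout: float = 30.0) -> str:
    """Fetch plain Markdown text; assumes the body is UTF-8 encoded."""
    with urllib.request.urlopen(build_get_url(file_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling fetch_markdown("https://example.com/report.pdf") would issue the same request as the first curl example.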

3️⃣ Convert Remote File via POST

Use when you want a structured JSON response.

POST /
Content-Type: application/json

Body:

{
  "url": "https://example.com/report.pdf"
}

Example:

curl -s https://markdown.new/ \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/report.pdf"}'

4️⃣ Upload Local File

Use when the file is not publicly accessible.

POST /convert
Content-Type: multipart/form-data

Example:

curl -s https://markdown.new/convert \
  -F "file=@document.pdf"
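The same upload can be sketched in Python. The form field name "file" comes from the curl example above; everything else here (the helper names, the hand-rolled multipart encoder, the MIME type guess) is an assumption chosen so the sketch runs on the standard library alone. With the requests package, passing files={"file": fh} to requests.post would do the encoding for you.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Optional, Tuple

CONVERT_URL = "https://markdown.new/convert"


def encode_multipart(filename: str, data: bytes, field: str = "file",
                     boundary: Optional[str] = None) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(path: str) -> urllib.request.Request:
    """Build (but do not send) the POST /convert request for a local file."""
    p = Path(path)
    body, content_type = encode_multipart(p.name, p.read_bytes())
    return urllib.request.Request(
        CONVERT_URL, data=body,
        headers={"Content-Type": content_type}, method="POST",
    )
```

Passing the result of build_upload_request("document.pdf") to urllib.request.urlopen would send the upload.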

Response Formats

URL Conversion Response

{
  "success": true,
  "url": "https://example.com/report.pdf",
  "title": "Quarterly Report",
  "content": "# Quarterly Report\n\n...",
  "method": "Workers AI (file)",
  "duration_ms": 1200,
  "tokens": 850
}

Upload Conversion Response

{
  "success": true,
  "data": {
    "title": "Q4 Report",
    "content": "# Q4 Report\n\n...",
    "filename": "report.xlsx",
    "file_type": ".xlsx",
    "tokens": 1250,
    "processing_time_ms": 320
  }
}
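The two response shapes differ: URL conversion puts the fields at the top level with duration_ms, while upload conversion nests them under "data" with processing_time_ms. A minimal sketch of flattening both into one shape follows; the function name extract_markdown and the output key elapsed_ms are hypothetical, and only the fields shown in the samples above are assumed to exist.

```python
from typing import Any, Dict


def extract_markdown(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten either sample response shape into a common dict."""
    if payload.get("success") is not True:
        raise ValueError("conversion did not report success")
    # Upload responses nest their fields under "data"; URL responses do not.
    body = payload.get("data", payload)
    return {
        "title": body.get("title"),
        "content": body.get("content", ""),
        "tokens": body.get("tokens"),
        # The timing field is named differently in the two shapes.
        "elapsed_ms": body.get("duration_ms", body.get("processing_time_ms")),
    }
```

Downstream code can then read result["content"] without caring which endpoint produced it.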

Best Practices for AI Agents

Prefer GET for Simple Workflows

Use:

GET /:url

When:

  • You only need Markdown text
  • Speed is important
  • No metadata required

Prefer POST for Structured Pipelines

Use POST when:

  • Metadata is needed
  • Token counts are required
  • Monitoring or logging is implemented
  • Building automation workflows

File Upload Strategy

Use /convert only if:

  • File is local
  • File is private
  • File requires authentication to access

Otherwise always prefer URL conversion.
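The guidance in this section can be sketched as a small routing function. The name choose_endpoint and the need_metadata flag are hypothetical, and treating anything without an http or https scheme as a local file is a simplifying assumption. A URL that requires authentication would still need to be downloaded by the caller and sent to /convert, which this sketch cannot detect on its own.

```python
from urllib.parse import urlparse


def choose_endpoint(source: str, need_metadata: bool = False) -> str:
    """Pick an endpoint label for a source string (illustrative only)."""
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        # Public URL: GET for plain Markdown, POST when metadata is wanted.
        return "POST /" if need_metadata else "GET /:file-url"
    # Anything else is assumed to be a local path that must be uploaded.
    return "POST /convert"
```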


Error Handling Strategy

Agents should:

  1. Check that the response contains "success": true
  2. Retry once on network failure
  3. Validate that the content length is greater than 0
  4. Fall back to an alternate extraction method if needed
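The four steps above might look like this in Python. This is a sketch under stated assumptions: fetch is any caller-supplied function that performs the request and returns the parsed JSON, fallback is an optional caller-supplied alternate extractor, and network failures are assumed to surface as OSError (which covers urllib.error.URLError and the built-in ConnectionError and TimeoutError). None of these names come from the API itself.

```python
from typing import Any, Callable, Dict, Optional


def convert_with_checks(
    fetch: Callable[[], Dict[str, Any]],
    fallback: Optional[Callable[[], str]] = None,
) -> str:
    """Return Markdown content, applying the retry and validation steps."""
    payload: Optional[Dict[str, Any]] = None
    for _ in range(2):  # initial attempt plus a single retry
        try:
            payload = fetch()
            break
        except OSError:
            continue

    content = ""
    if payload is not None and payload.get("success") is True:
        # Upload responses nest fields under "data"; URL responses do not.
        body = payload.get("data", payload)
        content = body.get("content") or ""

    if len(content) > 0:
        return content
    if fallback is not None:
        return fallback()
    raise RuntimeError("conversion failed and no fallback extractor was provided")
```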

Rate Limits

  • 500 requests/day per IP without API key
  • No signup required

Agents should:

  • Cache results when possible
  • Avoid duplicate conversions
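One way to follow both recommendations is an in-memory cache keyed by source URL, with a local counter that stops before the 500-request allowance is exceeded. The class below is a sketch, not part of the API: CachedConverter and its convert callable are hypothetical, the counter never resets (a real agent would reset it daily), and the cache is lost when the process exits.

```python
from typing import Callable, Dict


class CachedConverter:
    """Wrap a conversion function with a cache and a request budget."""

    def __init__(self, convert: Callable[[str], str], daily_limit: int = 500):
        self._convert = convert
        self._cache: Dict[str, str] = {}
        self.daily_limit = daily_limit
        self.requests_made = 0

    def get(self, url: str) -> str:
        # Serve repeated URLs from the cache so they cost no request.
        if url in self._cache:
            return self._cache[url]
        if self.requests_made >= self.daily_limit:
            raise RuntimeError("local request budget exhausted")
        # Count the attempt before calling out, so failures are counted too.
        self.requests_made += 1
        markdown = self._convert(url)
        self._cache[url] = markdown
        return markdown
```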

Integration Examples

JavaScript (Node.js)

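// Requires Node.js 18+ (built-in fetch) and an ES module for top-level await.
const res = await fetch("https://markdown.new/", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    url: "https://example.com/file.pdf"
  })
});

// Surface HTTP-level failures before parsing the body.
if (!res.ok) throw new Error(`markdown.new returned HTTP ${res.status}`);

const data = await res.json();
console.log(data.content);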

Python

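import requests

# POST the file URL as JSON; the timeout value is an arbitrary choice.
res = requests.post(
    "https://markdown.new/",
    json={"url": "https://example.com/file.pdf"},
    timeout=30,
)
res.raise_for_status()  # raise on HTTP 4xx/5xx before parsing

data = res.json()
print(data["content"])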

Agent Decision Tree

If user provides:

Input Type        Action
Public file URL   Use GET or POST
Local file        Use POST /convert
Image             Convert, then summarize
Spreadsheet       Convert, then analyze
Webpage           Convert the page URL

Output Expectations

The Markdown should be:

  • Clean
  • Structured
  • AI-friendly
  • Minimal noise
  • Ready for LLM ingestion

Limitations

  • Complex PDF layouts may lose formatting
  • Large spreadsheets may be truncated
  • Images rely on AI interpretation accuracy
  • Token limits may apply

Summary

This skill provides a universal file-to-Markdown conversion layer for AI systems with:

  • No authentication
  • Simple HTTP interface
  • Multi-format support
  • Structured output
  • Fast processing

Ideal for document ingestion, RAG pipelines, and automation agents.


