HTML to Markdown API

Convert any URL to LLM-ready Markdown with one call.

Strip ads, nav chrome, and HTML noise to get clean GitHub Flavored Markdown ready for LLMs and RAG. Handles JavaScript rendering and bot-protection bypass automatically.

Tunable via query params: includeLinks, includeImages, shortenBase64Images.

No credit card required
Daydream logo
Kovai logo
Passionfroot logo
Orange logo
SendX logo
Klarna logo
Super.com logo

What You Get

Each request converts a live webpage into structured, LLM-ready Markdown.

GitHub Flavored Markdown

Tables, headings, lists, and code blocks fully converted

Configurable links & images

Control whether hyperlinks and image references are preserved

Base64 image shortening

Prevent token bloat from inline image data in AI pipelines

Automatic proxy escalation

Scrapes blocked and protected sites transparently

How It Works

We fetch the page, handle proxy escalation, and convert HTML to Markdown for you.

— step 01

Send a URL with preferences

Specify includeLinks, includeImages, and shortenBase64Images as query params

— step 02

Page is fetched

Proxy escalation handles any blocks automatically, no configuration needed

— step 03

HTML converted to GFM

The full HTML document is parsed and converted to clean Markdown

— step 04

Markdown returned

Ready to pass directly to any LLM, vector store, or content pipeline
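
The four steps above can be sketched from the client side in Python. The endpoint path and the three query parameter names come from this page. The host `api.context.dev`, the `Authorization: Bearer` header, and the helper names are assumptions for illustration only, so check the documentation for the real values.

```python
import json
import urllib.parse
import urllib.request

# Assumed host; only the path below appears on this page.
BASE_URL = "https://api.context.dev"
ENDPOINT = "/v1/web/scrape/markdown"


def build_markdown_url(target, include_links=True, include_images=False,
                       shorten_base64_images=True):
    """Build the GET URL for the Markdown endpoint (step 01).

    The defaults here are this helper's own choices, not necessarily the API's.
    """
    params = {
        "url": target,
        "includeLinks": str(include_links).lower(),
        "includeImages": str(include_images).lower(),
        "shortenBase64Images": str(shorten_base64_images).lower(),
    }
    return BASE_URL + ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_markdown(target, api_key, **options):
    """Call the endpoint and return the decoded JSON body.

    Steps 02 to 04 (fetching, proxy escalation, conversion) happen server-side.
    """
    request = urllib.request.Request(
        build_markdown_url(target, **options),
        # Assumed auth scheme; the page does not say how the key is sent.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    print(build_markdown_url("https://context.dev"))
```

Keeping URL construction separate from the network call makes the query string easy to inspect or unit-test without spending credits.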

API Response

Extracted Markdown for context.dev

GET /v1/web/scrape/markdown?url=https://context.dev&includeLinks=true
{
  "success": true,
  "url": "https://context.dev",
  "markdown": "# Context.dev — The Internet's Brand API\n\nAPI to personalize your product with logos, colors,\nand company info from any domain.\n\n## Features\n\n- **Company Logos** — Fetch high-res logos from any domain\n- **Brand Colors** — Extract full color palettes\n- **Company Data** — Address, socials, description and more\n..."
}
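
A client will typically check `success` before using `markdown`. The following is a minimal Python sketch that assumes the response carries the three fields shown above. The page does not show what a failed response looks like, so the error handling is a guess, and `extract_markdown` is a hypothetical helper name.

```python
import json


def extract_markdown(body):
    """Return the Markdown string from a JSON response body, or raise on failure."""
    payload = json.loads(body)
    # Assumption: failures set "success" to false; the error shape is not documented here.
    if not payload.get("success"):
        raise RuntimeError(f"conversion failed for {payload.get('url', 'unknown URL')}")
    return payload["markdown"]


sample = '{"success": true, "url": "https://context.dev", "markdown": "# Title\\n\\nBody text."}'
print(extract_markdown(sample))
```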

Frequently asked questions

Common questions about the Context.dev HTML to Markdown API.

Am I billed for failed requests?
No. Failed requests are not billed, and neither are the rare requests where the target site blocks us. Credits are consumed only on successful responses.
How do I convert HTML to Markdown from a URL?
Send a GET request to /v1/web/scrape/markdown?url=<target>. The API fetches the page (with proxy bypass if needed), parses the HTML, and returns clean GitHub Flavored Markdown. Tables, headings, code blocks, and lists all convert correctly. One call, no setup.
Why convert HTML to Markdown for LLMs and RAG?
LLMs process Markdown ~5x more efficiently than HTML. HTML wastes tokens on tags, classes, inline styles, and tracking pixels — content that doesn't help the model and inflates context cost. Markdown preserves the meaningful structure (headings, lists, emphasis) that the model actually needs.
What Markdown format does the API return?
GitHub Flavored Markdown (GFM). It's the de facto standard for LLM-readable text, supports tables and code blocks, and renders correctly in Claude, ChatGPT, Gemini, Cursor, and most documentation systems out of the box.
Can I include or exclude links and images?
Yes. includeLinks (default: true) keeps hyperlinks as Markdown links so the LLM sees source citations. includeImages (default: false) controls inline images. shortenBase64Images replaces long base64 data URIs with short references, which can save thousands of tokens per page.
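
To see why shortening helps, here is a local Python illustration of the idea. It is not the API's implementation: the regex, the placeholder format, and the `shorten_base64_images` helper are all hypothetical, and the real output of shortenBase64Images may look different.

```python
import re

# Matches Markdown image targets that are base64 data URIs,
# e.g. (data:image/png;base64,AAAA...)
DATA_URI = re.compile(r"\(data:(image/[\w.+-]+);base64,[A-Za-z0-9+/=]+\)")


def shorten_base64_images(markdown):
    """Replace each inline base64 image payload with a short placeholder."""
    return DATA_URI.sub(lambda m: f"(data:{m.group(1)};base64,...)", markdown)


before = "![logo](data:image/png;base64," + "A" * 4000 + ")"
after = shorten_base64_images(before)
print(len(before), "->", len(after))
```

A single inlined image can carry thousands of characters of payload that the model cannot use, so collapsing it keeps the image reference while dropping almost all of its length.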
How is this different from Jina Reader or Firecrawl?
Same idea — URL in, Markdown out — but Context.dev bundles it with the rest of the brand-data stack (logos, colors, NAICS/SIC, transaction enrichment) on a single key. Pricing is per-call with a real free tier; bot-protection bypass is automatic, not a paid add-on.
Does it work on JavaScript-heavy sites?
Yes. The fetcher renders JavaScript when the target is a single-page app, then converts the post-render DOM to Markdown. You don't pick a mode — the API picks the right strategy based on the URL.
Is it free for AI agent use?
Yes. The free tier covers thousands of monthly Markdown conversions, which is plenty to wire up an MCP server, an agent tool, or a small RAG ingestion pipeline. Production volume scales linearly with no annual contract.
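
For the small RAG ingestion pipeline mentioned above, a common first step is splitting the returned Markdown into heading-delimited chunks before embedding. A minimal Python sketch follows. `split_by_headings` is a hypothetical helper, not part of the API, and it does not special-case `#` lines inside fenced code blocks.

```python
import re

# ATX headings: one to six '#' characters followed by whitespace and the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headings(markdown):
    """Split Markdown into (heading, body) chunks, one per heading.

    Text before the first heading is returned with a heading of None.
    """
    chunks = []
    heading, lines = None, []

    def flush():
        # Emit the current chunk unless it is an empty preamble.
        if heading is not None or any(line.strip() for line in lines):
            chunks.append((heading, "\n".join(lines).strip()))

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


doc = "# Context.dev\n\nBrand API.\n\n## Features\n\n- Company Logos\n- Brand Colors"
for title, body in split_by_headings(doc):
    print(title, "|", body)
```

Each chunk keeps its heading alongside its body, so the heading can be stored as metadata or prepended to the text before it goes to a vector store.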

Ship an agent that actually knows things.

Free tier, 10-minute integration, and the same API powering agents at Mintlify, daily.dev, and Propane. No credit card to start.