How to convert HTML to Markdown

  1. Paste any HTML fragment — a blog post, a div block, or a full document body — into the input field on the left.
  2. Click "Convert" — the clean Markdown output appears instantly in the right panel.
  3. Switch to "Markdown → HTML" using the toggle above the fields if you need the reverse direction.
  4. Click "Copy" to copy the output to your clipboard, or "Download .md" to save the file locally.
  5. Paste the Markdown directly into your static site generator, documentation tool, or GitHub README.

How the HTML to Markdown converter works

The converter runs entirely in your browser using Turndown, a JavaScript library that traverses the parsed HTML DOM tree and maps each element to its Markdown equivalent. When you click Convert, the input string is parsed into a document fragment, then each node — headings, paragraphs, lists, code blocks, links, images — is walked recursively and replaced with the corresponding Markdown syntax. The result is written directly to the output field without any network request. Your content never leaves your machine.

Turndown is rule-based: each rule pairs a filter that matches one or more HTML elements with a replacement function that produces the Markdown. Elements with no dedicated rule, such as div and span, fall through to a default rule that drops the tag and keeps its converted inner content. Attributes such as inline styles, classes, and data-* values are discarded along the way. The result is clean, portable Markdown without residual HTML noise.
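
The filter-and-replacement idea can be sketched in a few lines without any library. This is a simplified illustration, not Turndown's source: the node shape (`{ tag, children }` objects with plain strings as text nodes), the `rules` array, and the `convert` function are all assumptions made for the example, whereas Turndown walks real DOM nodes.

```javascript
// Each rule pairs a filter (tag names) with a replacement function.
// Hypothetical node shape: { tag, children }, where a plain string is a text node.
const rules = [
  {
    filter: ['h1', 'h2', 'h3'],
    // 'h2' -> level 2 -> '## '
    replacement: (content, node) => '#'.repeat(Number(node.tag[1])) + ' ' + content + '\n\n',
  },
  { filter: ['p'], replacement: (content) => content + '\n\n' },
  { filter: ['strong', 'b'], replacement: (content) => '**' + content + '**' },
  { filter: ['em', 'i'], replacement: (content) => '*' + content + '*' },
];

// Fallback for tags with no rule (div, span, ...): drop the tag, keep the converted children.
const defaultReplacement = (content) => content;

function convert(node) {
  if (typeof node === 'string') return node; // text node
  const content = node.children.map(convert).join(''); // convert children first
  const rule = rules.find((r) => r.filter.includes(node.tag));
  return (rule ? rule.replacement : defaultReplacement)(content, node);
}

const tree = {
  tag: 'div',
  children: [
    { tag: 'h1', children: ['Hello'] },
    { tag: 'p', children: ['A ', { tag: 'strong', children: ['bold'] }, ' word.'] },
  ],
};

const markdown = convert(tree).trim();
console.log(markdown);
// # Hello
//
// A **bold** word.
```

Because children are converted before the parent's rule runs, the wrapper div contributes nothing of its own while the heading and paragraph inside it keep their Markdown form.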

How Turndown converts HTML to Markdown
// Turndown runs this logic in your browser — no server involved:
import TurndownService from 'turndown'

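const td = new TurndownService({
  headingStyle: 'atx',       // # H1, ## H2  (default is setext underlines)
  bulletListMarker: '-',     // - item        (default is *)
  codeBlockStyle: 'fenced',  // ```code```  (default is indented)
  emDelimiter: '*',          // *italic*      (default is _italic_)
  hr: '---',                 // ---           (default is * * *)
})
// Core Turndown has no table or strikethrough rules;
// those come from a plugin such as turndown-plugin-gfm.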

const html = '<h1>Hello</h1><p>A <strong>bold</strong> word.</p>'
const markdown = td.turndown(html)
// Output:
// # Hello
//
// A **bold** word.

Who uses HTML to Markdown conversion

Markdown is the native input format for static site generators (Hugo, Jekyll, Astro, Eleventy), documentation platforms (Docusaurus, MkDocs, GitBook), and developer collaboration tools (GitHub, GitLab, Notion). If your content currently lives as HTML — in a CMS, a legacy site, an email template, or a scraped web page — converting it to Markdown is the fastest route to reuse.

  • Static site migration — converting WordPress or Drupal post bodies to Markdown for Hugo, Astro, or Jekyll.
  • Documentation rebuild — transforming Confluence or legacy HTML docs into MkDocs or Docusaurus source files.
  • README and wiki creation — cleaning up copied web content into a readable GitHub README.
  • Email to documentation — converting HTML email templates into plain, editable Markdown records.
  • CMS export cleanup — post-processing HTML exports from headless CMS platforms into portable Markdown.
  • Developer tooling — preprocessing HTML scraped from web pages before feeding it to Markdown-first editors or LLMs.

HTML elements and their Markdown output

Markdown covers the most common block and inline elements. The table below shows what the converter produces for each HTML tag. Elements not in this set (div, span, section, article, aside) are reduced to their inner content, and attributes such as data-* values and inline styles are dropped.

| HTML tag | Markdown output | Notes |
| --- | --- | --- |
| <h1>–<h6> | # through ###### | ATX-style headings |
| <p> | Blank-line-separated blocks | Standard paragraph separation |
| <strong>, <b> | **text** | Bold emphasis |
| <em>, <i> | *text* | Italic emphasis |
| <a href="url">text</a> | [text](url) | Inline link; href preserved |
| <img src="…" alt="…"> | ![alt](src) | Alt text and src preserved |
| <ul><li> | - item | Unordered list |
| <ol><li> | 1. item | Ordered list; numbers preserved |
| <code> | `code` | Inline code |
| <pre><code> | ```\ncode\n``` | Fenced code block |
| <blockquote> | > text | Block quote |
| <hr> | --- | Horizontal rule |
| <del>, <s> | ~~text~~ | Strikethrough (GFM) |
| <table> | Pipe table | GitHub Flavored Markdown format |

When to convert to Markdown — and when not to

Convert to Markdown when:

  • Target platform is Markdown-native — static site generators, GitHub READMEs, Notion, Obsidian, Bear, Typora.
  • Version control matters — Markdown diffs cleanly in git; HTML tags create visual noise in pull request reviews.
  • Non-developer editors — writers find Markdown syntax easier to read and write than raw HTML tags.
  • Content portability — Markdown is a long-lived plain-text format with no vendor lock-in.
  • Documentation pipelines — tools like Docusaurus, MkDocs, and VitePress use Markdown as their primary source format.

Keep HTML when:

  • Precise layout is required — multi-column grids, absolute positioning, complex CSS class structures.
  • Interactive elements are embedded — forms, custom widgets, iframes, JavaScript components inside the content.
  • The target renders HTML directly — email clients, legacy CMS systems, platforms that do not process Markdown.
  • Complex nested tables — Markdown tables have no merged cells, rowspan, or colspan support.
  • Custom attributes matter — data-*, aria-*, and class values are stripped during conversion.

Conversion edge cases and what to expect

HTML can express far more than Markdown can, so conversion is lossy wherever the source relies on features Markdown lacks. Knowing these edge cases helps you decide when post-processing is needed and avoids surprises in the output.

  • Inline styles are stripped: <p style="color:red"> becomes a plain paragraph. CSS formatting has no Markdown equivalent.
  • Class and id attributes are dropped: <div class="highlight"> loses its class. If client-side JavaScript or CSS depends on these hooks, keep the HTML instead.
  • Nested block elements are unwrapped: for a <div> wrapping a <p>, the paragraph text is preserved and the div is discarded.
  • Image dimensions are lost: <img width="800"> loses its size attributes. Only src and alt (plus title, if present) carry over to the output.
  • Script and style tags have no Markdown equivalent: the tags are dropped, but the code inside them can leak into the output as plain text unless those elements are removed first. Run conversion on content HTML only, not full pages.

Conversion example: HTML → Markdown
// Input HTML (with class, style, and wrapper div):
<div class="post" style="padding: 20px">
  <h2 id="title">Developer Guide</h2>
  <p>A <strong>bold</strong> and <em>italic</em> sentence.</p>
  <ul>
    <li>First item</li>
    <li>Second item</li>
  </ul>
</div>

// After HTML → Markdown conversion:
// ## Developer Guide
//
// A **bold** and *italic* sentence.
//
// - First item
// - Second item
//
// Note: class, id, inline style, and <div> wrapper are removed.
// Text content and semantic structure are fully preserved.
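
The losses described in this section can be flagged before converting. The sketch below is a rough regular-expression heuristic, not an HTML parser, and it is not part of the converter: the pattern list and the `findLossyFeatures` name are assumptions chosen for illustration, and attribute-like text inside attribute values may produce false positives.

```javascript
// Patterns for markup that HTML-to-Markdown conversion drops or flattens.
// The list is illustrative and deliberately incomplete.
const LOSSY_PATTERNS = [
  { label: 'inline style', pattern: /<[^>]+\sstyle\s*=/i },
  { label: 'class or id attribute', pattern: /<[^>]+\s(?:class|id)\s*=/i },
  { label: 'data-* or aria-* attribute', pattern: /<[^>]+\s(?:data|aria)-[\w-]+\s*=/i },
  { label: 'merged table cells', pattern: /<t[dh][^>]*\s(?:colspan|rowspan)\s*=/i },
  { label: 'interactive or script element', pattern: /<(?:form|iframe|script)\b/i },
];

// Return the labels of every pattern that matches somewhere in the HTML string.
function findLossyFeatures(html) {
  return LOSSY_PATTERNS
    .filter(({ pattern }) => pattern.test(html))
    .map(({ label }) => label);
}

const sample =
  '<div class="post" style="padding: 20px"><h2 id="title">Developer Guide</h2></div>';
console.log(findLossyFeatures(sample));
// [ 'inline style', 'class or id attribute' ]
```

An empty result does not guarantee a lossless conversion; it only means none of the listed patterns were found.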

Frequently Asked Questions

What HTML elements does the converter support?
Headings (h1–h6), paragraphs, bold and italic text (strong, b, em, i), links (a), images (img), ordered and unordered lists, inline code and code blocks (code, pre), blockquotes, horizontal rules, strikethrough (del, s), and tables. Elements outside this set (div, span, section, article, custom elements) are unwrapped: the tag is dropped and its inner content is kept.
Will inline styles be preserved in the Markdown output?
No. Markdown has no mechanism for inline CSS, so all style attributes are dropped during conversion. If you need colour, font size, or other visual formatting preserved, Markdown is not the right output format — keep the HTML instead or post-process the result by hand.
How does the converter handle images?
Each <img> tag is converted to Markdown image syntax: ![alt text](src). The src and alt attributes are carried over, and stock Turndown also keeps a title attribute as ![alt text](src "title"). Width, height, class, style, loading, and any other attributes are discarded. If the alt attribute is empty or absent, the output is ![](src).
What happens to links with relative URLs?
Relative URLs are preserved as-is. A link like <a href="/about">About</a> becomes [About](/about) — the path is not resolved to an absolute URL. If you are moving content to a different domain, you will need to update relative links after conversion.
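
If the content is moving to another domain, relative targets can be rewritten after conversion. The following is a minimal sketch under stated assumptions: `absolutizeLinks` and the example base URL are hypothetical, only inline links and images are handled (not reference-style links), and link text containing nested brackets is not supported. It relies on the standard `URL` constructor for resolution.

```javascript
// Rewrite relative targets in inline Markdown links and images to absolute URLs.
function absolutizeLinks(markdown, baseUrl) {
  // Group 1: "[text](" or "![alt](" ; group 2: the target up to whitespace or ")".
  return markdown.replace(/(!?\[[^\]]*\]\()([^)\s]+)/g, (match, prefix, target) => {
    if (target.startsWith('#')) return match; // keep in-page anchors as they are
    try {
      return prefix + new URL(target, baseUrl).href;
    } catch {
      return match; // leave anything the URL parser rejects untouched
    }
  });
}

const converted = 'See [About](/about) and ![logo](img/logo.png "Logo").';
console.log(absolutizeLinks(converted, 'https://example.com/blog/'));
// See [About](https://example.com/about) and ![logo](https://example.com/blog/img/logo.png "Logo").
```

Targets that are already absolute pass through the URL parser and come out essentially unchanged, apart from normalization such as a trailing slash on a bare origin.
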
Does the converter support nested lists?
Yes. Nested <ul> and <ol> elements are converted to indented Markdown lists. Each nested level is indented far enough to line up under the text of its parent item, which is what CommonMark and GitHub Flavored Markdown require; stock Turndown uses four spaces per level.
Can I convert a full HTML page including the <head> section?
Technically yes, but the result will be messy. The <head>, <style>, and <script> tags themselves are dropped, but any text inside them can pass through into the output as stray titles, CSS, or JavaScript. For best results, copy only the content portion of the page (the main body, article, or post), not the full HTML document.
How are HTML tables converted?
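Tables are converted to GitHub Flavored Markdown (GFM) pipe table syntax; in Turndown this comes from a GFM plugin such as turndown-plugin-gfm rather than from the core library. A two-column table with a header row becomes three lines: the header row | Header 1 | Header 2 |, the delimiter row |---|---|, and the body row | Cell 1 | Cell 2 |. Merged cells (colspan, rowspan) are not supported in GFM and will be flattened: the cell content is preserved but the merge is lost.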
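
To make the pipe-table layout concrete, here is a small sketch that builds one from plain arrays. `toPipeTable` is a hypothetical helper written for this example, not a Turndown or plugin function, and it assumes every row has the same number of cells as the header.

```javascript
// Build a GFM pipe table from a header row and body rows of plain strings.
function toPipeTable(header, rows) {
  // Escape literal pipes so they are not read as column breaks,
  // and fold line breaks, which a table cell cannot contain.
  const cell = (text) =>
    String(text).replace(/\|/g, '\\|').replace(/\s*\n\s*/g, ' ').trim();
  const line = (cells) => '| ' + cells.map(cell).join(' | ') + ' |';
  // The delimiter row needs one entry of at least three dashes per column.
  const delimiter = '| ' + header.map(() => '---').join(' | ') + ' |';
  return [line(header), delimiter, ...rows.map(line)].join('\n');
}

console.log(toPipeTable(['Header 1', 'Header 2'], [['Cell 1', 'Cell 2']]));
// | Header 1 | Header 2 |
// | --- | --- |
// | Cell 1 | Cell 2 |
```

The flat row-and-column input is the reason merged cells cannot survive: there is nowhere in this layout to record a colspan or rowspan.
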
Is any data sent to a server during conversion?
No. The entire conversion happens inside your browser using a JavaScript library, and no content is transmitted over the network. Because nothing is uploaded, there are no server-side file size or rate limits; very large inputs are limited only by what your browser can hold in memory. Your HTML is never visible to any server.
What is GitHub Flavored Markdown and does this converter produce it?
GitHub Flavored Markdown (GFM) is a superset of CommonMark, the standardized Markdown specification, that adds tables, strikethrough, task lists, and autolinks. This converter outputs GFM-compatible Markdown: fenced code blocks (already part of CommonMark), pipe tables, and ~~strikethrough~~ are all supported. The Markdown processors behind Hugo, Gatsby, Docusaurus, and most modern static site generators accept the same syntax.
Why does some HTML not convert cleanly to Markdown?
Markdown is intentionally limited in scope: it was designed for readable plain text, not full document layout. Elements with no Markdown counterpart (div, span, section, custom elements) are unwrapped to their inner content, and attributes such as CSS classes are dropped. If your HTML relies heavily on class-based styling or complex layout structure, the Markdown output will be plainer than the original.
Can I convert HTML from a webpage by pasting the source?
Yes. Paste the raw HTML markup (not the rendered text) into the input field. If you want to convert the visible content of a page, use "View Page Source" in the browser, copy the relevant portion (typically the <main> or <article> content), then paste it here.
Does the downloaded file use the correct encoding?
Yes. The downloaded .md file is encoded as UTF-8, which is the standard encoding for Markdown files. All Unicode characters — accented letters, CJK characters, emoji — are preserved correctly in the output.
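
A download of this kind is commonly built on the standard Blob API, which encodes string parts as UTF-8. The sketch below shows one way it could be done and is not the converter's actual code: `toMarkdownBlob`, `downloadMarkdown`, and the default file name are assumptions, and `downloadMarkdown` only works in a browser because it needs `document`.

```javascript
// Encode Markdown text as a UTF-8 Blob. String parts of a Blob are always stored as UTF-8.
function toMarkdownBlob(markdown) {
  return new Blob([markdown], { type: 'text/markdown;charset=utf-8' });
}

// Browser-only: offer the Blob as a file download via a temporary link.
// The default file name is a placeholder.
function downloadMarkdown(markdown, fileName = 'converted.md') {
  const url = URL.createObjectURL(toMarkdownBlob(markdown));
  const link = document.createElement('a');
  link.href = url;
  link.download = fileName;
  link.click();
  URL.revokeObjectURL(url); // release the object URL once the download has been triggered
}

// 'café ☕' is 6 characters but 9 bytes in UTF-8 (é takes 2 bytes, ☕ takes 3).
const blob = toMarkdownBlob('café ☕');
console.log(blob.size); // 9
```

The byte count exceeding the character count is the visible sign that accented letters and symbols are stored as multi-byte UTF-8 sequences rather than being replaced or dropped.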