HTML to Portable Text

Convert HTML into Portable Text — an open specification for structured block content.

 

How it works

The converter is intentionally lossy — messy HTML in, clean structured JSON out. It uses DOMParser to parse your HTML, walks the tree recursively, and maps each supported element to a Portable Text block or mark. Everything else is either recursed into (container elements like div) or silently dropped (images, iframes, inline styles, class attributes).
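The recursive walk described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the converter's actual source: the names NodeLike, BLOCK_STYLES, DROPPED, and walk are hypothetical, inline marks are ignored, and only a few tags are covered.

```typescript
// Minimal structural type; real DOM nodes produced by DOMParser satisfy it.
interface NodeLike {
  nodeType: number; // 1 = element, 3 = text
  nodeName: string; // upper-case tag name for HTML elements
  textContent: string | null;
  childNodes: ArrayLike<NodeLike>;
}

interface Block {
  _type: "block";
  style: string;
  children: { _type: "span"; text: string; marks: string[] }[];
}

// Hypothetical lookup: tags that become a block, and the style they get.
const BLOCK_STYLES: Record<string, string> = {
  P: "normal", H1: "h1", H2: "h2", H3: "h3", H4: "h4", H5: "h5", H6: "h6",
  BLOCKQUOTE: "blockquote",
};

// Hypothetical drop list: never emitted and never recursed into.
const DROPPED = new Set(["IMG", "IFRAME"]);

function walk(node: NodeLike, out: Block[] = []): Block[] {
  for (const child of Array.from(node.childNodes)) {
    // This simplified sketch skips bare text nodes and dropped elements.
    if (child.nodeType !== 1 || DROPPED.has(child.nodeName)) continue;
    const style = BLOCK_STYLES[child.nodeName];
    if (style !== undefined) {
      // Supported element: emit one block holding its plain text.
      out.push({
        _type: "block",
        style,
        children: [{ _type: "span", text: child.textContent ?? "", marks: [] }],
      });
    } else {
      // Container such as <div>: recurse so its children reach the top level.
      walk(child, out);
    }
  }
  return out;
}
```

In a browser, this sketch could be fed the parsed document body, for example walk(new DOMParser().parseFromString(html, "text/html").body).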

Element mapping

HTML                    Portable Text
<p>                     block, style: "normal"
<h1> – <h6>             block, style: "h1" – "h6"
<blockquote>            block, style: "blockquote"
<ul><li>                block, listItem: "bullet", level: 1+
<ol><li>                block, listItem: "number", level: 1+
<pre><code>             block, style: "code"
<strong>, <b>           mark: "strong"
<em>, <i>               mark: "em"
<code>                  mark: "code"
<s>, <del>              mark: "strike-through"
<u>                     mark: "underline"
<a href="...">          mark annotation: { _type: "link", href }
<div>, <section>, ...   recursed — children promoted to top level
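The level: 1+ entries for list items suggest that nested lists are flattened into sibling blocks whose level records the nesting depth. A minimal sketch of that idea follows. It assumes a simplified input tree instead of real DOM nodes, and the names ListNode, ItemNode, and flattenList are hypothetical, not taken from the converter.

```typescript
// Simplified stand-in for a parsed <ul>/<ol> tree (an assumption for this sketch).
type ListNode = { tag: "ul" | "ol"; items: ItemNode[] };
type ItemNode = { text: string; nested?: ListNode[] };

interface ListItemBlock {
  _type: "block";
  style: "normal";
  listItem: "bullet" | "number";
  level: number;
  children: { _type: "span"; text: string; marks: string[] }[];
}

function flattenList(
  list: ListNode,
  level = 1,
  out: ListItemBlock[] = [],
): ListItemBlock[] {
  const listItem = list.tag === "ul" ? "bullet" : "number";
  for (const item of list.items) {
    // Each <li> becomes its own block; depth is carried by `level`.
    out.push({
      _type: "block",
      style: "normal",
      listItem,
      level,
      children: [{ _type: "span", text: item.text, marks: [] }],
    });
    // Nested lists follow their parent item, one level deeper.
    for (const nested of item.nested ?? []) flattenList(nested, level + 1, out);
  }
  return out;
}
```

The output is a flat array, so a renderer has to rebuild the nesting from consecutive blocks and their level values.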

Marks and nested marks

Portable Text flattens nested inline formatting onto each span rather than nesting elements. A span can carry multiple marks at once:

{ "_type": "span", "text": "hello", "marks": ["strong", "em"] }
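One way to produce such spans is to carry the set of active marks down the inline tree and stamp it onto every text leaf. The sketch below assumes a simplified inline tree in place of DOM nodes. Inline, MARKS, and toSpans are hypothetical names, and links are left out.

```typescript
// Simplified inline tree: a string is a text leaf, an object is an element.
type Inline = string | { tag: string; children: Inline[] };

interface Span {
  _type: "span";
  text: string;
  marks: string[];
}

// Tag-to-mark lookup following the element mapping table.
const MARKS: Record<string, string> = {
  strong: "strong", b: "strong",
  em: "em", i: "em",
  code: "code",
  s: "strike-through", del: "strike-through",
  u: "underline",
};

function toSpans(nodes: Inline[], active: string[] = []): Span[] {
  const spans: Span[] = [];
  for (const node of nodes) {
    if (typeof node === "string") {
      // Text leaf: copy the marks collected on the way down.
      spans.push({ _type: "span", text: node, marks: [...active] });
      continue;
    }
    const mark = MARKS[node.tag];
    // Add the element's mark once; unknown tags pass the marks through unchanged.
    const next =
      mark !== undefined && !active.includes(mark) ? [...active, mark] : active;
    spans.push(...toSpans(node.children, next));
  }
  return spans;
}
```

For example, the tree for <strong>a <em>hello</em></strong> yields two spans: "a " with marks ["strong"], and "hello" with marks ["strong", "em"].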

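Links work differently from simple marks. The span's marks array holds only a key, and that key matches the _key of an entry in the block's markDefs array. This keeps the inline span small while the full link data (_type: "link" and href) lives alongside it on the block.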
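To make the key lookup concrete, here is a small sketch of a block with one link and a helper that resolves a span's marks against markDefs. The key "link1", the example URL, and the helper name hrefsFor are made up for illustration. Real converters typically generate random keys.

```typescript
interface LinkDef {
  _key: string;
  _type: "link";
  href: string;
}

interface Span {
  _type: "span";
  text: string;
  marks: string[];
}

interface Block {
  _type: "block";
  style: string;
  markDefs: LinkDef[];
  children: Span[];
}

// "Read the <a href="https://example.com"><strong>docs</strong></a>"
const block: Block = {
  _type: "block",
  style: "normal",
  markDefs: [{ _key: "link1", _type: "link", href: "https://example.com" }],
  children: [
    { _type: "span", text: "Read the ", marks: [] },
    // "strong" is a plain mark; "link1" is a key pointing into markDefs.
    { _type: "span", text: "docs", marks: ["strong", "link1"] },
  ],
};

// Return the href of every mark on the span that matches a markDefs key.
function hrefsFor(parent: Block, span: Span): string[] {
  const byKey = new Map<string, string>();
  for (const def of parent.markDefs) byKey.set(def._key, def.href);
  const hrefs: string[] = [];
  for (const mark of span.marks) {
    const href = byKey.get(mark);
    if (href !== undefined) hrefs.push(href);
  }
  return hrefs;
}
```

Marks with no matching key, such as "strong", are treated as plain decorators and ignored by the lookup.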

Further reading