Using unified
This guide shows how unified can be used to transform a markdown file to HTML. It also shows how to generate a table of contents and sidesteps into checking prose.
Stuck? Have an idea for another guide? See
support.md
.
Contents
Tree transformations
For this example we start out with markdown content and then turn it into HTML. We need something to parse markdown and something to compile (stringify) HTML for that. The relevant projects are respectively remark-parse
and rehype-stringify
. To transform between the two syntaxes we use remark-rehype
. Finally, we use unified itself to glue these together.
First set up a project. Create a folder example
, enter it, and initialize a new package:
mkdir example
cd example
npm init -y
Then make sure the project is a module so that import
and export
work by specifying "type": "module"
:
--- a/package.json
+++ b/package.json
@@ -1,6 +1,7 @@
{
"name": "example",
"version": "1.0.0",
+ "type": "module",
"main": "index.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
Now let’s install the needed dependencies with npm, which comes bundled with Node.js.
npm install rehype-stringify remark-parse remark-rehype unified
Now create a markdown file, example.md
, that we’re going to transform.
# Pluto
Pluto is an dwarf planet in the Kuiper belt.
## Contents
## History
### Discovery
In the 1840s, Urbain Le Verrier used Newtonian mechanics to predict the
position of…
### Name and symbol
The name Pluto is for the Roman god of the underworld, from a Greek epithet for
Hades…
### Planet X disproved
Once Pluto was found, its faintness and lack of a viewable disc cast doubt…
## Orbit
Pluto’s orbital period is about 248 years…
Then create index.js
as well. It transforms markdown to HTML:
import fs from 'node:fs/promises'
import rehypeStringify from 'rehype-stringify'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import {unified} from 'unified'
const document = await fs.readFile('example.md', 'utf8')
const file = await unified()
.use(remarkParse)
.use(remarkRehype)
.use(rehypeStringify).process(document)
console.log(String(file))
(alias) module "node:fs/promises"
import fs
(alias) const rehypeStringify: Plugin<[(Options | null | undefined)?], Root, string>
import rehypeStringify
Plugin to add support for serializing as HTML.
- @this processor.
- @param Configuration (optional).
- @returns Nothing.
(alias) const remarkParse: Plugin<[(Readonly<Options> | null | undefined)?], string, Root>
import remarkParse
Add support for parsing from markdown.
- @this processor.
- @param Configuration (optional).
- @returns Nothing.
(alias) function remarkRehype(processor: Processor, options?: Readonly<Options> | null | undefined): TransformBridge (+1 overload)
import remarkRehype
Turn markdown into HTML.
Notes
Signature
- if a processor is given, runs the (rehype) plugins used on it with a hast tree, then discards the result (bridge mode)
- otherwise, returns a hast tree, the plugins used after
remarkRehype
are rehype plugins (mutate mode)
👉 Note: It’s highly unlikely that you want to pass a
processor
.
HTML
Raw HTML is available in mdast as html
nodes and can be embedded in hast as semistandard raw
nodes. Most plugins ignore raw
nodes but two notable ones don’t:
rehype-stringify
also has an optionallowDangerousHtml
which will output the raw HTML. This is typically discouraged as noted by the option name but is useful if you completely trust authorsrehype-raw
can handle the raw embedded HTML strings by parsing them into standard hast nodes (element
,text
, etc). This is a heavy task as it needs a full HTML parser, but it is the only way to support untrusted content
Footnotes
Many options supported here relate to footnotes. Footnotes are not specified by CommonMark, which we follow by default. They are supported by GitHub, so footnotes can be enabled in markdown with remark-gfm
.
The options footnoteBackLabel
and footnoteLabel
define natural language that explains footnotes, which is hidden for sighted users but shown to assistive technology. When your page is not in English, you must define translated values.
Back references use ARIA attributes, but the section label itself uses a heading that is hidden with an sr-only
class. To show it to sighted users, define different attributes in footnoteLabelProperties
.
Clobbering
Footnotes introduces a problem, as it links footnote calls to footnote definitions on the page through id
attributes generated from user content, which results in DOM clobbering.
DOM clobbering is this:
<p id=x></p>
<script>alert(x) // `x` now refers to the DOM `p#x` element</script>
Elements by their ID are made available by browsers on the window
object, which is a security risk. Using a prefix solves this problem.
More information on how to handle clobbering and the prefix is explained in Example: headings (DOM clobbering) in rehype-sanitize
.
Unknown nodes
Unknown nodes are nodes with a type that isn’t in handlers
or passThrough
. The default behavior for unknown nodes is:
- when the node has a
value
(and doesn’t havedata.hName
,data.hProperties
, ordata.hChildren
, see later), create a hasttext
node - otherwise, create a
<div>
element (which could be changed withdata.hName
), with its children mapped from mdast to hast as well
This behavior can be changed by passing an unknownHandler
.
- @overload
- @overload
- @param destination Processor or configuration (optional).
- @param options When a processor was given, configuration (optional).
- @returns Transform.
(alias) const unified: Processor<undefined, undefined, undefined, undefined, undefined>
import unified
Create a new processor.
- @example This example shows how a new processor can be created (from
remark
) and linked to stdin(4) and stdout(4).import process from 'node:process' import concatStream from 'concat-stream' import {remark} from 'remark' process.stdin.pipe( concatStream(function (buf) { process.stdout.write(String(remark().processSync(buf))) }) )
- @returns New unfrozen processor (
processor
). This processor is configured to work the same as its ancestor. When the descendant processor is configured in the future it does not affect the ancestral processor.
const document: string
(alias) module "node:fs/promises"
import fs
function readFile(path: PathLike | fs.FileHandle, options: ({
encoding: BufferEncoding;
flag?: OpenMode | undefined;
} & EventEmitter<T extends EventMap<T> = DefaultEventMap>.Abortable) | BufferEncoding): Promise<string> (+2 overloads)
Asynchronously reads the entire contents of a file.
- @param path A path to a file. If a URL is provided, it must use the
file:
protocol. If aFileHandle
is provided, the underlying file will not be closed automatically. - @param options An object that may contain an optional flag. If a flag is not provided, it defaults to
'r'
.
const file: VFile
(alias) unified(): Processor<undefined, undefined, undefined, undefined, undefined>
import unified
Create a new processor.
- @example This example shows how a new processor can be created (from
remark
) and linked to stdin(4) and stdout(4).import process from 'node:process' import concatStream from 'concat-stream' import {remark} from 'remark' process.stdin.pipe( concatStream(function (buf) { process.stdout.write(String(remark().processSync(buf))) }) )
- @returns New unfrozen processor (
processor
). This processor is configured to work the same as its ancestor. When the descendant processor is configured in the future it does not affect the ancestral processor.
(method) Processor<undefined, undefined, undefined, undefined, undefined>.use<[], string, Root>(plugin: Plugin<[], string, Root>, ...parameters: [] | [boolean]): Processor<Root, undefined, undefined, undefined, undefined> (+2 overloads)
Configure the processor to use a plugin, a list of usable values, or a preset.
If the processor is already using a plugin, the previous plugin configuration is changed based on the options that are passed in. In other words, the plugin is not added a second time.
Note:
use
cannot be called on frozen processors. Call the processor first to create a new unfrozen processor.
- @example There are many ways to pass plugins to
.use()
. This example gives an overview:import {unified} from 'unified' unified() // Plugin with options: .use(pluginA, {x: true, y: true}) // Passing the same plugin again merges configuration (to `{x: true, y: false, z: true}`): .use(pluginA, {y: false, z: true}) // Plugins: .use([pluginB, pluginC]) // Two plugins, the second with options: .use([pluginD, [pluginE, {}]]) // Preset with plugins and settings: .use({plugins: [pluginF, [pluginG, {}]], settings: {position: false}}) // Settings only: .use({settings: {position: false}})
- @template {Array} [Parameters=[]]
- @template {Node | string | undefined} [Input=undefined]
- @template [Output=Input]
- @overload
- @overload
- @overload
- @param value Usable value.
- @param parameters Parameters, when a plugin is given as a usable value.
- @returns Current processor.
(alias) const remarkParse: Plugin<[(Readonly<Options> | null | undefined)?], string, Root>
import remarkParse
Add support for parsing from markdown.
- @this processor.
- @param Configuration (optional).
- @returns Nothing.
(method) Processor<Root, undefined, undefined, undefined, undefined>.use<[], Root, Root>(plugin: Plugin<[], Root, Root>, ...parameters: [] | [boolean]): Processor<Root, Root, Root, undefined, undefined> (+2 overloads)
Configure the processor to use a plugin, a list of usable values, or a preset.
If the processor is already using a plugin, the previous plugin configuration is changed based on the options that are passed in. In other words, the plugin is not added a second time.
Note:
use
cannot be called on frozen processors. Call the processor first to create a new unfrozen processor.
- @example There are many ways to pass plugins to
.use()
. This example gives an overview:import {unified} from 'unified' unified() // Plugin with options: .use(pluginA, {x: true, y: true}) // Passing the same plugin again merges configuration (to `{x: true, y: false, z: true}`): .use(pluginA, {y: false, z: true}) // Plugins: .use([pluginB, pluginC]) // Two plugins, the second with options: .use([pluginD, [pluginE, {}]]) // Preset with plugins and settings: .use({plugins: [pluginF, [pluginG, {}]], settings: {position: false}}) // Settings only: .use({settings: {position: false}})
- @template {Array} [Parameters=[]]
- @template {Node | string | undefined} [Input=undefined]
- @template [Output=Input]
- @overload
- @overload
- @overload
- @param value Usable value.
- @param parameters Parameters, when a plugin is given as a usable value.
- @returns Current processor.
(alias) function remarkRehype(processor: Processor, options?: Readonly<Options> | null | undefined): TransformBridge (+1 overload)
import remarkRehype
Turn markdown into HTML.
Notes
Signature
- if a processor is given, runs the (rehype) plugins used on it with a hast tree, then discards the result (bridge mode)
- otherwise, returns a hast tree, the plugins used after
remarkRehype
are rehype plugins (mutate mode)
👉 Note: It’s highly unlikely that you want to pass a
processor
.
HTML
Raw HTML is available in mdast as html
nodes and can be embedded in hast as semistandard raw
nodes. Most plugins ignore raw
nodes but two notable ones don’t:
rehype-stringify
also has an optionallowDangerousHtml
which will output the raw HTML. This is typically discouraged as noted by the option name but is useful if you completely trust authorsrehype-raw
can handle the raw embedded HTML strings by parsing them into standard hast nodes (element
,text
, etc). This is a heavy task as it needs a full HTML parser, but it is the only way to support untrusted content
Footnotes
Many options supported here relate to footnotes. Footnotes are not specified by CommonMark, which we follow by default. They are supported by GitHub, so footnotes can be enabled in markdown with remark-gfm
.
The options footnoteBackLabel
and footnoteLabel
define natural language that explains footnotes, which is hidden for sighted users but shown to assistive technology. When your page is not in English, you must define translated values.
Back references use ARIA attributes, but the section label itself uses a heading that is hidden with an sr-only
class. To show it to sighted users, define different attributes in footnoteLabelProperties
.
Clobbering
Footnotes introduces a problem, as it links footnote calls to footnote definitions on the page through id
attributes generated from user content, which results in DOM clobbering.
DOM clobbering is this:
<p id=x></p>
<script>alert(x) // `x` now refers to the DOM `p#x` element</script>
Elements by their ID are made available by browsers on the window
object, which is a security risk. Using a prefix solves this problem.
More information on how to handle clobbering and the prefix is explained in Example: headings (DOM clobbering) in rehype-sanitize
.
Unknown nodes
Unknown nodes are nodes with a type that isn’t in handlers
or passThrough
. The default behavior for unknown nodes is:
- when the node has a
value
(and doesn’t havedata.hName
,data.hProperties
, ordata.hChildren
, see later), create a hasttext
node - otherwise, create a
<div>
element (which could be changed withdata.hName
), with its children mapped from mdast to hast as well
This behavior can be changed by passing an unknownHandler
.
- @overload
- @overload
- @param destination Processor or configuration (optional).
- @param options When a processor was given, configuration (optional).
- @returns Transform.
(method) Processor<Root, Root, Root, undefined, undefined>.use<[], Root, string>(plugin: Plugin<[], Root, string>, ...parameters: [] | [boolean]): Processor<Root, Root, Root, Root, string> (+2 overloads)
Configure the processor to use a plugin, a list of usable values, or a preset.
If the processor is already using a plugin, the previous plugin configuration is changed based on the options that are passed in. In other words, the plugin is not added a second time.
Note:
use
cannot be called on frozen processors. Call the processor first to create a new unfrozen processor.
- @example There are many ways to pass plugins to
.use()
. This example gives an overview:import {unified} from 'unified' unified() // Plugin with options: .use(pluginA, {x: true, y: true}) // Passing the same plugin again merges configuration (to `{x: true, y: false, z: true}`): .use(pluginA, {y: false, z: true}) // Plugins: .use([pluginB, pluginC]) // Two plugins, the second with options: .use([pluginD, [pluginE, {}]]) // Preset with plugins and settings: .use({plugins: [pluginF, [pluginG, {}]], settings: {position: false}}) // Settings only: .use({settings: {position: false}})
- @template {Array} [Parameters=[]]
- @template {Node | string | undefined} [Input=undefined]
- @template [Output=Input]
- @overload
- @overload
- @overload
- @param value Usable value.
- @param parameters Parameters, when a plugin is given as a usable value.
- @returns Current processor.
(alias) const rehypeStringify: Plugin<[(Options | null | undefined)?], Root, string>
import rehypeStringify
Plugin to add support for serializing as HTML.
- @this processor.
- @param Configuration (optional).
- @returns Nothing.
(method) Processor<Root, Root, Root, Root, string>.process(file?: Compatible | undefined): Promise<VFile> (+1 overload)
Process the given file as configured on the processor.
Note:
process
freezes the processor if not already frozen.
Note:
process
performs the parse, run, and stringify phases.
- @overload
- @overload
- @param file File (optional); typically
string
orVFile
]; any value accepted asx
innew VFile(x)
. - @param done Callback (optional).
- @returns Nothing if
done
is given. Otherwise a promise, rejected with a fatal error or resolved with the processed file. The parsed, transformed, and compiled value is available atfile.value
(see note).Note: unified typically compiles by serializing: most compilers return
string
(orUint8Array
). Some compilers, such as the one configured withrehype-react
, return other values (in this case, a React tree). If you’re using a compiler that doesn’t serialize, expect different result values.To register custom results in TypeScript, add them to {@linkcode CompileResultMap}.
const document: string
namespace console
var console: Console
The console
module provides a simple debugging console that is similar to the JavaScript console mechanism provided by web browsers.
The module exports two specific components:
- A
Console
class with methods such asconsole.log()
,console.error()
andconsole.warn()
that can be used to write to any Node.js stream. - A global
console
instance configured to write toprocess.stdout
andprocess.stderr
. The globalconsole
can be used without importing thenode:console
module.
Warning: The global console object's methods are neither consistently synchronous like the browser APIs they resemble, nor are they consistently asynchronous like all other Node.js streams. See the note on process I/O
for more information.
Example using the global console
:
console.log('hello world');
// Prints: hello world, to stdout
console.log('hello %s', 'world');
// Prints: hello world, to stdout
console.error(new Error('Whoops, something bad happened'));
// Prints error message and stack trace to stderr:
// Error: Whoops, something bad happened
// at [eval]:5:15
// at Script.runInThisContext (node:vm:132:18)
// at Object.runInThisContext (node:vm:309:38)
// at node:internal/process/execution:77:19
// at [eval]-wrapper:6:22
// at evalScript (node:internal/process/execution:76:60)
// at node:internal/main/eval_string:23:3
const name = 'Will Robinson';
console.warn(`Danger ${name}! Danger!`);
// Prints: Danger Will Robinson! Danger!, to stderr
Example using the Console
class:
const out = getStreamSomehow();
const err = getStreamSomehow();
const myConsole = new console.Console(out, err);
myConsole.log('hello world');
// Prints: hello world, to out
myConsole.log('hello %s', 'world');
// Prints: hello world, to out
myConsole.error(new Error('Whoops, something bad happened'));
// Prints: [Error: Whoops, something bad happened], to err
const name = 'Will Robinson';
myConsole.warn(`Danger ${name}! Danger!`);
// Prints: Danger Will Robinson! Danger!, to err
- @see source
(method) Console.log(message?: any, ...optionalParams: any[]): void
Prints to stdout
with newline. Multiple arguments can be passed, with the first used as the primary message and all additional used as substitution values similar to printf(3)
(the arguments are all passed to util.format()
).
const count = 5;
console.log('count: %d', count);
// Prints: count: 5, to stdout
console.log('count:', count);
// Prints: count: 5, to stdout
See util.format()
for more information.
- @since v0.1.100
var String: StringConstructor
(value?: any) => string
Allows manipulation and formatting of text strings and determination and location of substrings within strings.
const file: VFile
Now, running our module with Node:
node index.js
…gives us an example.html
file that looks as follows:
<h1>Pluto</h1>
<p>Pluto is an dwarf planet in the Kuiper belt.</p>
<h2>Contents</h2>
<h2>History</h2>
<h3>Discovery</h3>
<p>In the 1840s, Urbain Le Verrier used Newtonian mechanics to predict the
position of…</p>
<h3>Name and symbol</h3>
<p>The name Pluto is for the Roman god of the underworld, from a Greek epithet for
Hades…</p>
<h3>Planet X disproved</h3>
<p>Once Pluto was found, its faintness and lack of a viewable disc cast doubt…</p>
<h2>Orbit</h2>
<p>Pluto’s orbital period is about 248 years…</p>
👉 Note that
remark-rehype
doesn’t deal with HTML inside the markdown. See HTML and remark for more info.
🎉 Nifty! It doesn’t do much yet. We’ll get there. In the next section, we make this more useful by introducing plugins.
Plugins
We’re still missing some things Notably a table of contents and proper HTML document structure.
We can use rehype-slug
and remark-toc
for the former and rehype-document
for the latter task.
npm install rehype-document rehype-slug remark-toc
Let’s now use those two as well, by modifying our index.js
file:
--- a/index.js
+++ b/index.js
@@ -1,14 +1,20 @@
import fs from 'node:fs/promises'
+import rehypeDocument from 'rehype-document'
+import rehypeSlug from 'rehype-slug'
import rehypeStringify from 'rehype-stringify'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
+import remarkToc from 'remark-toc'
import {unified} from 'unified'
const document = await fs.readFile('example.md', 'utf8')
const file = await unified()
.use(remarkParse)
+ .use(remarkToc)
.use(remarkRehype)
+ .use(rehypeSlug)
+ .use(rehypeDocument, {title: 'Pluto'})
.use(rehypeStringify)
.process(document)
We pass options to rehype-document
. In this case, we use that to make sure we get a proper <title>
element in our <head>
, as required by the HTML specification. More options are accepted by rehype-document
, such as which language tag to use. These are described in detail in its readme.md
. Many other plugins accept options as well, so make sure to read through their docs to learn more.
👉 Note that remark plugins work on a markdown tree. rehype plugins work on an HTML tree. It’s important that you place your
.use
calls in the correct places: plugins are order sensitive!
When running our module like before, we’d get the following example.html
file:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Pluto</title>
<meta content="width=device-width, initial-scale=1" name="viewport">
</head>
<body>
<h1 id="pluto">Pluto</h1>
<p>Pluto is an dwarf planet in the Kuiper belt.</p>
<h2 id="contents">Contents</h2>
<ul>
<li><a href="#history">History</a>
<ul>
<li><a href="#discovery">Discovery</a></li>
<li><a href="#name-and-symbol">Name and symbol</a></li>
<li><a href="#planet-x-disproved">Planet X disproved</a></li>
</ul>
</li>
<li><a href="#orbit">Orbit</a></li>
</ul>
<h2 id="history">History</h2>
<h3 id="discovery">Discovery</h3>
<p>In the 1840s, Urbain Le Verrier used Newtonian mechanics to predict the
position of…</p>
<h3 id="name-and-symbol">Name and symbol</h3>
<p>The name Pluto is for the Roman god of the underworld, from a Greek epithet for
Hades…</p>
<h3 id="planet-x-disproved">Planet X disproved</h3>
<p>Once Pluto was found, its faintness and lack of a viewable disc cast doubt…</p>
<h2 id="orbit">Orbit</h2>
<p>Pluto’s orbital period is about 248 years…</p>
</body>
</html>
👉 Note that the document isn’t formatted nicely. There’s a plugin for that though! Feel free to add
rehype-format
to the plugins. Right afterrehypeDocument
!
💯 You’re acing it! This is getting pretty useful, right?
In the next section, we lay the groundwork for creating a report.
Reporting
Before we check some prose, let’s first switch up our index.js
file to print a pretty report.
We can use to-vfile
to read and write virtual files from the file system. Then we can use vfile-reporter
to report messages relating to those files. Let’s install those.
npm install to-vfile vfile-reporter
…and then use vfile in our example instead, like so:
--- a/index.js
+++ b/index.js
@@ -1,21 +1,24 @@
-import fs from 'node:fs/promises'
import rehypeDocument from 'rehype-document'
import rehypeSlug from 'rehype-slug'
import rehypeStringify from 'rehype-stringify'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
import remarkToc from 'remark-toc'
+import {read, write} from 'to-vfile'
import {unified} from 'unified'
+import {reporter} from 'vfile-reporter'
-const document = await fs.readFile('example.md', 'utf8')
+const file = await read('example.md')
-const file = await unified()
+await unified()
.use(remarkParse)
.use(remarkToc)
.use(remarkRehype)
.use(rehypeSlug)
.use(rehypeDocument, {title: 'Pluto'})
.use(rehypeStringify)
- .process(document)
+ .process(file)
-console.log(String(file))
+console.error(reporter(file))
+file.extname = '.html'
+await write(file)
If we now run our module on its own we get a report showing everything’s fine:
$ node index.js
example.md: no issues found
But everything’s not fine: there’s a typo in the markdown! The next section shows how to detect prose errors by adding retext.
Checking prose
I did notice a typo in there. So let’s check some prose to prevent that from happening in the future. We can use retext and its ecosystem for our natural language parsing. As we’re writing in English, we use retext-english
specifically to parse English natural language. The problem in our example.md
file is that it has an dwarf planet
instead of a dwarf planet
, which is conveniently checked for by retext-indefinite-article
. To bridge from markup to prose we use remark-retext
. Let’s install these dependencies as well.
npm install remark-retext retext-english retext-indefinite-article
…and change our index.js
like so:
--- a/index.js
+++ b/index.js
@@ -3,7 +3,10 @@ import rehypeSlug from 'rehype-slug'
import rehypeStringify from 'rehype-stringify'
import remarkParse from 'remark-parse'
import remarkRehype from 'remark-rehype'
+import remarkRetext from 'remark-retext'
import remarkToc from 'remark-toc'
+import retextEnglish from 'retext-english'
+import retextIndefiniteArticle from 'retext-indefinite-article'
import {read, write} from 'to-vfile'
import {unified} from 'unified'
import {reporter} from 'vfile-reporter'
@@ -12,6 +15,8 @@ const file = await read('example.md')
await unified()
.use(remarkParse)
+ // @ts-expect-error: fine.
+ .use(remarkRetext, unified().use(retextEnglish).use(retextIndefiniteArticle))
.use(remarkToc)
.use(remarkRehype)
.use(rehypeSlug)
As the code shows, remark-retext
receives another unified
pipeline. A natural language pipeline. The plugin will transform the origin syntax (markdown) with the parser defined on the given pipeline. Then it runs the attached plugins on the natural language syntax tree.
Now when running our module one final time:
$ node index.js
example.md
3:10-3:12 warning Unexpected article `an` before `dwarf`, expected `a` retext-indefinite-article retext-indefinite-article
⚠ 1 warning
…we get a useful message.
💃 You’ve got a really cool system set up already. Nicely done! That’s a wrap though, check out the next section for further exercises and resources.
Further exercises
Finally, check out the lists of available plugins for retext, remark, and rehype, and try some of them out.
If you haven’t already, check out the other articles in the learn section!