Create a rehype plugin

This guide shows how to create a plugin for rehype that adds id attributes to headings.

Stuck? Have an idea for another guide? See support.md.

Case

Before we start, let’s first outline what we want to make. Say we have the following file:

<h1>Solar System</h1>
<h2>Formation and evolution</h2>
<h2>Structure and composition</h2>
<h3>Orbits</h3>
<h3>Composition</h3>
<h3>Distances and scales</h3>
<h3>Interplanetary environment</h3>
<p>…</p>

And we’d like to turn that into:

<h1 id="solar-system">Solar System</h1>
<h2 id="formation-and-evolution">Formation and evolution</h2>
<h2 id="structure-and-composition">Structure and composition</h2>
<h3 id="orbits">Orbits</h3>
<h3 id="composition">Composition</h3>
<h3 id="distances-and-scales">Distances and scales</h3>
<h3 id="interplanetary-environment">Interplanetary environment</h3>
<p>…</p>

In the next step we’ll write the code to use our plugin.

Setting up

Let’s set up a project. Create a folder, example, enter it, and initialize a new project:

mkdir example
cd example
npm init -y

Then make sure the project is a module, so that import and export work, by changing package.json:

--- a/package.json
+++ b/package.json
@@ -1,6 +1,7 @@
 {
   "name": "example",
   "version": "1.0.0",
+  "type": "module",
   "main": "index.js",
   "scripts": {
     "test": "echo \"Error: no test specified\" && exit 1"

Make sure input.html exists with:

<h1>Solar System</h1>
<h2>Formation and evolution</h2>
<h2>Structure and composition</h2>
<h3>Orbits</h3>
<h3>Composition</h3>
<h3>Distances and scales</h3>
<h3>Interplanetary environment</h3>
<p>…</p>

Now, let’s create an example.js file that will process our file and write the result to output.html.

import fs from 'node:fs/promises'
import {rehype} from 'rehype'
import rehypeSlug from './plugin.js'

const document = await fs.readFile('input.html', 'utf8')

const file = await rehype()
  .data('settings', {fragment: true})
  .use(rehypeSlug)
  .process(document)

await fs.writeFile('output.html', String(file))

Don’t forget to npm install rehype!

If you read the guide on using unified, you’ll see some familiar statements. First, we load dependencies, then we read the file in. We process that file with the plugin we’ll create next and finally we write it out again.

Note that we directly depend on rehype. This is a package that exposes a unified processor, and comes with the HTML parser and HTML compiler attached.

Now we’ve got everything set up except for the plugin itself. We’ll do that in the next section.

Plugin

We’ll need a plugin and for our case also a transform. Let’s create them in our plugin file plugin.js:

/**
 * @import {Root} from 'hast'
 */

/**
 * Add `id`s to headings.
 *
 * @returns
 *   Transform.
 */
export default function rehypeSlug() {
  /**
   * @param {Root} tree
   * @return {undefined}
   */
  return function (tree) {
  }
}

That’s how most plugins start: a function (the plugin) that returns another function (the transform).
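
To make the two roles concrete, here is a small dependency-free sketch. The `applyPlugin` helper and `countingPlugin` are hypothetical stand-ins invented for illustration; they are not how unified is implemented. They only show the general idea that the outer function runs once when the plugin is set up, while the returned transform runs once per tree.

```javascript
// Hypothetical stand-in for what a processor does with a plugin.
// This is NOT unified's implementation, only an illustration of the shape.
function applyPlugin(plugin, trees) {
  // The plugin itself is called once, when it is configured.
  const transform = plugin()

  // The transform it returned is called once per document tree.
  for (const tree of trees) {
    transform(tree)
  }
}

let pluginCalls = 0
let transformCalls = 0

// Hypothetical plugin that counts how often each function runs.
function countingPlugin() {
  pluginCalls++

  return function (tree) {
    transformCalls++
    tree.seen = true
  }
}

const trees = [
  {type: 'root', children: []},
  {type: 'root', children: []}
]

applyPlugin(countingPlugin, trees)

console.log(pluginCalls, transformCalls) // 1 2
```

Anything declared in the outer function is shared by every document, and anything declared in the transform is created fresh for each one.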

Next, for this use case, we can walk the tree and change nodes with unist-util-visit. That’s how many plugins work.

Let’s start by using unist-util-visit to look for headings:

--- a/plugin.js
+++ b/plugin.js
@@ -2,6 +2,8 @@
  * @import {Root} from 'hast'
  */

+import {visit} from 'unist-util-visit'
+
 /**
  * Add `id`s to headings.
  *
@@ -14,5 +16,17 @@ export default function rehypeSlug() {
    * @return {undefined}
    */
   return function (tree) {
+    visit(tree, 'element', function (node) {
+      if (
+        node.tagName === 'h1' ||
+        node.tagName === 'h2' ||
+        node.tagName === 'h3' ||
+        node.tagName === 'h4' ||
+        node.tagName === 'h5' ||
+        node.tagName === 'h6'
+      ) {
+        console.log(node)
+      }
+    })
   }
 }

Don’t forget to npm install unist-util-visit!

If we now run our example with Node.js, we’ll see that console.log is called:

node example.js
{
  type: 'element',
  tagName: 'h1',
  properties: {},
  children: [ { type: 'text', value: 'Solar System', position: [Object] } ],
  position: …
}
{
  type: 'element',
  tagName: 'h2',
  properties: {},
  children: [
    {
      type: 'text',
      value: 'Formation and evolution',
      position: [Object]
    }
  ],
  position: …
}
…

This output shows that we find our heading elements. That’s what we want.
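
If the traversal feels opaque, the following dependency-free sketch shows roughly what such a walk does. `walkElements` is a hypothetical, much simplified stand-in for `visit(tree, 'element', …)`; the real utility supports more. `isHeading` is also an assumption: a regular expression used here as a compact alternative to the six `tagName` comparisons in the plugin.

```javascript
// Simplified, hypothetical stand-in for `visit(tree, 'element', visitor)`.
// It calls `visitor` for every element, parents before children.
function walkElements(node, visitor) {
  if (node.type === 'element') {
    visitor(node)
  }

  if (Array.isArray(node.children)) {
    for (const child of node.children) {
      walkElements(child, visitor)
    }
  }
}

// Hypothetical helper: matches `h1` through `h6`.
function isHeading(node) {
  return /^h[1-6]$/.test(node.tagName)
}

// A hand-written tree in the same shape as the logged nodes above
// (without `position` fields).
const tree = {
  type: 'root',
  children: [
    {
      type: 'element',
      tagName: 'h1',
      properties: {},
      children: [{type: 'text', value: 'Solar System'}]
    },
    {
      type: 'element',
      tagName: 'section',
      properties: {},
      children: [
        {
          type: 'element',
          tagName: 'h3',
          properties: {},
          children: [{type: 'text', value: 'Orbits'}]
        }
      ]
    },
    {
      type: 'element',
      tagName: 'p',
      properties: {},
      children: [{type: 'text', value: '…'}]
    }
  ]
}

const found = []

walkElements(tree, function (node) {
  if (isHeading(node)) found.push(node.tagName)
})

console.log(found) // [ 'h1', 'h3' ]
```

The nested `h3` is found because the walk descends into every element, not just the top-level children.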

Next we want to get a string representation of what is inside the headings. There’s another utility for that: hast-util-to-string.

--- a/plugin.js
+++ b/plugin.js
@@ -2,6 +2,7 @@
  * @import {Root} from 'hast'
  */

+import {toString} from 'hast-util-to-string'
 import {visit} from 'unist-util-visit'

 /**
@@ -25,7 +26,8 @@ export default function rehypeSlug() {
         node.tagName === 'h5' ||
         node.tagName === 'h6'
       ) {
-        console.log(node)
+        const value = toString(node)
+        console.log(value)
       }
     })
   }

Don’t forget to npm install hast-util-to-string!

If we now run our example with Node.js, we’ll see the text printed:

node example.js
Solar System
Formation and evolution
Structure and composition
Orbits
Composition
Distances and scales
Interplanetary environment
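
For intuition, the text extraction can be approximated without any dependency. The `textContent` function below is a hypothetical simplification, not the source of hast-util-to-string: it joins the values of all text nodes inside a node and ignores everything else.

```javascript
// Hypothetical, simplified approximation of getting a node's text.
function textContent(node) {
  // Text nodes contribute their value.
  if (node.type === 'text') return node.value

  // Parents contribute the text of all their children, in order.
  if (Array.isArray(node.children)) {
    return node.children.map(textContent).join('')
  }

  // Anything else (such as comments) contributes nothing.
  return ''
}

// A heading with nested markup: <h2>Formation and <em>evolution</em><!--note--></h2>
const heading = {
  type: 'element',
  tagName: 'h2',
  properties: {},
  children: [
    {type: 'text', value: 'Formation and '},
    {
      type: 'element',
      tagName: 'em',
      properties: {},
      children: [{type: 'text', value: 'evolution'}]
    },
    {type: 'comment', value: 'note'}
  ]
}

const text = textContent(heading)

console.log(text) // Formation and evolution
```

This is why headings that contain inline elements still produce a clean string to slug.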

Then we want to turn that text into slugs. You have many options here. For this case, we’ll use github-slugger.

--- a/plugin.js
+++ b/plugin.js
@@ -3,6 +3,7 @@
  */

 import {toString} from 'hast-util-to-string'
+import Slugger from 'github-slugger'
 import {visit} from 'unist-util-visit'

 /**
@@ -17,6 +18,8 @@ export default function rehypeSlug() {
    * @return {undefined}
    */
   return function (tree) {
+    const slugger = new Slugger()
+
     visit(tree, 'element', function (node) {
       if (
         node.tagName === 'h1' ||
@@ -27,7 +30,8 @@ export default function rehypeSlug() {
         node.tagName === 'h6'
       ) {
         const value = toString(node)
-        console.log(value)
+        const id = slugger.slug(value)
+        console.log(id)
       }
     })
   }

Don’t forget to npm install github-slugger!

We create the slugger with const slugger = new Slugger() inside the transform because we want a new slugger for each document. If we created it outside the transform, the same slugger would be reused for every document, and slugs from different documents would get mixed. That becomes a problem when different documents contain the same headings.
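
The effect of sharing state can be shown with a toy example. `TinySlugger` below is a hypothetical, heavily simplified class written for this illustration; it is not github-slugger’s algorithm. It only mimics the relevant behavior: remembering which slugs were handed out and adding a numeric suffix to repeats.

```javascript
// Hypothetical toy slugger, NOT github-slugger's implementation.
class TinySlugger {
  constructor() {
    // Maps each base slug to how often it was produced so far.
    this.seen = new Map()
  }

  slug(value) {
    const base = value
      .toLowerCase()
      .trim()
      .replace(/[^a-z0-9 -]/g, '')
      .replace(/ /g, '-')
    const count = this.seen.get(base) ?? 0

    this.seen.set(base, count + 1)

    // First use gets the plain slug, repeats get `-1`, `-2`, and so on.
    return count === 0 ? base : base + '-' + count
  }
}

// One slugger shared by two documents that both contain "Composition".
const shared = new TinySlugger()
const sharedFirst = shared.slug('Composition')
const sharedSecond = shared.slug('Composition')

// A fresh slugger per document.
const freshFirst = new TinySlugger().slug('Composition')
const freshSecond = new TinySlugger().slug('Composition')

console.log(sharedFirst, sharedSecond) // composition composition-1
console.log(freshFirst, freshSecond) // composition composition
```

With the shared instance, the second document gets `composition-1` even though it has only one such heading, and its ids would depend on which documents happened to be processed earlier.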

If we now run our example with Node.js, we’ll see the slugs printed:

node example.js
solar-system
formation-and-evolution
structure-and-composition
orbits
composition
distances-and-scales
interplanetary-environment

Finally, we want to add the id to the heading elements. This is also a good time to make sure we don’t overwrite existing ids.

--- a/plugin.js
+++ b/plugin.js
@@ -22,16 +22,17 @@ export default function rehypeSlug() {

     visit(tree, 'element', function (node) {
       if (
-        node.tagName === 'h1' ||
-        node.tagName === 'h2' ||
-        node.tagName === 'h3' ||
-        node.tagName === 'h4' ||
-        node.tagName === 'h5' ||
-        node.tagName === 'h6'
+        !node.properties.id &&
+        (node.tagName === 'h1' ||
+          node.tagName === 'h2' ||
+          node.tagName === 'h3' ||
+          node.tagName === 'h4' ||
+          node.tagName === 'h5' ||
+          node.tagName === 'h6')
       ) {
         const value = toString(node)
         const id = slugger.slug(value)
-        console.log(id)
+        node.properties.id = id
       }
     })
   }

If we now run our example again with Node…

node example.js

…and open output.html, we’ll see that the IDs are there!

<h1 id="solar-system">Solar System</h1>
<h2 id="formation-and-evolution">Formation and evolution</h2>
<h2 id="structure-and-composition">Structure and composition</h2>
<h3 id="orbits">Orbits</h3>
<h3 id="composition">Composition</h3>
<h3 id="distances-and-scales">Distances and scales</h3>
<h3 id="interplanetary-environment">Interplanetary environment</h3>
<p>…</p>
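
The input in this guide has no existing ids, so the guard from the last diff is never exercised. The sketch below isolates it on hand-written nodes. `addIdIfMissing` is a hypothetical helper that mirrors the `!node.properties.id` check; it is not part of the plugin.

```javascript
// Hypothetical helper mirroring the `!node.properties.id` check.
function addIdIfMissing(node, id) {
  if (!node.properties.id) {
    node.properties.id = id
  }
}

// A heading without an id, one with an authored id, and one with an empty id.
const plain = {type: 'element', tagName: 'h2', properties: {}, children: []}
const authored = {
  type: 'element',
  tagName: 'h2',
  properties: {id: 'custom'},
  children: []
}
const empty = {
  type: 'element',
  tagName: 'h2',
  properties: {id: ''},
  children: []
}

addIdIfMissing(plain, 'orbits')
addIdIfMissing(authored, 'orbits')
addIdIfMissing(empty, 'orbits')

console.log(plain.properties.id) // orbits
console.log(authored.properties.id) // custom
console.log(empty.properties.id) // orbits
```

Because the check relies on truthiness, an empty `id` is treated the same as a missing one and gets replaced.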

That’s it! For a complete version of this plugin, see rehype-slug.

If you haven’t already, check out the other articles in the learn section!