After reading this guide you will:
- Understand what unified does
- Get a taste of the ecosystem
- Know how it can be used
- Know what parts (processors) you need for your (future) use case
- Have a list of resources to continue learning or get started
unified is a friendly interface backed by an ecosystem of plugins built for creating and manipulating content. It does this by taking Markdown, HTML, or plain text prose, turning it into structured data, and making it available to over 100 plugins. Plugins do tasks such as spell checking, linting, or minifying.
unified itself is a rather small module that acts as an interface to unify the handling of different content formats. Around each format sits an ecosystem, such as remark for Markdown. Several ecosystems exist for unified. Together with other tools and specifications, they form the unified collective.
The unified collective spans like-minded organizations. These organizations have the shared goal to innovate content processing. Seamless, interchangeable, and pluggable tooling is how that’s achieved.
Depending on what you want to do, you will reference different organizations. So let’s start off with an introduction round.
The specifications for syntax trees:
- unist — Universal Syntax Tree
- mdast — Markdown Abstract Syntax Tree format
- hast — HTML Abstract Syntax Tree format
- xast — XML Abstract Syntax Tree format
- esast — ECMAScript Abstract Syntax Tree format
- nlcst — Natural Language Concrete Syntax Tree format
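To make the idea of a syntax tree concrete, here is a hand-written sketch of what an mdast tree for `# Hello, *world*!` could look like. The node shapes follow unist (every node has a `type`, parents have `children`, literals have a `value`); positional information is left out for brevity. The `walk` helper is hypothetical and only for illustration, it is not one of the syntax-tree utilities.

```javascript
// A hand-written mdast tree for the Markdown document `# Hello, *world*!`.
// Positional info (`position`) is omitted to keep the sketch short.
const tree = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [
        {type: 'text', value: 'Hello, '},
        {type: 'emphasis', children: [{type: 'text', value: 'world'}]},
        {type: 'text', value: '!'}
      ]
    }
  ]
}

// Hypothetical helper (an assumption, not a real utility): walk the tree
// depth-first and call `visitor` on every node.
function walk(node, visitor) {
  visitor(node)
  for (const child of node.children || []) walk(child, visitor)
}

// Collect the node types in the order they are visited.
const types = []
walk(tree, (node) => types.push(node.type))
console.log(types.join(', '))
// root, heading, text, emphasis, text, text
```

Because every format shares this basic node shape, a helper like `walk` works on Markdown, HTML, and natural language trees alike.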
Other building blocks:
- syntax-tree — Low-level utilities for building plugins
- vfile — Virtual file format for text processing
- MDX — Markdown and JSX
These processors, specifications, and tools come together in a three-part act. The process of a processor:
- Parse: Whether your input is Markdown, HTML, or prose, it needs to be parsed into a workable format. Such a format is called a syntax tree. The specifications (for example mdast) define what such a syntax tree looks like. The processors (such as remark for mdast) are responsible for creating them.
- Transform: This is where the magic happens. Users choose plugins and the order they run in. Plugins plug into this phase to transform and inspect the tree they get.
- Stringify: The final step is to take the (adjusted) tree and stringify it to Markdown, HTML, or prose (which could be different from the input format!).
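The three phases above can be sketched as a toy pipeline. This is a simplified model under stated assumptions, not unified's actual API: `parse`, `demoteHeadings`, `stringify`, and `run` are hypothetical names, and the parser only understands `# heading` lines and plain paragraphs.

```javascript
// Toy model of parse, transform, and stringify. None of these functions
// are unified APIs; they exist only to illustrate the three phases.

// Parse: text in, syntax tree out.
function parse(markdown) {
  const children = markdown
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const match = /^(#{1,6}) (.*)$/.exec(line)
      return match
        ? {type: 'heading', depth: match[1].length, children: [{type: 'text', value: match[2]}]}
        : {type: 'paragraph', children: [{type: 'text', value: line}]}
    })
  return {type: 'root', children}
}

// Transform: a "plugin" receives the tree and changes or inspects it.
// This one pushes every heading one level down (h1 becomes h2, and so on).
function demoteHeadings(tree) {
  for (const node of tree.children) {
    if (node.type === 'heading' && node.depth < 6) node.depth += 1
  }
  return tree
}

// Stringify: syntax tree in, text out.
function stringify(tree) {
  const blocks = tree.children.map((node) => {
    const text = node.children.map((child) => child.value).join('')
    return node.type === 'heading' ? '#'.repeat(node.depth) + ' ' + text : text
  })
  return blocks.join('\n\n') + '\n'
}

// Plugins run in the order they are listed.
function run(input, plugins) {
  const tree = plugins.reduce((current, plugin) => plugin(current), parse(input))
  return stringify(tree)
}

const output = run('# Title\n\nSome text.', [demoteHeadings])
console.log(output)
```

The point of the split is that a plugin such as `demoteHeadings` never touches raw text: it only sees the tree, so it does not care how the input was parsed or how the output will be written.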
unified can be used programmatically in Node.js. With a build step, it can be used in browsers as well. CLI versions, Grunt plugins, and Gulp plugins of processors also exist.
What makes unified unique is that it can switch between formats, such as Markdown to HTML, in the same process. This allows for even more powerful compositions.
The following plugins bridge formats:
- remark-rehype — Markdown to HTML
- rehype-remark — HTML to Markdown
- remark-retext — Markdown to prose
- rehype-retext — HTML to prose
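Bridging boils down to turning one kind of tree into another. As a rough illustration, the sketch below converts a tiny mdast-like tree into a hast-like tree and then serializes it to HTML. It is not how remark-rehype is implemented: `toHast`, `element`, `escapeHtml`, and `toHtml` are hypothetical helpers that support only a few node types.

```javascript
// Toy bridge from an mdast-like tree to a hast-like tree.
// Assumption: only root, heading, paragraph, and text nodes occur.
function toHast(node) {
  switch (node.type) {
    case 'root':
      return {type: 'root', children: node.children.map(toHast)}
    case 'heading':
      return element('h' + node.depth, node.children.map(toHast))
    case 'paragraph':
      return element('p', node.children.map(toHast))
    case 'text':
      return {type: 'text', value: node.value}
    default:
      throw new Error('Unsupported node type: ' + node.type)
  }
}

// Build a hast-style element node.
function element(tagName, children) {
  return {type: 'element', tagName, properties: {}, children}
}

// Escape the characters that are unsafe in HTML text content.
function escapeHtml(value) {
  return value.replace(/&/g, '&amp;').replace(/</g, '&lt;')
}

// Serialize the hast-like tree to an HTML string.
function toHtml(node) {
  if (node.type === 'text') return escapeHtml(node.value)
  const content = node.children.map(toHtml).join('')
  return node.type === 'element'
    ? `<${node.tagName}>${content}</${node.tagName}>`
    : content
}

const mdast = {
  type: 'root',
  children: [
    {type: 'heading', depth: 1, children: [{type: 'text', value: 'Hello'}]},
    {type: 'paragraph', children: [{type: 'text', value: 'Tom & Jerry'}]}
  ]
}

const html = toHtml(toHast(mdast))
console.log(html)
// <h1>Hello</h1><p>Tom &amp; Jerry</p>
```

After the bridge, everything downstream works on the HTML tree, which is why Markdown plugins go before a bridge and HTML plugins go after it.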
Whenever you think about processing content, you can think of unified. It’s a powerful tool, so for some tasks, such as transforming Markdown to HTML, you could use simpler tools like marked as well. Where unified really shines is when you want to go further than one single task. For example, when you want to enforce format rules, check spelling, generate a table of contents, and (potentially) much more: that’s when to opt for unified.
A large part of MDX’s success has been leveraging the unified and remark ecosystem. I was able to get a prototype working in a few hours because I didn’t have to worry about Markdown parsing: remark gave it to me for free. It provided the primitives to build on.
To spark your imagination, here are some of the more common plugins used in unified pipelines to do interesting things:
- remark-toc — Generate a table of contents
- rehype-prism — Highlight code in HTML with Prism
- retext-spell — Check spelling
- remark-lint — Check Markdown code style
- retext-equality — Check possibly insensitive language
- remark-math — Support math in Markdown / HTML
- retext-repeated-words — Check for repeated words
- rehype-minify — Minify HTML
- …explore all remark, rehype or retext plugins
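Not every plugin changes content: many only inspect it and report messages. As a rough idea of what such a check does, here is a toy function in the spirit of a repeated-words check. It is a sketch working on plain strings, not the implementation of any plugin listed above; `findRepeatedWords` is a hypothetical name.

```javascript
// Toy inspection: report words that appear twice in a row.
// Assumption: words are separated by whitespace and compared
// case-insensitively; punctuation is not handled.
function findRepeatedWords(text) {
  const words = text.split(/\s+/).filter(Boolean)
  const messages = []
  for (let index = 1; index < words.length; index++) {
    if (words[index].toLowerCase() === words[index - 1].toLowerCase()) {
      messages.push(`Expected \`${words[index]}\` once, not twice`)
    }
  }
  return messages
}

const warnings = findRepeatedWords('Check for for repeated words')
console.log(warnings)
```

A real plugin would work on the syntax tree instead of a raw string, which lets it skip code blocks and attach each message to an exact position in the file.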
- unified is a friendly interface backed by an ecosystem of plugins built for creating and manipulating content. You don’t have to worry about parsing, because you have the primitives to build on
- Hundreds of plugins are available
- remark is used for Markdown, rehype for HTML, and retext for natural language
- unified’s plugin pipeline typically lets you add a feature, such as bridging Markdown to HTML, with one line of code