Intro to unified

After reading this guide you will:

Intro

unified is a friendly interface backed by an ecosystem of plugins built for creating and manipulating content. It takes markdown, HTML, plain text, or other content and turns it into structured data, which makes that content available to over 100 plugins. Plugins handle tasks such as spellchecking, linting, and minifying.

With unified, you don’t handle syntax or parsing by hand. Instead, you typically write one line of code to chain a plugin into unified’s process.

unified itself is a rather small module that acts as an interface to unify the handling of different content formats. Each format has an ecosystem built around it, such as remark for markdown. Several such ecosystems exist for unified. Together with other tools and specifications, they form the unified collective.

Collective

The unified collective spans like-minded organizations. These organizations share the goal of innovating content processing, and they pursue it through seamless, interchangeable, and pluggable tooling.

Depending on what you want to do, you will turn to different organizations. So let’s start with a round of introductions.

The ecosystems:

The specifications for syntax trees:

Other building blocks:

We’ll get to how these come together in the next section. If you are already feeling adventurous, you can go directly to Using unified or How to get started with plugins.

How it comes together

These processors, specifications, and tools come together in a three-part act. The process of a processor:

  1. parse: whether your input is markdown, HTML, or prose, it needs to be parsed into a workable format; such a format is called a syntax tree; the specifications (for example mdast) define what such a tree looks like; the processors (such as remark for mdast) are responsible for creating them
  2. transform: this is where the magic happens; users choose which plugins to use and the order they run in; plugins plug into this phase to inspect and transform the tree they receive
  3. stringify: the final step is to take the (adjusted) tree and stringify it to markdown, HTML, or prose (which could be different from the input format!)
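
To make the three phases above concrete, here is a minimal, dependency-free sketch. It is a toy model rather than unified’s actual API: the names `parse`, `shoutHeadings`, `stringify`, and `run` are hypothetical, and the "markdown" it understands is assumed to consist only of `# heading` lines and plain paragraphs.

```javascript
// Toy model of the parse, transform, stringify phases.
// NOT unified's API: every name here is hypothetical, and the input is
// assumed to contain only `# heading` lines and plain paragraph lines.

// Phase 1 (parse): turn text into a small syntax tree.
function parse(text) {
  const children = text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const match = /^(#{1,6}) (.*)$/.exec(line)
      return match
        ? {type: 'heading', depth: match[1].length, value: match[2]}
        : {type: 'paragraph', value: line}
    })
  return {type: 'root', children}
}

// Phase 2 (transform): a "plugin" is a function that inspects or changes
// the tree. This one upper-cases the text of every heading in place.
function shoutHeadings(tree) {
  for (const node of tree.children) {
    if (node.type === 'heading') node.value = node.value.toUpperCase()
  }
}

// Phase 3 (stringify): serialize the (adjusted) tree back to text.
function stringify(tree) {
  const blocks = tree.children.map((node) =>
    node.type === 'heading'
      ? '#'.repeat(node.depth) + ' ' + node.value
      : node.value
  )
  return blocks.join('\n\n') + '\n'
}

// Run the three phases, applying plugins in the order they are given.
function run(text, plugins) {
  const tree = parse(text)
  for (const plugin of plugins) plugin(tree)
  return stringify(tree)
}

const output = run('# hello\nSome text.', [shoutHeadings])
console.log(output)
```

Because plugins only ever see the tree, they can be added, removed, or reordered without touching the parser or the serializer. That separation is the point of the sketch.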

unified can be used programmatically in Bun, Deno, or Node.js. With a build step or through a CDN (such as esm.sh), it can be used in browsers as well. CLI versions, Grunt plugins, and Gulp plugins of processors also exist.

What makes unified unique is that it can switch between formats, such as markdown to HTML, in the same process. This allows for even more powerful compositions.
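
The idea of switching formats inside one process can be sketched without any libraries. The example below is a simplified illustration under stated assumptions, not how unified or its plugins are implemented: `parseMarkdown`, `markdownTreeToHtmlTree`, `escapeHtml`, and `stringifyHtml` are hypothetical names, and both tree shapes are invented for the example.

```javascript
// Toy illustration of switching formats mid-pipeline.
// NOT unified's API: all names and tree shapes below are hypothetical.

// Parse markdown-like text into a markdown-flavoured tree.
// Assumes only `# heading` lines and plain paragraph lines.
function parseMarkdown(text) {
  const children = text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const match = /^(#{1,6}) (.*)$/.exec(line)
      return match
        ? {type: 'heading', depth: match[1].length, value: match[2]}
        : {type: 'paragraph', value: line}
    })
  return {type: 'root', children}
}

// Bridge: map each markdown node to a node of an HTML-flavoured tree.
// After this step, later stages work on HTML concepts (tag names).
function markdownTreeToHtmlTree(tree) {
  const children = tree.children.map((node) => ({
    type: 'element',
    tagName: node.type === 'heading' ? 'h' + node.depth : 'p',
    text: node.value
  }))
  return {type: 'root', children}
}

// Escape the characters that are unsafe in HTML text content.
function escapeHtml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

// Stringify the HTML-flavoured tree to an HTML string.
function stringifyHtml(tree) {
  return tree.children
    .map((el) => '<' + el.tagName + '>' + escapeHtml(el.text) + '</' + el.tagName + '>')
    .join('\n')
}

const markdownTree = parseMarkdown('# Hello\nA < B')
const htmlTree = markdownTreeToHtmlTree(markdownTree)
const html = stringifyHtml(htmlTree)
console.log(html)
```

The bridge step is the only place that knows about both formats. Transforms written for the markdown-flavoured tree would run before it, and transforms written for the HTML-flavoured tree would run after it.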

The following plugins bridge formats:

Use cases

Whenever you think about processing content, you can think of unified. It’s a powerful tool, so for some tasks, such as transforming markdown to HTML, a simpler tool like marked may be enough. Where unified really shines is when you want to go beyond a single task. For example, when you want to enforce format rules, check spelling, generate a table of contents, and (potentially) much more: that’s when to opt for unified.

A large part of MDX’s success has been leveraging the unified and remark ecosystem. I was able to get a prototype working in a few hours because I didn’t have to worry about markdown parsing: remark gave it to me for free. It provided the primitives to build on.

John Otander, author of mdx-js/mdx

To spark your imagination, here are some of the more common plugins used in unified pipelines to do interesting things:

Summary

Next steps