The unified library itself is a small module with a small API. Plugins do everything else: minify HTML, lint markdown, check indefinite articles (“a”, “an”), and more.
Three syntaxes are connected to unified, each with a syntax tree definition, a parser, and a stringifier: mdast with remark for Markdown, nlcst with retext for prose, and hast with rehype for HTML.
unified defers part of its logic to vfile, a virtual file format representing documents being processed, and to unist, a schema for syntax trees.
vfile stores metadata about documents being processed (often, but not always, from the file system). Mainly, it holds a file's path and its contents. It also tracks messages associated with a file and where in the file they occurred. This powers code linting, shown below with remark-cli, remark-validate-links, and remark-preset-lint-consistent.
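To make the idea of a virtual file concrete, here is a minimal sketch of a structure that holds a path, contents, and positioned messages. It is not the vfile API: `SketchFile`, `FileMessage`, `MessagePoint`, `report`, and the `hypothetical-link-check` source name are all invented for illustration, and the real library offers considerably more.

```typescript
// Hypothetical, simplified stand-in for a virtual file; NOT the real vfile API.
interface MessagePoint {
  line: number;   // 1-based line in the document
  column: number; // 1-based column in that line
}

interface FileMessage {
  reason: string;                  // human-readable description of the problem
  place: MessagePoint | undefined; // where it occurred, if known
  source: string | undefined;      // which plugin reported it (assumed field)
}

class SketchFile {
  readonly messages: FileMessage[] = [];

  constructor(
    public path: string | undefined, // not every document comes from disk
    public value: string,            // the document's contents
  ) {}

  // Record a message against this file and return it.
  message(reason: string, place?: MessagePoint, source?: string): FileMessage {
    const entry: FileMessage = { reason, place, source };
    this.messages.push(entry);
    return entry;
  }
}

// Format messages roughly the way a lint reporter might.
function report(file: SketchFile): string[] {
  return file.messages.map((m) => {
    const where = m.place ? `${m.place.line}:${m.place.column}` : "-";
    return `${file.path ?? "<stdin>"}:${where} ${m.reason}`;
  });
}

const sketchFile = new SketchFile("readme.md", "# Hello\n\n[broken](#nope)\n");
sketchFile.message(
  "Link to unknown heading: `nope`",
  { line: 3, column: 1 },
  "hypothetical-link-check",
);
console.log(report(sketchFile).join("\n"));
```

Keeping messages on the file rather than throwing lets several plugins each add their own warnings during one run, and a reporter can then print them all together.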
debugger by Mozilla uses unified to check their markup and prose
Gatsby uses unified to process markdown and MDX for blazing fast static site generation
opensource.guide by GitHub (and you) uses unified to check markup and prose style
unist represents documents as syntax trees. Syntax trees come in two flavours: concrete (CST) and abstract (AST). A CST has all the information needed to restore the original document completely; an AST does not. An AST can still recreate an exact syntactic representation, though. For example, CSTs house info on style such as tabs or spaces, but ASTs do not. This often makes ASTs easier to work with.
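The trade-off can be sketched with a tiny hand-built tree. The node shapes below are modelled on unist (every node has a `type`, parents have `children`, literals have a `value`), and the node types `paragraph`, `text`, and `emphasis` follow mdast. The `Tree*` interface names and the `toMarkdown` function are a hypothetical toy for this illustration, not remark's stringifier.

```typescript
// Node shapes modelled on unist; names are illustrative only.
interface TreeNode {
  type: string;
}
interface TreeLiteral extends TreeNode {
  value: string;
}
interface TreeParent extends TreeNode {
  children: Array<TreeParent | TreeLiteral>;
}
type AnyTreeNode = TreeParent | TreeLiteral;

function isParent(node: AnyTreeNode): node is TreeParent {
  return "children" in node;
}

// Toy stringifier (hypothetical): it always emits `*` for emphasis, because
// the AST does not record which marker the author originally typed.
function toMarkdown(node: AnyTreeNode): string {
  if (!isParent(node)) return node.value;
  const inner = node.children.map(toMarkdown).join("");
  return node.type === "emphasis" ? `*${inner}*` : inner;
}

// Both `Hello _world_` and `Hello *world*` would correspond to this one AST.
const ast: TreeParent = {
  type: "paragraph",
  children: [
    { type: "text", value: "Hello " },
    { type: "emphasis", children: [{ type: "text", value: "world" }] },
  ],
};

console.log(toMarkdown(ast)); // Hello *world*
```

The output has the same meaning as either input, but an author who wrote `_world_` would not get their original characters back; a CST would keep that detail at the cost of a larger tree.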
For example, say we have the following HTML element: