How to walk a tree
Contents
Tree traversal
Tree traversal is a common task when working with syntax trees. The term tree here means a node and all its descendants (all the nodes inside it). Traversal means stopping at every node to do something. So, tree traversal means doing something for every node in a tree.
Tree traversal is often also called “walking a tree”. Or “visiting a tree”.
To learn more, continue reading, but when working with unist (unified’s trees) you probably need either:
Set up
Glad you’re still here! Let’s say we have the following fragment of HTML, in a file example.html:
<p>
<!-- A comment. -->
Some <strong>strong importance</strong>, <em>emphasis</em>, and a dash of
<code>code</code>.
</p>
You could parse that with the following code (using unified and rehype-parse):
import fs from 'node:fs/promises'
import rehypeParse from 'rehype-parse'
import {unified} from 'unified'
const document = await fs.readFile('example.html')
const tree = unified().use(rehypeParse, {fragment: true}).parse(document)
console.log(tree)
(alias) module "node:fs/promises"
import fs
The fs/promises API provides asynchronous file system methods that return promises.
The promise APIs use the underlying Node.js threadpool to perform file system operations off the event loop thread. These operations are not synchronized or threadsafe. Care must be taken when performing multiple concurrent modifications on the same file or data corruption may occur.
(alias) const rehypeParse: Plugin<[(Options | null | undefined)?], string, Root>
import rehypeParse
Plugin to add support for parsing from HTML.
- @this processor.
- @param Configuration (optional).
- @returns Nothing.
(alias) const unified: Processor<undefined, undefined, undefined, undefined, undefined>
import unified
const document: NonSharedBuffer
(alias) module "node:fs/promises"
import fs
The fs/promises API provides asynchronous file system methods that return promises.
The promise APIs use the underlying Node.js threadpool to perform file system operations off the event loop thread. These operations are not synchronized or threadsafe. Care must be taken when performing multiple concurrent modifications on the same file or data corruption may occur.
function readFile(path: PathLike | fs.FileHandle, options?: ({
encoding?: null | undefined;
flag?: OpenMode | undefined;
} & EventEmitter<T extends EventMap<T> = any>.Abortable) | null): Promise<NonSharedBuffer> (+2 overloads)
Asynchronously reads the entire contents of a file.
If no encoding is specified (using options.encoding), the data is returned as a Buffer object. Otherwise, the data will be a string.
If options is a string, then it specifies the encoding.
When the path is a directory, the behavior of fsPromises.readFile() is platform-specific. On macOS, Linux, and Windows, the promise will be rejected with an error. On FreeBSD, a representation of the directory's contents will be returned.
An example of reading a package.json file located in the same directory of the running code:
import { readFile } from 'node:fs/promises';
try {
const filePath = new URL('./package.json', import.meta.url);
const contents = await readFile(filePath, { encoding: 'utf8' });
console.log(contents);
} catch (err) {
console.error(err.message);
}
It is possible to abort an ongoing readFile using an AbortSignal. If a request is aborted the promise returned is rejected with an AbortError:
import { readFile } from 'node:fs/promises';
try {
const controller = new AbortController();
const { signal } = controller;
const promise = readFile(fileName, { signal });
// Abort the request before the promise settles.
controller.abort();
await promise;
} catch (err) {
// When a request is aborted - err is an AbortError
console.error(err);
}
Aborting an ongoing request does not abort individual operating system requests but rather the internal buffering fs.readFile performs.
Any specified FileHandle has to support reading.
- @since v10.0.0
- @param path filename or
FileHandle - @return Fulfills with the contents of the file.
(alias) unified(): Processor<undefined, undefined, undefined, undefined, undefined>
import unified
(method) Processor<undefined, undefined, undefined, undefined, undefined>.use<[{
fragment: boolean;
}], string, Root>(plugin: Plugin<[{
fragment: boolean;
}], string, Root>, ...parameters: [boolean] | [{
fragment: boolean;
}]): Processor<Root, undefined, undefined, undefined, undefined> (+2 overloads)
Configure the processor to use a plugin, a list of usable values, or a preset.
If the processor is already using a plugin, the previous plugin configuration is changed based on the options that are passed in. In other words, the plugin is not added a second time.
Note: use cannot be called on frozen processors. Call the processor first to create a new unfrozen processor.
- @example There are many ways to pass plugins to
.use(). This example gives an overview:import {unified} from 'unified'
unified()
// Plugin with options:
.use(pluginA, {x: true, y: true})
// Passing the same plugin again merges configuration (to `{x: true, y: false, z: true}`):
.use(pluginA, {y: false, z: true})
// Plugins:
.use([pluginB, pluginC])
// Two plugins, the second with options:
.use([pluginD, [pluginE, {}]])
// Preset with plugins and settings:
.use({plugins: [pluginF, [pluginG, {}]], settings: {position: false}})
// Settings only:
.use({settings: {position: false}})
- @template {Array} [Parameters=[]]
- @template {Node | string | undefined} [Input=undefined]
- @template [Output=Input]
- @overload
- @overload
- @overload
- @param value Usable value.
- @param parameters Parameters, when a plugin is given as a usable value.
- @returns Current processor.
(alias) const rehypeParse: Plugin<[(Options | null | undefined)?], string, Root>
import rehypeParse
Plugin to add support for parsing from HTML.
- @this processor.
- @param Configuration (optional).
- @returns Nothing.
(property) fragment: boolean
(method) Processor<Root, undefined, undefined, undefined, undefined>.parse(file?: Compatible | undefined): Root
Parse text to a syntax tree.
Note: parse freezes the processor if not already frozen.
Note: parse performs the parse phase, not the run phase or other phases.
- @param file file to parse (optional); typically
string or VFile; any value accepted as x in new VFile(x). - @returns Syntax tree representing
file.
const document: NonSharedBuffer
(method) console.Console.log(...data: any[]): void
Which would yield (ignoring positional info for brevity):
{
type: 'root',
children: [
{
type: 'element',
tagName: 'p',
properties: {},
children: [
{ type: 'text', value: '\n ' },
{ type: 'comment', value: ' A comment. ' },
{ type: 'text', value: '\n Some ' },
{
type: 'element',
tagName: 'strong',
properties: {},
children: [ { type: 'text', value: 'strong importance' } ]
},
{ type: 'text', value: ', ' },
{
type: 'element',
tagName: 'em',
properties: {},
children: [ { type: 'text', value: 'emphasis' } ]
},
{ type: 'text', value: ', and a dash of\n ' },
{
type: 'element',
tagName: 'code',
properties: {},
children: [ { type: 'text', value: 'code' } ]
},
{ type: 'text', value: '.\n' }
]
},
{ type: 'text', value: '\n' }
],
data: { quirksMode: false }
}
As we are all set up, we can traverse the tree.
Traverse the tree
A useful utility for that is unist-util-visit, and it works like so:
import {visit} from 'unist-util-visit'
// …
visit(tree, function (node) {
console.log(node.type)
})
(alias) function visit<Tree extends Node, Check extends Test>(tree: Tree, check: Check, visitor: BuildVisitor<Tree, Check>, reverse?: boolean | null | undefined): undefined (+1 overload)
import visit
Visit nodes.
This algorithm performs depth-first tree traversal in preorder (NLR) or if reverse is given, in reverse preorder (NRL).
You can choose for which nodes visitor is called by passing a test. For complex tests, you should test yourself in visitor, as it will be faster and will have improved type information.
Walking the tree is an intensive task. Make use of the return values of the visitor when possible. Instead of walking a tree multiple times, walk it once, use unist-util-is to check if a node matches, and then perform different operations.
You can change the tree. See Visitor for more info.
- @overload
- @overload
- @param tree Tree to traverse.
- @param testOrVisitor
unist-util-is-compatible test (optional, omit to pass a visitor). - @param visitorOrReverse Handle each node (when test is omitted, pass
reverse). - @param maybeReverse Traverse in reverse preorder (NRL) instead of the default preorder (NLR).
- @returns Nothing.
- @template {UnistNode} Tree Node type.
- @template {Test} Check
unist-util-is-compatible test.
(alias) visit<Root, Test>(tree: Root, visitor: BuildVisitor<Root, Test>, reverse?: boolean | null | undefined): undefined (+1 overload)
import visit
Visit nodes.
This algorithm performs depth-first tree traversal in preorder (NLR) or if reverse is given, in reverse preorder (NRL).
You can choose for which nodes visitor is called by passing a test. For complex tests, you should test yourself in visitor, as it will be faster and will have improved type information.
Walking the tree is an intensive task. Make use of the return values of the visitor when possible. Instead of walking a tree multiple times, walk it once, use unist-util-is to check if a node matches, and then perform different operations.
You can change the tree. See Visitor for more info.
- @overload
- @overload
- @param tree Tree to traverse.
- @param testOrVisitor
unist-util-is-compatible test (optional, omit to pass a visitor). - @param visitorOrReverse Handle each node (when test is omitted, pass
reverse). - @param maybeReverse Traverse in reverse preorder (NRL) instead of the default preorder (NLR).
- @returns Nothing.
- @template {UnistNode} Tree Node type.
- @template {Test} Check
unist-util-is-compatible test.
(parameter) node: Root | Doctype | ElementContent
(method) console.Console.log(...data: any[]): void
(parameter) node: Root | Doctype | ElementContent
(property) type: "root" | "comment" | "doctype" | "element" | "text"
Node type of hast root.
Node type of HTML comments in hast.
Node type of HTML document types in hast.
Node type of elements.
Node type of HTML character data (plain text) in hast.
root
element
text
comment
text
element
text
text
element
text
text
element
text
text
text
We traversed the entire tree, and for each node, we printed its type.
Visiting a certain kind of node
To “visit” only a certain type of node, pass a test to unist-util-visit like so:
visit(tree, 'element', function (node) {
console.log(node.tagName)
})
(alias) visit<Root, "element">(tree: Root, check: "element", visitor: BuildVisitor<Root, "element">, reverse?: boolean | null | undefined): undefined (+1 overload)
import visit
Visit nodes.
This algorithm performs depth-first tree traversal in preorder (NLR) or if reverse is given, in reverse preorder (NRL).
You can choose for which nodes visitor is called by passing a test. For complex tests, you should test yourself in visitor, as it will be faster and will have improved type information.
Walking the tree is an intensive task. Make use of the return values of the visitor when possible. Instead of walking a tree multiple times, walk it once, use unist-util-is to check if a node matches, and then perform different operations.
You can change the tree. See Visitor for more info.
- @overload
- @overload
- @param tree Tree to traverse.
- @param testOrVisitor
unist-util-is-compatible test (optional, omit to pass a visitor). - @param visitorOrReverse Handle each node (when test is omitted, pass
reverse). - @param maybeReverse Traverse in reverse preorder (NRL) instead of the default preorder (NLR).
- @returns Nothing.
- @template {UnistNode} Tree Node type.
- @template {Test} Check
unist-util-is-compatible test.
(parameter) node: Element
(method) console.Console.log(...data: any[]): void
(parameter) node: Element
(property) Element.tagName: string
Tag name (such as 'body') of the element.
p
strong
em
code
You can do this yourself as well. The above works the same as:
visit(tree, function (node) {
if (node.type === 'element') {
console.log(node.tagName)
}
})
(alias) visit<Root, Test>(tree: Root, visitor: BuildVisitor<Root, Test>, reverse?: boolean | null | undefined): undefined (+1 overload)
import visit
Visit nodes.
This algorithm performs depth-first tree traversal in preorder (NLR) or if reverse is given, in reverse preorder (NRL).
You can choose for which nodes visitor is called by passing a test. For complex tests, you should test yourself in visitor, as it will be faster and will have improved type information.
Walking the tree is an intensive task. Make use of the return values of the visitor when possible. Instead of walking a tree multiple times, walk it once, use unist-util-is to check if a node matches, and then perform different operations.
You can change the tree. See Visitor for more info.
- @overload
- @overload
- @param tree Tree to traverse.
- @param testOrVisitor
unist-util-is-compatible test (optional, omit to pass a visitor). - @param visitorOrReverse Handle each node (when test is omitted, pass
reverse). - @param maybeReverse Traverse in reverse preorder (NRL) instead of the default preorder (NLR).
- @returns Nothing.
- @template {UnistNode} Tree Node type.
- @template {Test} Check
unist-util-is-compatible test.
(parameter) node: Root | Doctype | ElementContent
(parameter) node: Root | Doctype | ElementContent
(property) type: "root" | "comment" | "doctype" | "element" | "text"
Node type of hast root.
Node type of HTML comments in hast.
Node type of HTML document types in hast.
Node type of elements.
Node type of HTML character data (plain text) in hast.
(method) console.Console.log(...data: any[]): void
(parameter) node: Element
(property) Element.tagName: string
Tag name (such as 'body') of the element.
But the test passed to visit can be more advanced, such as the following to visit different kinds of nodes.
visit(tree, ['comment', 'text'], function (node) {
console.log([node.value])
})
(alias) visit<Root, string[]>(tree: Root, check: string[], visitor: BuildVisitor<Root, string[]>, reverse?: boolean | null | undefined): undefined (+1 overload)
import visit
Visit nodes.
This algorithm performs depth-first tree traversal in preorder (NLR) or if reverse is given, in reverse preorder (NRL).
You can choose for which nodes visitor is called by passing a test. For complex tests, you should test yourself in visitor, as it will be faster and will have improved type information.
Walking the tree is an intensive task. Make use of the return values of the visitor when possible. Instead of walking a tree multiple times, walk it once, use unist-util-is to check if a node matches, and then perform different operations.
You can change the tree. See Visitor for more info.
- @overload
- @overload
- @param tree Tree to traverse.
- @param testOrVisitor
unist-util-is-compatible test (optional, omit to pass a visitor). - @param visitorOrReverse Handle each node (when test is omitted, pass
reverse). - @param maybeReverse Traverse in reverse preorder (NRL) instead of the default preorder (NLR).
- @returns Nothing.
- @template {UnistNode} Tree Node type.
- @template {Test} Check
unist-util-is-compatible test.
(parameter) node: Root | Doctype | ElementContent
(method) console.Console.log(...data: any[]): void
(parameter) node: Root | Doctype | ElementContent
Property 'value' does not exist on type 'Root | Doctype | ElementContent'.
Property 'value' does not exist on type 'Root'. (2339)
[ '\n ' ]
[ ' A comment. ' ]
[ '\n Some ' ]
[ 'strong importance' ]
[ ', ' ]
[ 'emphasis' ]
[ ', and a dash of\n ' ]
[ 'code' ]
[ '.\n' ]
[ '\n' ]
Sadly, TypeScript isn’t great with arrays and discriminated unions. When you want to do more complex tests with TypeScript, it’s recommended to omit the test and use explicit if statements:
visit(tree, function (node) {
if (node.type === 'comment' || node.type === 'text') {
console.log([node.value])
}
})
(alias) visit<Root, Test>(tree: Root, visitor: BuildVisitor<Root, Test>, reverse?: boolean | null | undefined): undefined (+1 overload)
import visit
Visit nodes.
This algorithm performs depth-first tree traversal in preorder (NLR) or if reverse is given, in reverse preorder (NRL).
You can choose for which nodes visitor is called by passing a test. For complex tests, you should test yourself in visitor, as it will be faster and will have improved type information.
Walking the tree is an intensive task. Make use of the return values of the visitor when possible. Instead of walking a tree multiple times, walk it once, use unist-util-is to check if a node matches, and then perform different operations.
You can change the tree. See Visitor for more info.
- @overload
- @overload
- @param tree Tree to traverse.
- @param testOrVisitor
unist-util-is-compatible test (optional, omit to pass a visitor). - @param visitorOrReverse Handle each node (when test is omitted, pass
reverse). - @param maybeReverse Traverse in reverse preorder (NRL) instead of the default preorder (NLR).
- @returns Nothing.
- @template {UnistNode} Tree Node type.
- @template {Test} Check
unist-util-is-compatible test.
(parameter) node: Root | Doctype | ElementContent
(parameter) node: Root | Doctype | ElementContent
(property) type: "root" | "comment" | "doctype" | "element" | "text"
Node type of hast root.
Node type of HTML comments in hast.
Node type of HTML document types in hast.
Node type of elements.
Node type of HTML character data (plain text) in hast.
(parameter) node: Root | Doctype | Element | Text
(property) type: "root" | "doctype" | "element" | "text"
Node type of hast root.
Node type of HTML document types in hast.
Node type of elements.
Node type of HTML character data (plain text) in hast.
(method) console.Console.log(...data: any[]): void
(parameter) node: Comment | Text
(property) Literal.value: string
Code that is more explicit and is understandable by TypeScript, is often also easier to understand by humans.
Read more about unist-util-visit in its readme.