Project: micromark/micromark

Package: micromark@2.8.1

Dependencies: 2 · Dependents: 7

Small CommonMark-compliant markdown parser with positional info and concrete tokens.

micromark

smol markdown parser that’s different (open beta)

Intro

micromark is a long-awaited markdown parser. It uses a state machine to parse the entirety of markdown into tokens. It’s the smallest CommonMark-compliant markdown parser in JavaScript. It’ll replace the internals of remark-parse, the most popular markdown parser. Its interface is optimized to compile to HTML, but its parts can also be used to generate syntax trees or compile to other output formats. It’s in open beta: up next are extensions (GFM, MDX), integration in remark, performance, CSTs, and docs.

Install

npm:

npm install micromark

Use

var micromark = require('micromark')

console.log(micromark('## Hello, *world*!'))

Yields:

<h2>Hello, <em>world</em>!</h2>

Or (streaming interface):

var fs = require('fs')
var micromark = require('micromark/stream')

fs.createReadStream('example.md').pipe(micromark()).pipe(process.stdout)

Or (extensions, in this case micromark-extension-gfm):

var micromark = require('micromark')
var gfmSyntax = require('micromark-extension-gfm')
var gfmHtml = require('micromark-extension-gfm/html')

var doc = '* [x] contact@example.com ~~strikethrough~~'

var result = micromark(doc, {
  extensions: [gfmSyntax()],
  htmlExtensions: [gfmHtml]
})

console.log(result)

Yields:

<ul>
<li><input checked="" disabled="" type="checkbox"> <a href="mailto:contact@example.com">contact@example.com</a> <del>strikethrough</del></li>
</ul>

Or use remark, which will soon include micromark and is pretty stable.

API

Note that there are more APIs than listed here currently. Those are considered to be in progress.

micromark(doc[, encoding][, options])

Compile markdown to HTML.

Parameters

doc

Markdown to parse (string or Buffer)

encoding

Character encoding to understand doc as when it’s a Buffer (string, default: 'utf8').

options.defaultLineEnding

Value to use for line endings not in doc (string, default: first line ending or '\n').

Generally, micromark copies line endings ('\r', '\n', '\r\n') in the markdown document over to the compiled HTML. In some cases, such as '> a', CommonMark requires extra line endings to be added: '<blockquote>\n<p>a</p>\n</blockquote>'.
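
The documented default (the first line ending in the document, else '\n') can be sketched roughly as follows. This is an illustration only: firstLineEnding is a hypothetical helper, not part of micromark's API, and micromark may determine the value differently internally.

```javascript
// Hypothetical helper (not part of micromark): find the line ending that
// the description of `defaultLineEnding` says is used as the default.
function firstLineEnding(doc) {
  // `\r\n` is listed first so it is not mistaken for a lone `\r`.
  var match = /\r\n|\r|\n/.exec(String(doc))
  return match ? match[0] : '\n'
}

console.log(JSON.stringify(firstLineEnding('> a\r\n> b'))) // "\r\n"
console.log(JSON.stringify(firstLineEnding('> a'))) // "\n"
```

A document such as '> a' has no line ending of its own, which is the case where an explicit defaultLineEnding matters.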

options.allowDangerousHtml

Whether to allow embedded HTML (boolean, default: false).

options.allowDangerousProtocol

Whether to allow potentially dangerous protocols in links and images (boolean, default: false). URLs relative to the current protocol (such as image.jpg) are always allowed. For links, the allowed protocols are http, https, irc, ircs, mailto, and xmpp. For images, the allowed protocols are http and https.
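
To make the allow-list rule concrete, here is a minimal sketch of such a check. The protocol lists are taken from the description of this option; isAllowedUrl itself is a hypothetical stand-in written for illustration and should not be read as micromark's own code.

```javascript
// Illustrative only: the lists mirror the option description, but the
// function is a hypothetical stand-in, not micromark's implementation.
var linkProtocols = ['http', 'https', 'irc', 'ircs', 'mailto', 'xmpp']
var imageProtocols = ['http', 'https']

function isAllowedUrl(url, protocols) {
  var colon = url.indexOf(':')
  var rest = [url.indexOf('/'), url.indexOf('?'), url.indexOf('#')]
  var index = -1

  // No colon: there is no protocol, so the URL is relative.
  if (colon < 0) return true

  // A colon after `/`, `?`, or `#` belongs to the path, query, or fragment,
  // so the URL is still relative.
  while (++index < rest.length) {
    if (rest[index] > -1 && rest[index] < colon) return true
  }

  return protocols.indexOf(url.slice(0, colon).toLowerCase()) > -1
}

console.log(isAllowedUrl('image.jpg', imageProtocols)) // true
console.log(isAllowedUrl('mailto:contact@example.com', linkProtocols)) // true
console.log(isAllowedUrl('mailto:contact@example.com', imageProtocols)) // false
console.log(isAllowedUrl('javascript:alert(1)', linkProtocols)) // false
```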

options.extensions

Array of syntax extensions (Array.<SyntaxExtension>, default: []).

options.htmlExtensions

Array of HTML extensions (Array.<HtmlExtension>, default: []).

Returns

string — Compiled HTML.

createStream(options?)

Streaming version of micromark. Compiles markdown to HTML. options are the same as the buffering API above. Available at require('micromark/stream').

Extensions

There are two types of extensions for micromark: SyntaxExtension and HtmlExtension. They can be passed in extensions or htmlExtensions, respectively.

SyntaxExtension

A syntax extension is an object whose fields are the names of tokenizers: content (a block of, well, content: definitions and paragraphs), document (containers such as block quotes and lists), flow (block constructs such as ATX and setext headings, HTML, indented and fenced code, thematic breaks), string (things that work in a few places such as destinations, fenced code info, etc: character escapes and -references), or text (rich inline text: autolinks, character escapes and -references, code, hard breaks, HTML, images, links, emphasis, strong).

The value of each such field is an object that maps character codes to constructs. The built-in constructs are themselves an extension. See it and the existing extensions for inspiration.
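
As a rough sketch of that shape, the object below registers one construct for the character code 126 (a tilde) in the text tokenizer. The construct's tokenize(effects, ok, nok) signature and the effects.enter, effects.consume, and effects.exit calls follow what the existing extensions appear to use; they are assumptions here, since this document does not specify them. The token name tildeMarker is made up.

```javascript
// Hypothetical construct: turns a single `~` into its own token.
// Assumption: constructs expose `tokenize(effects, ok, nok)`, which returns
// the first state function, as the existing extensions appear to do.
var tilde = {tokenize: tokenizeTilde}

// The field is the tokenizer name (`text`); the key is a character code
// (126 is `~`); the value is the construct.
var syntax = {text: {126: tilde}}

function tokenizeTilde(effects, ok, nok) {
  return start

  function start(code) {
    // Not a tilde: signal failure so other constructs can be tried.
    if (code !== 126) return nok(code)

    effects.enter('tildeMarker') // Made-up token name.
    effects.consume(code)
    effects.exit('tildeMarker')
    return ok
  }
}
```

Under these assumptions, such an object would be passed as one entry of options.extensions.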

HtmlExtension

An HTML extension is an object whose fields are either enter or exit (reflecting whether a token is entered or exited). The value of each such field is an object that maps token names to handlers. See the existing extensions for inspiration.
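
A minimal sketch of that shape, for a made-up highlight token, could look like this. It assumes that handlers are called with a compile context as this, and that the context has a tag method for writing tags, as the existing extensions suggest; neither detail is specified in this document.

```javascript
// Hypothetical HTML extension for a made-up `highlight` token.
// Assumption: handlers run with a compile context as `this`, and that
// context has a `tag` method for writing tags to the output.
var html = {
  enter: {highlight: enterHighlight},
  exit: {highlight: exitHighlight}
}

function enterHighlight() {
  this.tag('<mark>')
}

function exitHighlight() {
  this.tag('</mark>')
}
```

Under these assumptions, such an object would be passed as one entry of options.htmlExtensions, next to a syntax extension that produces the token.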

List of extensions

Version

The open beta of micromark starts at version 2.0.0 (there was a different package published on npm as micromark before). micromark will adhere to semver at 3.0.0. Use tilde ranges for now: "micromark": "~2.8.0".
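
To show why a tilde range is the cautious choice before 3.0.0, here is a simplified sketch of what such a range accepts. satisfiesTilde is a hypothetical helper for illustration; it ignores prerelease tags and partial versions, which real semver tooling handles.

```javascript
// Simplified sketch: a tilde range with a full version allows the same
// major and minor, with a patch at or above the given one.
function satisfiesTilde(version, range) {
  var wanted = range.replace(/^~/, '').split('.').map(Number)
  var actual = version.split('.').map(Number)

  return (
    actual[0] === wanted[0] &&
    actual[1] === wanted[1] &&
    actual[2] >= wanted[2]
  )
}

console.log(satisfiesTilde('2.8.1', '~2.8.0')) // true
console.log(satisfiesTilde('2.9.0', '~2.8.0')) // false
```

Patch releases are picked up automatically, while a new minor version, which may contain breaking changes during the beta, is not.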

Security

It’s safe to compile markdown to HTML if it neither includes embedded HTML nor uses dangerous protocols in links (such as javascript: or data:). micromark is also safe by default when embedded HTML or dangerous protocols are used, because it encodes or drops them. Turning on the allowDangerousHtml or allowDangerousProtocol options for user-provided markdown opens you up to cross-site scripting (XSS) attacks.
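
As an illustration of what encoding means here, the sketch below turns the characters that are significant in HTML into character references, so embedded markup is displayed as text instead of being interpreted. The encode function is a hypothetical stand-in, not micromark's own code.

```javascript
// Hypothetical stand-in for encoding embedded HTML so that it shows up as
// text instead of running as markup. Not micromark's implementation.
var characters = {'"': 'quot', '&': 'amp', '<': 'lt', '>': 'gt'}

function encode(value) {
  return value.replace(/["&<>]/g, function (character) {
    return '&' + characters[character] + ';'
  })
}

console.log(encode('<script>alert(1)</script>'))
// &lt;script&gt;alert(1)&lt;/script&gt;
```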

For more information on markdown sanitization, see improper-markup-sanitization.md by @chalker.

See security.md in micromark/.github for how to submit a security report.

Contribute

See contributing.md in micromark/.github for ways to get started. See support.md for ways to get help.

This project has a code of conduct. By interacting with this repository, organisation, or community you agree to abide by its terms.

Support this effort and give back by sponsoring on OpenCollective!


Salesforce 🏅

Gatsby 🥇

Vercel 🥇

Netlify

Holloway

ThemeIsle

BoostIO

Expo


You?

License

MIT © Titus Wormer