rehype-infer-description-meta
rehype plugin to infer the description of a document.
What is this?
This package is a unified (rehype) plugin to infer the description of a document. It supports different methods: a specific element, everything up to a comment, or up to a certain number of characters.
unified is a project that transforms content with abstract syntax trees (ASTs). rehype adds support for HTML to unified. vfile is the virtual file interface used in unified. hast is the HTML AST that rehype uses. This is a rehype plugin that inspects hast and adds metadata to vfiles.
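To make these terms concrete, here is a minimal sketch of the data involved. The element shape follows the hast specification; the `file` object is a plain stand-in for a vfile, assumed for illustration only (a real vfile carries more fields).

```javascript
// A hast `element` node for `<p class="byline">Lorem ipsum</p>`.
const paragraph = {
  type: 'element',
  tagName: 'p',
  properties: {className: ['byline']},
  children: [{type: 'text', value: 'Lorem ipsum'}]
}

// Plain-object stand-in for a vfile (assumption: only `data` is modelled).
// After this plugin runs, the description is found at `data.meta.description`.
const file = {data: {meta: {description: 'Lorem ipsum'}}}

// The metadata on the file mirrors the text found in the tree.
console.log(paragraph.children[0].value === file.data.meta.description)
```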
When should I use this?
This plugin is particularly useful in combination with rehype-meta. When both are used together, a <meta name=description> is populated with the document’s description.
Install
This package is ESM only. In Node.js (version 16+), install with npm:
npm install rehype-infer-description-meta
In Deno with esm.sh:
import rehypeInferDescriptionMeta from 'https://esm.sh/rehype-infer-description-meta@2'
In browsers with esm.sh:
<script type="module">
import rehypeInferDescriptionMeta from 'https://esm.sh/rehype-infer-description-meta@2?bundle'
</script>
Use
Say our module example.js contains:
import rehypeDocument from 'rehype-document'
import rehypeFormat from 'rehype-format'
import rehypeInferDescriptionMeta from 'rehype-infer-description-meta'
import rehypeMeta from 'rehype-meta'
import rehypeParse from 'rehype-parse'
import rehypeStringify from 'rehype-stringify'
import {unified} from 'unified'
const examples = [
// 1. Example where the description is in a certain element.
`<h1>Hello, world!</h1>
<p class="byline">Lorem ipsum</p>
<p>Dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>`,
// 2. Example where the description runs from the start to a comment.
`<h1>Hello, world!</h1>
<p>Lorem ipsum<!--more--> dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>`,
// 3. Example where the description runs from the start to a certain number of characters.
`<h1>Hello, world!</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>`
]
const promises = examples.map(async function (example) {
  const file = await unified()
    .use(rehypeParse, {fragment: true})
    .use(rehypeInferDescriptionMeta, {selector: '.byline'})
    .use(rehypeDocument)
    .use(rehypeMeta)
    .use(rehypeFormat)
    .use(rehypeStringify)
    .process(example)

  console.log(String(file))
})

await Promise.all(promises)
…then running node example.js yields:
👉 Note: meta[name="description"] is derived from .byline:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1" name="viewport">
<meta name="description" content="Lorem ipsum">
</head>
<body>
<h1>Hello, world!</h1>
<p class="byline">Lorem ipsum</p>
<p>Dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>
</body>
</html>
👉 Note: meta[name="description"] is derived from content before <!--more-->:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1" name="viewport">
<meta name="description" content="Lorem ipsum">
</head>
<body>
<h1>Hello, world!</h1>
<p>Lorem ipsum<!--more--> dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>
</body>
</html>
👉 Note: meta[name="description"] is truncated from the document:
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1" name="viewport">
<meta name="description" content="Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad…">
</head>
<body>
<h1>Hello, world!</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>
</body>
</html>
API
This package exports no identifiers. The default export is rehypeInferDescriptionMeta.
unified().use(rehypeInferDescriptionMeta[, options])
Infer file metadata from the description of a document.
The result is stored on file.data.meta.description (and, when options.inferDescriptionHast is enabled, on file.data.meta.descriptionHast).
Parameters
- options (Options, optional): configuration
Returns
Transform (Transformer).
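Because the result lives on the file rather than in the tree, later plugins can read or adjust it. The sketch below is a hypothetical follow-up plugin: the name `rehypeDescriptionFallback` and its `fallback` option are invented for illustration and are not part of this package. It fills in a default when no description was inferred. It relies only on unified's transformer signature `(tree, file)` and is exercised here with plain objects standing in for a hast root and a vfile.

```javascript
// Hypothetical plugin: not part of rehype-infer-description-meta.
function rehypeDescriptionFallback(options = {}) {
  const fallback = options.fallback || 'No description available.'

  // unified calls a transformer with the tree and the file.
  return function (tree, file) {
    const meta = file.data.meta || (file.data.meta = {})
    if (!meta.description) meta.description = fallback
  }
}

// Exercise the transformer with stand-in objects (assumption: a real
// pipeline would pass a hast root and a VFile instead).
const transform = rehypeDescriptionFallback({fallback: 'A short note.'})
const inferred = {data: {meta: {description: 'Lorem ipsum'}}}
const empty = {data: {}}

transform({type: 'root', children: []}, inferred)
transform({type: 'root', children: []}, empty)

console.log(inferred.data.meta.description) // 'Lorem ipsum'
console.log(empty.data.meta.description) // 'A short note.'
```

In a real pipeline, a plugin like this would presumably sit after rehypeInferDescriptionMeta and before rehypeMeta, so that rehypeMeta sees the final value.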
Notes
The description is inferred through three strategies, tried in order:
- If options.selector is set and an element matching it is found, then the description is the text of that element
- Otherwise, if a comment is found with the text of options.comment, then the description is the text up to that comment
- Otherwise, the description is the text up to options.truncateSize characters
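The fallback order can be sketched in plain JavaScript. This is an illustration under simplifying assumptions, not the plugin's source: a class name stands in for a full CSS selector, ignored nodes are matched by tag name only, truncation cuts at the last space, and options.mainSelector and options.maxExcerptSearchSize are left out. All helper names are hypothetical.

```javascript
// Illustrative sketch only: this is not the plugin's source code.
// Nodes follow the hast shape (`root`, `element`, `text`, `comment`).

// Collect the plain text of a node, skipping elements with an ignored tag name.
function toText(node, ignoredTags = []) {
  if (node.type === 'text') return node.value
  if (node.type === 'element' && ignoredTags.includes(node.tagName)) return ''
  return (node.children || []).map((child) => toText(child, ignoredTags)).join('')
}

// Strategy 1 helper: depth-first search for the first element with a class.
function findByClass(node, className) {
  const classes = (node.properties && node.properties.className) || []
  if (node.type === 'element' && classes.includes(className)) return node
  for (const child of node.children || []) {
    const match = findByClass(child, className)
    if (match) return match
  }
  return undefined
}

// Strategy 2 helper: text before a matching comment, or `undefined` if absent.
function textBeforeComment(tree, comment, ignoredTags) {
  let text = ''
  let found = false
  function walk(node) {
    if (found) return
    if (node.type === 'comment' && node.value.trim() === comment) {
      found = true
    } else if (node.type === 'text') {
      text += node.value
    } else if (!(node.type === 'element' && ignoredTags.includes(node.tagName))) {
      for (const child of node.children || []) walk(child)
    }
  }
  walk(tree)
  return found ? text.trim() : undefined
}

// Strategy 3 helper: cut at the last space within `size` and add an ellipsis.
function truncate(text, size) {
  const clean = text.replace(/\s+/g, ' ').trim()
  if (clean.length <= size) return clean
  const cut = clean.slice(0, size)
  const lastSpace = cut.lastIndexOf(' ')
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + '…'
}

// Try the strategies in order: element, then comment, then truncation.
function inferDescription(tree, options = {}) {
  const {
    className,
    comment = 'more',
    truncateSize = 140,
    ignoredTags = ['h1', 'script', 'style', 'noscript', 'template']
  } = options
  if (className) {
    const match = findByClass(tree, className)
    if (match) return toText(match).trim()
  }
  const excerpt = textBeforeComment(tree, comment, ignoredTags)
  if (excerpt !== undefined) return excerpt
  return truncate(toText(tree, ignoredTags), truncateSize)
}

// Small builders for the example trees.
const text = (value) => ({type: 'text', value})
const element = (tagName, className, children) => ({
  type: 'element',
  tagName,
  properties: className ? {className: [className]} : {},
  children
})

const withByline = {type: 'root', children: [
  element('h1', null, [text('Hello, world!')]),
  element('p', 'byline', [text('Lorem ipsum')]),
  element('p', null, [text('Dolor sit amet')])
]}
const withComment = {type: 'root', children: [
  element('h1', null, [text('Hello, world!')]),
  element('p', null, [text('Lorem ipsum'), {type: 'comment', value: 'more'}, text(' dolor sit amet')])
]}

console.log(inferDescription(withByline, {className: 'byline'})) // 'Lorem ipsum'
console.log(inferDescription(withComment)) // 'Lorem ipsum'
console.log(inferDescription(withByline, {truncateSize: 10})) // 'Lorem…'
```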
Options
Configuration (TypeScript type).
Fields
- comment (string, default: 'more'): string to look for in a comment; one of the strategies is to look for this comment, everything before it is the description
- ignoreSelector (string, default: 'h1, script, style, noscript, template'): CSS selector of nodes to ignore; used when looking for an excerpt comment or truncating the document
- inferDescriptionHast (boolean, default: false): whether to expose file.data.meta.descriptionHast; this is not used by rehype-meta, but could be useful to other plugins; the value contains the rich HTML elements rather than the plain text content
- mainSelector (string, optional): CSS selector to body of content; useful to exclude other things, such as the head, ads, styles, scripts, and other random stuff, by focussing all strategies in one element
- maxExcerptSearchSize (number, default: 2048): how far to search for the excerpt comment before bailing; the goal of explicit excerpts is that they are assumed to be somewhat reasonably placed; this option prevents searching giant documents for some comment that probably won’t be found at the end
- selector (string, optional): CSS selector to the description; one of the strategies is to look for a certain element, useful if the description is nicely encoded in one element
- truncateSize (number, default: 140): number of characters to truncate to; one of the strategies is to truncate the document to a certain number of characters
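As a quick reference, the documented defaults can be written out as an object and combined with overrides. The `defaults` object only restates the field list above; the override values (`'main'`, `'.byline'`, `200`) are hypothetical and chosen purely for illustration.

```javascript
// Defaults as documented in the field list above.
const defaults = {
  comment: 'more',
  ignoreSelector: 'h1, script, style, noscript, template',
  inferDescriptionHast: false,
  mainSelector: undefined,
  maxExcerptSearchSize: 2048,
  selector: undefined,
  truncateSize: 140
}

// Hypothetical overrides for a site whose content lives in `<main>`.
const options = {
  ...defaults,
  mainSelector: 'main',
  selector: '.byline',
  truncateSize: 200
}

// The object would then be passed as the second argument to `.use`:
// unified().use(rehypeInferDescriptionMeta, options)
console.log(options)
```

Only the overridden fields need to be passed in practice, since the plugin applies its own defaults.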
Types
This package is fully typed with TypeScript. It exports the additional type Options.
It also registers file.data.meta with vfile. If you’re working with the file, make sure to import this plugin somewhere in your types, as that registers the new fields on the file.
/**
* @typedef {import('rehype-infer-description-meta')}
*/
import {VFile} from 'vfile'
const file = new VFile()
console.log(file.data.meta.description) //=> TS now knows that this is a `string?`.
Compatibility
Projects maintained by the unified collective are compatible with maintained versions of Node.js.
When we cut a new major release, we drop support for unmaintained versions of Node. This means we try to keep the current release line, rehype-infer-description-meta@^2, compatible with Node.js 16.
This plugin works with rehype-parse version 3+, rehype-stringify version 3+, rehype version 4+, and unified version 6+.
Security
Use of rehype-infer-description-meta is safe.
Related
- rehype-document: wrap a fragment in a document
- rehype-meta: add metadata to the head of a document
- unified-infer-git-meta: infer file metadata from Git
- rehype-infer-title-meta: infer file metadata from the title of a document
- rehype-infer-reading-time-meta: infer file metadata from the reading time of a document
Contribute
See contributing.md in rehypejs/.github for ways to get started. See support.md for ways to get help.
This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.