Supported formats reference
docmeta reads metadata from a document using a per-format extractor. The
extractor is chosen from the file extension, or forced with
`--as <format>`. Every extractor below is implemented, and files in these
formats are included in directory and glob walks.
Formats
| Format (`--as` name) | Extensions | Metadata source |
|---|---|---|
| markdown | .md, .markdown | Leading YAML frontmatter (`--- … ---`). |
| mdx | .mdx | Leading YAML frontmatter (`--- … ---`). |
| asciidoc | .adoc, .asciidoc | YAML frontmatter, or the native header: `= Title` plus `:key: value` attributes. |
| rst | .rst | YAML frontmatter, or the native section title plus `:key: value` docinfo fields. |
| xml | .xml | Attributes of the root element. |
| html | .html, .htm | `<title>` plus `<meta name="…" content="…">` tags. |

The `--as` name is the extractor name in the first column. The extension match
is case-insensitive.
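
The selection rule can be sketched in a few lines. This is an illustration only, not docmeta's code: the names `EXTENSIONS` and `pick_extractor` and the `forced` parameter (standing in for the value passed to `--as`) are hypothetical, and returning `None` for an unknown extension is an assumption.

```python
from pathlib import Path

# Extension-to-extractor mapping copied from the Formats table.
EXTENSIONS = {
    ".md": "markdown",
    ".markdown": "markdown",
    ".mdx": "mdx",
    ".adoc": "asciidoc",
    ".asciidoc": "asciidoc",
    ".rst": "rst",
    ".xml": "xml",
    ".html": "html",
    ".htm": "html",
}


def pick_extractor(path, forced=None):
    """Return an extractor name for `path`, or None if nothing matches.

    `forced` is a hypothetical stand-in for the value given to --as.
    """
    if forced is not None:
        return forced
    # Lowercasing the suffix makes the extension match case-insensitive.
    return EXTENSIONS.get(Path(path).suffix.lower())
```

With this sketch, `pick_extractor("README.MD")` and `pick_extractor("readme.md")` resolve to the same extractor, and a forced name wins over the extension.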
How each format is read
Markdown and MDX
Both read a leading YAML frontmatter block delimited by `---` on its own line,
closed by a matching `---` or `...`. The block is parsed as YAML, so values keep
their YAML types (strings, numbers, booleans, lists, maps). MDX uses the same
frontmatter logic as Markdown; `export const meta = {…}` is not read.

    ---
    type: guide
    title: Getting started
    tags: [setup, onboarding]
    ---

Malformed YAML in the block is reported as a per-file parse error rather than stopping the run. A file with no frontmatter block reports its metadata as not present.
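
The delimiter rule can be sketched as follows. This is a minimal illustration under stated assumptions, not docmeta's implementation: `split_frontmatter` is a hypothetical name, the opening `---` is required on the very first line, and an unclosed block is treated as no frontmatter. The returned text would then be handed to a YAML parser.

```python
def split_frontmatter(text):
    """Return the raw YAML between the delimiters, or None if there is no block."""
    lines = text.splitlines()
    # The block must open with --- on its own line at the top of the file.
    if not lines or lines[0].rstrip() != "---":
        return None
    for i in range(1, len(lines)):
        # Either --- or ... on its own line closes the block.
        if lines[i].rstrip() in ("---", "..."):
            return "\n".join(lines[1:i])
    # Assumption: an opening delimiter with no closer means no frontmatter.
    return None
```

Returning `None` for a missing block keeps "no metadata present" distinct from an empty block, which comes back as the empty string.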
AsciiDoc
AsciiDoc accepts two metadata styles. If the file opens with a complete YAML frontmatter block, that block is used. Otherwise docmeta reads the native document header, that is, the lines from the top of the file down to the first blank line:
- A leading `= Title` line becomes `title`.
- Each `:name: value` line becomes a `name` key. A `:name:` with no value is `true`; an unset attribute (`:!name:` or `:name!:`) is `false`.
- Other header lines (such as author or revision lines) are ignored.
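
The header rules above can be sketched with a small line scanner. This is an illustration under assumptions, not docmeta's parser: `read_asciidoc_header` and the `ATTRIBUTE` pattern are hypothetical, the title is only recognized on the first line, and values are returned as raw strings (the YAML-style coercion described under Type coercion is left out).

```python
import re

# :name: value, :name:, :!name: or :name!: (hypothetical pattern for this sketch).
ATTRIBUTE = re.compile(r"^:(!?)([A-Za-z0-9_][\w-]*)(!?):(?:\s+(.*))?$")


def read_asciidoc_header(text):
    """Read title and attributes from the lines before the first blank line."""
    meta = {}
    for index, line in enumerate(text.splitlines()):
        if not line.strip():
            break  # the first blank line ends the header
        if index == 0 and line.startswith("= "):
            meta["title"] = line[2:].strip()
            continue
        match = ATTRIBUTE.match(line)
        if not match:
            continue  # author and revision lines are ignored
        bang_before, name, bang_after, value = match.groups()
        if bang_before or bang_after:
            meta[name] = False  # unset attribute
        elif value is None or not value.strip():
            meta[name] = True  # attribute with no value
        else:
            meta[name] = value.strip()  # raw string; coercion omitted here
    return meta
```

Attribute lines after the first blank line belong to the document body, so the scanner stops there instead of reading the whole file.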

An example header:

    = Getting started
    :type: guide
    :draft: false

reStructuredText
reStructuredText also accepts two styles. A complete leading YAML frontmatter block is used when present (as some MyST setups produce it). Otherwise docmeta reads the native page metadata:
- A leading section title (a line underlined, and optionally overlined, with punctuation) becomes `title`.
- The docinfo field list that follows, a run of `:name: value` fields, becomes the remaining keys. A `:name:` with no value is `true`. An explicit `:title:` field takes precedence over the heading.
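
A rough sketch of these two rules follows. It is not docmeta's parser: `read_rst_metadata`, `ADORNMENT`, and `FIELD` are hypothetical names, multi-line field bodies are not handled, and values stay raw strings with coercion left out.

```python
import re

# A line made of one repeated ASCII punctuation character, such as ===== or -----.
ADORNMENT = re.compile(r"^([!-/:-@\[-`{-~])\1*$")
# A docinfo field: ":name: value" or ":name:" (hypothetical pattern).
FIELD = re.compile(r"^:([^:\s][^:]*):(?:\s+(.*))?$")


def read_rst_metadata(text):
    """Read a leading section title and the docinfo fields that follow it."""
    lines = text.splitlines()
    meta = {}
    pos = 0
    # An optional overline sits directly above the title.
    start = 1 if lines and ADORNMENT.match(lines[0].rstrip()) else 0
    if start + 1 < len(lines):
        title, underline = lines[start].strip(), lines[start + 1].rstrip()
        if title and ADORNMENT.match(underline) and len(underline) >= len(title):
            meta["title"] = title
            pos = start + 2
    # Skip blank lines between the title and the field list.
    while pos < len(lines) and not lines[pos].strip():
        pos += 1
    # Read the run of fields; the first non-field line ends it.
    while pos < len(lines):
        match = FIELD.match(lines[pos])
        if not match:
            break
        name, value = match.groups()
        meta[name] = True if value is None or not value.strip() else value.strip()
        pos += 1
    return meta
```

Because fields are assigned after the heading, an explicit `:title:` field overwrites the heading-derived title without any special casing.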

An example page opening:

    Getting started
    ===============

    :type: guide
    :tags: [setup, onboarding]

XML
XML metadata comes from the attributes of the root element. Namespace
declarations (`xmlns` and `xmlns:*`) are dropped as transport noise.

    <document type="concept" version="2" />

This yields `type: "concept"` and `version: 2`. Malformed XML is reported as a
per-file parse error.
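
Reading root attributes can be sketched with Python's standard `xml.etree.ElementTree`. This is an illustration rather than docmeta's implementation: `read_xml_metadata` is a hypothetical name, and values are returned as raw strings, so `version` stays `"2"` until the coercion step described under Type coercion is applied.

```python
import xml.etree.ElementTree as ET


def read_xml_metadata(text):
    """Return the root element's attributes as a dict of raw strings."""
    # ET.fromstring raises ET.ParseError on malformed XML; a caller could
    # catch it and record a per-file parse error instead of stopping.
    root = ET.fromstring(text)
    # ElementTree consumes xmlns and xmlns:* declarations while parsing,
    # so they do not appear in root.attrib.
    return dict(root.attrib)
```

Only the root element is inspected; attributes on child elements are never read.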
HTML
HTML metadata comes from the document head:
- `<title>…</title>` becomes `title` (the first `<title>` wins; its text is kept verbatim).
- `<meta name="X" content="Y">` becomes `X: Y`. `property="X"` is accepted in place of `name` for OpenGraph-style tags.
- `<meta>` tags with neither `name` nor `property` (such as `charset` or `http-equiv`) carry no metadata and are skipped. For duplicate keys, the last tag wins.

An example head:

    <title>Getting started</title>
    <meta name="type" content="guide">
    <meta name="draft" content="false">

HTML parsing recovers from malformed markup, so extraction does not throw a parse error.
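
The HTML rules can be sketched with Python's standard `html.parser`, which is also tolerant of malformed markup. This is an illustration under assumptions, not docmeta's parser: `HeadMetadata` and `read_html_metadata` are hypothetical names, the sketch does not check that tags sit inside `<head>`, a `<meta>` tag without a `content` attribute is skipped, and values are left as raw strings.

```python
from html.parser import HTMLParser


class HeadMetadata(HTMLParser):
    """Collects <title> and <meta> metadata (a sketch, not docmeta's parser)."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False
        self._title_seen = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title" and not self._title_seen:
            self._in_title = True  # only the first <title> is recorded
        elif tag == "meta":
            # name is preferred; property covers OpenGraph-style tags.
            key = attributes.get("name") or attributes.get("property")
            # Assumption: a tag without a content attribute is skipped.
            if key and "content" in attributes:
                self.meta[key] = attributes["content"] or ""  # last tag wins

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._title_seen = True
            self.meta["title"] = "".join(self._title_parts)


def read_html_metadata(text):
    parser = HeadMetadata()
    parser.feed(text)
    parser.close()
    return parser.meta
```

Assigning into a dict gives the last-tag-wins behavior for duplicate keys, while the `_title_seen` flag gives first-wins for `<title>`.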
Type coercion
For every format except Markdown and MDX, individual values are parsed as YAML, matching frontmatter typing. This means string-looking inputs are coerced to their natural types:

| Raw value | Becomes | Type |
|---|---|---|
| `2` | `2` | number |
| `true` | `true` | boolean |
| `[a, b]` | `["a", "b"]` | array |

An explicitly empty value (`title=""` in XML, `content=""` in HTML) stays the
empty string rather than becoming `null`. Markdown and MDX get their types from
parsing the whole frontmatter block as YAML, so the same coercions apply there
through normal YAML rules.
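
The coercions in the table can be approximated with a small function. This is a deliberately simplified stand-in for a real YAML parser, not what docmeta runs: `coerce` is a hypothetical name, and it covers only integers, decimals, `true`/`false`, and flat `[a, b]` lists, leaving out quoting, nesting, `null`, and the rest of YAML.

```python
import re


def coerce(raw):
    """Approximate YAML typing for a single raw attribute value."""
    text = raw.strip()
    if text == "":
        return ""  # an explicitly empty value stays the empty string
    if text in ("true", "false"):
        return text == "true"
    if re.fullmatch(r"[-+]?\d+", text):
        return int(text)
    if re.fullmatch(r"[-+]?(\d+\.\d*|\.\d+)", text):
        return float(text)
    if text.startswith("[") and text.endswith("]"):
        inner = text[1:-1].strip()
        # Flat lists only: split on commas and coerce each item.
        return [coerce(item) for item in inner.split(",")] if inner else []
    return text  # anything else stays a string
```

Applied to the XML example, `coerce("2")` gives the number `2` and `coerce("concept")` stays a string, matching the `version: 2` and `type: "concept"` results described for that format.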