Glossary
The terms below are defined as docmeta uses them. Where a concept has a fuller reference page, the entry links to it.
frontmatter
A block of metadata at the very top of a document, set off from the body. In
Markdown and MDX it’s a YAML block fenced by ---; other formats carry it
differently (AsciiDoc attributes, an RST field list, XML or HTML elements). docmeta
reads the frontmatter, not the body. The block must be the first thing in the
file, or docmeta won’t recognize it. See Supported formats
for where each format keeps its frontmatter.
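As a rough illustration of the Markdown case, the sketch below separates a `---`-fenced block from the body. The function name `split_frontmatter` is hypothetical, and this is not docmeta's own extractor. It only shows the "first thing in the file" rule, under the assumption that a missing closing fence means there is no frontmatter.

```python
from typing import Optional, Tuple


def split_frontmatter(text: str) -> Tuple[Optional[str], str]:
    """Return (frontmatter, body); frontmatter is None when no block opens the file."""
    lines = text.splitlines()
    # The opening fence must be the very first line of the file.
    if not lines or lines[0].strip() != "---":
        return None, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            # Everything between the fences is frontmatter, the rest is body.
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    # Assumption: an opening fence without a closing fence is not frontmatter.
    return None, text


doc = "---\ntype: guide\ntitle: Hello\n---\n# Body\n"
frontmatter, body = split_frontmatter(doc)
```

A document that starts with a blank line before the fence yields `None` here, which matches the rule that the block must come first.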
metadata
The key/value pairs docmeta validates: type, title, timestamp, and so on.
Metadata is what an extractor pulls out of a document’s
frontmatter; everything in docmeta after extraction operates on
this generic key/value shape, regardless of the original format. docmeta checks
the presence and format of metadata, never the document’s prose.
extractor
The format-specific component that reads metadata out of a document. docmeta has
one extractor per input format — Markdown, MDX, AsciiDoc, reStructuredText, XML,
and HTML. The extractor is the only part of docmeta that knows about a file’s
format; once it produces metadata, validation and reporting work the same for
every format. You can force a specific extractor with --as, which is required
when reading from stdin. See Supported formats.
schema
A JSON Schema document that declares which metadata fields are required, which are optional, and what shape each value must take. docmeta validates a file’s metadata against one or more schemas. A schema can be shipped with docmeta (a builtin), a local file, or a remote URL.
schema set
The one-or-more schemas resolved for a single file. docmeta resolves a schema set per file, then validates the file’s metadata against every schema in it. A file passes only when it satisfies all schemas in its set, and each reported error is tagged with the schema that produced it. See Schema resolution.
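The "all schemas must pass, errors are tagged" rule can be sketched as follows. This is a toy model, not docmeta's implementation: `check_required` stands in for a real JSON Schema validator and only looks at `required`, and the `./team.schema.json` reference and `owner` field are made up for the example.

```python
from typing import Dict, List, Tuple


def check_required(metadata: dict, schema: dict) -> List[str]:
    """Toy stand-in for validation: report each required field that is absent."""
    return [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in metadata
    ]


def validate_against_set(
    metadata: dict, schema_set: Dict[str, dict]
) -> Tuple[bool, List[Tuple[str, str]]]:
    """Validate against every schema; tag each error with its schema reference."""
    errors: List[Tuple[str, str]] = []
    for ref, schema in schema_set.items():
        for message in check_required(metadata, schema):
            errors.append((ref, message))
    # The file passes only when no schema in the set produced an error.
    return (not errors, errors)


schema_set = {
    "google:okf:0.1": {"required": ["type"]},
    "./team.schema.json": {"required": ["type", "owner"]},  # hypothetical schema
}
passed, errors = validate_against_set({"type": "guide"}, schema_set)
```

Here the metadata satisfies the first schema but not the second, so the file fails and the single error carries the second schema's reference.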
schema reference
A pointer to a schema: what appears in a --schema flag, a document’s
$schema key, or config. docmeta classifies each
reference into one of three kinds by its shape:
- builtin: a vendor:name:version id such as google:okf:0.1, resolved to a schema shipped with docmeta.
- file: a .json path or any path with a separator, such as ./my.schema.json, resolved from the local filesystem.
- url: an http:// or https:// URL, fetched over the network.
See Reference kinds.
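A minimal sketch of classifying by shape is shown below. The function name, the regular expression for a builtin id, the order of the checks, and the error for unmatched input are all assumptions made for illustration; docmeta's actual rules may differ in the details.

```python
import re

# Assumed shape of a builtin id: three colon-separated parts, e.g. google:okf:0.1.
BUILTIN_ID = re.compile(r"^[A-Za-z0-9_.-]+:[A-Za-z0-9_.-]+:[A-Za-z0-9_.-]+$")


def classify_reference(ref: str) -> str:
    """Return 'url', 'builtin', or 'file' based only on the reference's shape."""
    if ref.startswith(("http://", "https://")):
        return "url"
    if BUILTIN_ID.match(ref):
        return "builtin"
    # A .json suffix or any path separator marks a local file.
    if ref.endswith(".json") or "/" in ref or "\\" in ref:
        return "file"
    raise ValueError(f"unrecognized schema reference: {ref!r}")


kinds = [
    classify_reference(r)
    for r in ("google:okf:0.1", "./my.schema.json", "https://example.com/s.json")
]
```

Checking for a URL first keeps the colon in `https:` from being mistaken for part of a builtin id.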
$schema (document key)
A reserved key a document can put in its own metadata to name the schema(s) it
should be validated against — a single reference or a list. Because it’s a
docmeta directive rather than real metadata, $schema is stripped before
validation, so a schema with additionalProperties: false won’t flag it. In
the precedence chain it sits above config but below a --schema
flag. See The $schema key in a file.
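To make the "stripped before validation" step concrete, here is a small sketch. The helper name `split_directive` is hypothetical and the code only models the behavior described above: it accepts a single reference or a list, and returns the metadata without the `$schema` key.

```python
from typing import List, Tuple


def split_directive(metadata: dict) -> Tuple[List[str], dict]:
    """Separate the $schema directive from the metadata that gets validated."""
    raw = metadata.get("$schema")
    if raw is None:
        refs: List[str] = []
    elif isinstance(raw, str):
        refs = [raw]  # a single reference
    else:
        refs = list(raw)  # a list of references
    # Copy everything except the directive, leaving the input untouched.
    remaining = {key: value for key, value in metadata.items() if key != "$schema"}
    return refs, remaining


refs, remaining = split_directive({"$schema": "google:okf:0.1", "type": "guide"})
```

Because `remaining` no longer contains `$schema`, a schema that sets additionalProperties: false has nothing extra to reject.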
$schema (meta-schema URI)
A different use of the same key, found inside a schema rather than a document.
There, $schema is the meta-schema URI that declares the schema’s
dialect — for example
https://json-schema.org/draft/2020-12/schema. docmeta reads it to decide which
validator to compile the schema with. Don’t confuse it with the document
$schema above: one selects a dialect, the other selects
which schema validates a document.
dialect
A version of the JSON Schema specification: Draft 2020-12, 2019-09, draft-07,
draft-06, or draft-04. docmeta auto-detects a schema’s dialect from its
$schema meta-schema URI and compiles it with the matching
validator; a missing or unrecognized URI falls back to Draft 2020-12. Schemas of
different dialects can coexist in one run. See Dialects.
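Dialect detection with a fallback might look roughly like this. The lookup table uses the standard JSON Schema meta-schema URIs; the function name and the choice to ignore a trailing `#` are assumptions for the sketch, not a description of docmeta's internals.

```python
# Standard meta-schema URIs, stored without the trailing "#" some drafts use.
KNOWN_DIALECTS = {
    "https://json-schema.org/draft/2020-12/schema": "2020-12",
    "https://json-schema.org/draft/2019-09/schema": "2019-09",
    "http://json-schema.org/draft-07/schema": "draft-07",
    "http://json-schema.org/draft-06/schema": "draft-06",
    "http://json-schema.org/draft-04/schema": "draft-04",
}
DEFAULT_DIALECT = "2020-12"


def detect_dialect(schema: dict) -> str:
    """Pick a dialect from the schema's $schema URI, defaulting to Draft 2020-12."""
    uri = schema.get("$schema")
    if not isinstance(uri, str):
        return DEFAULT_DIALECT  # missing URI
    # Assumption: a trailing "#" is not significant when matching.
    return KNOWN_DIALECTS.get(uri.rstrip("#"), DEFAULT_DIALECT)


dialect = detect_dialect({"$schema": "http://json-schema.org/draft-07/schema#"})
```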
JSON Pointer
The standard (RFC 6901) syntax
docmeta uses to name the field an error is about. A pointer like /tags/0 means
“the first item of the tags array”; an empty pointer (shown as (root) in
pretty output) means the document as a whole — most often a required field that’s
missing entirely. Pointers appear in error output and in the instancePath field
of JSON output.
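For readers who want to follow a pointer from an error back to the value it names, a minimal RFC 6901 resolver is sketched below. The function name `resolve_pointer` and the sample metadata are illustrative only.

```python
def resolve_pointer(document, pointer: str):
    """Follow an RFC 6901 JSON Pointer through nested dicts and lists."""
    if pointer == "":
        return document  # the empty pointer names the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"pointer must be empty or start with '/': {pointer!r}")
    current = document
    for token in pointer[1:].split("/"):
        # Unescape in the order RFC 6901 requires: "~1" first, then "~0".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array items are addressed by index
        else:
            current = current[token]
    return current


metadata = {"type": "guide", "tags": ["howto", "cli"], "a/b": 1}
first_tag = resolve_pointer(metadata, "/tags/0")
```

A pointer to a missing field raises `KeyError` or `IndexError` in this sketch, which is why an error about an absent required field points at the parent (often the root) instead.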
OKF
The Open Knowledge Format, whose v0.1 schema (google:okf:0.1) ships with
docmeta as the built-in default. OKF requires only type and treats every other
field — title, description, resource, tags, timestamp — as recommended,
tolerating unknown keys. It’s what validates your files when nothing else is
configured. See the built-in OKF schema.
override
In docmeta, “override” has two related senses:
- A config overrides entry assigns a schema set to files matching a glob, so different folders can use different schemas. See Apply different schemas to different folders.
- More broadly, schema resolution is a chain of overrides: a --schema flag overrides a document’s $schema, which overrides config overrides, which overrides config schemas, which overrides the built-in default. The first source that yields a schema wins. See Precedence.
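The precedence chain can be read as a "first non-empty source wins" lookup. The sketch below models that reading; the function and parameter names are hypothetical, and each argument is assumed to be the list of references that source produced for one file (empty when the source yields nothing).

```python
from typing import List, Sequence

BUILTIN_DEFAULT = ["google:okf:0.1"]


def resolve_schema_set(
    flag_schemas: Sequence[str],
    document_schemas: Sequence[str],
    override_schemas: Sequence[str],
    config_schemas: Sequence[str],
) -> List[str]:
    """Return the schema set from the highest-precedence source that yields one."""
    # Ordered from highest to lowest precedence.
    sources = [flag_schemas, document_schemas, override_schemas, config_schemas]
    for candidate in sources:
        if candidate:
            return list(candidate)
    # Nothing configured anywhere: fall back to the built-in default.
    return list(BUILTIN_DEFAULT)


resolved = resolve_schema_set(
    flag_schemas=[],
    document_schemas=["./my.schema.json"],
    override_schemas=["google:okf:0.1"],
    config_schemas=[],
)
```

In this example the document's own `$schema` wins over the matching config override because no `--schema` flag was given.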
validation vs. extraction
Two distinct operations, exposed as two commands. Extraction reads metadata
out of a document — the get command prints named field values and validates
nothing. Validation checks that extracted metadata satisfies its
schema set — the validate command, the one CI gates on. Every
validation begins with an extraction, but you can extract without validating. See
the CLI reference.