ParsedDocument
The ParsedDocument class represents a parsed Markdown document, including its text and metadata, and provides convenience methods for inspecting it.
Definition
Source Code
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
using module ./DocumentLink.psm1
class ParsedDocument {
    [System.IO.FileInfo]$FileInfo
    [string]$RawContent
    [Markdig.Syntax.MarkdownDocument]$ParsedMarkdown
    [System.Collections.Specialized.OrderedDictionary]$FrontMatter
    [string]$Body
    [DocumentLink[]]$Links
    hidden [bool]$HasParsedLinks

    ParsedDocument() {}

    # Parses the links from the body and caches them in the Links property.
    hidden [void] ParseLinksFromBody() {
        $this.Links = [DocumentLink]::Parse($this.Body)
        | ForEach-Object -Process {
            # Add the file info to each link
            $_.Position.FileInfo = $this.FileInfo
            # Emit the link for the list
            $_
        }
        $this.HasParsedLinks = $true
    }

    # Returns the cached links, parsing them first if needed.
    [DocumentLink[]] ParsedLinks() {
        if (!$this.HasParsedLinks) {
            $this.ParseLinksFromBody()
        }
        return $this.Links
    }

    # Returns the links, reparsing them when Force is true.
    [DocumentLink[]] ParsedLinks([bool]$Force) {
        if (!$this.HasParsedLinks -or $Force) {
            $this.ParseLinksFromBody()
        }
        return $this.Links
    }

    [DocumentLink[]] InlineLinks() {
        return [DocumentLink]::FilterForInlineLinks($this.Links)
    }

    [DocumentLink[]] ReferenceLinks() {
        return [DocumentLink]::FilterForReferenceLinks($this.Links)
    }

    [DocumentLink[]] ReferenceDefinitions() {
        return [DocumentLink]::FilterForReferenceDefinitions($this.Links)
    }

    [DocumentLink[]] ReferenceLinksAndDefinitions() {
        return [DocumentLink]::FilterForReferenceLinksAndDefinitions($this.Links)
    }

    [DocumentLink[]] UndefinedReferenceLinks() {
        return [DocumentLink]::FilterForUndefinedReferenceLinks($this.Links)
    }

    [DocumentLink[]] UnusedReferenceLinkDefinitions() {
        return [DocumentLink]::FilterForUnusedReferenceLinkDefinitions($this.Links)
    }

    [DocumentLink[]] ValidReferenceLinksAndDefinitions() {
        return [DocumentLink]::FilterForValidReferenceLinksAndDefinitions($this.Links)
    }

    # Renders the body as a VT100-encoded string for display in a terminal.
    [string] ToDecoratedString() {
        return $this.Body
        | ConvertFrom-Markdown -AsVT100EncodedString
        | Select-Object -ExpandProperty VT100EncodedString
    }
}
The ParsedDocument class is used throughout the Documentarian module as the model and interface representing a Markdown file. It includes the file’s metadata, raw content, the Markdown AST for the document, its front matter, body text, and the list of links in the document. It also includes several convenience methods for inspecting the document.
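The source code above parses links lazily: ParsedLinks() only calls the hidden ParseLinksFromBody() method when HasParsedLinks is false or when Force is true. The following sketch shows that caching pattern in isolation. The LazyLinkDocument class, its ParseCount counter, and the naive inline-link regular expression are hypothetical stand-ins for illustration. They aren't part of Documentarian, which parses links with [DocumentLink]::Parse().

```powershell
# Hypothetical stand-in for ParsedDocument that keeps only the lazy-parse logic.
class LazyLinkDocument {
    [string]$Body
    [string[]]$Links
    [int]$ParseCount                 # Illustration only: counts how often parsing runs
    hidden [bool]$HasParsedLinks

    LazyLinkDocument([string]$Body) {
        $this.Body = $Body
    }

    hidden [void] ParseLinksFromBody() {
        # Assumption: a naive pattern for inline links such as [text](url).
        $pattern = '\[[^\]]+\]\([^)]+\)'
        $found = foreach ($match in [regex]::Matches($this.Body, $pattern)) {
            $match.Value
        }
        $this.Links = $found
        $this.ParseCount++
        $this.HasParsedLinks = $true
    }

    [string[]] ParsedLinks() {
        return $this.ParsedLinks($false)
    }

    [string[]] ParsedLinks([bool]$Force) {
        if (!$this.HasParsedLinks -or $Force) {
            $this.ParseLinksFromBody()
        }
        return $this.Links
    }
}

$doc = [LazyLinkDocument]::new(
    'See [the docs](https://example.com/docs) and [the repo](https://example.com/repo).'
)
$null = $doc.ParsedLinks()        # First call parses the body
$null = $doc.ParsedLinks()        # Second call reuses the cached list
$null = $doc.ParsedLinks($true)   # Force reparses even though links are cached
$doc.ParseCount                   # Parsing ran twice, not three times
```

In this sketch, passing $true for Force is the way to refresh the cached list after the Body changes.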
Examples
1. Getting the parsed changelog
This example creates a ParsedDocument from the project’s changelog, which you can then inspect with its properties and methods.
Get-Document ./CHANGELOG.md
FileInfo : C:\code\pwsh\Documentarian\Source\Modules\Documentarian\CHANGELOG.md
RawContent : ---
title: Changelog
weight: 0
description: |
All notable changes to the **Documentarian** module
are documented in this file.
This changelog's format is based on [Keep a
Changelog][01] and this project adheres to
[Semantic Versioning][02].
For releases before `1.0.0`, this project uses the
following convention:
- While the major version is `0`, the code is
considered unstable.
- The minor version is incremented when a
backwards-incompatible change is introduced.
- The patch version is incremented when a
backwards-compatible change or bug fix is introduced.
[01]: https://keepachangelog.com/en/1.0.0/
[02]: https://semver.org/spec/v2.0.0.html
---
## Unreleased
- Scaffolded initial project.
ParsedMarkdown : {Markdig.Extensions.Yaml.YamlFrontMatterBlock,
Markdig.Syntax.HeadingBlock,
Markdig.Syntax.ListItemBlock,
Markdig.Extensions.AutoIdentifiers.HeadingLinkReferenceDefinition}
FrontMatter : {[title, Changelog], [weight, 0], [description, All
notable changes to the **Documentarian** module are
documented in this file.
This changelog's format is based on [Keep a
Changelog][01] and this project adheres to
[Semantic Versioning][02].
For releases before `1.0.0`, this project uses the
following convention:
- While the major version is `0`, the code is
considered unstable.
- The minor version is incremented when a
backwards-incompatible change is introduced.
- The patch version is incremented when a
backwards-compatible change or bug fix is introduced.
[01]: https://keepachangelog.com/en/1.0.0/
[02]: https://semver.org/spec/v2.0.0.html
]}
Body : ## Unreleased
- Scaffolded initial project.
Links :
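In the output above, RawContent holds the whole file, while FrontMatter and Body hold its two parts. The sketch below shows one naive way such a split could be made with a regular expression. It isn't Documentarian's implementation, which this page doesn't show. The Split-FrontMatter function name and the sample text are hypothetical, and the pattern assumes the front matter is delimited by lines of exactly three dashes.

```powershell
# Hypothetical helper: splits raw Markdown into front matter text and body text.
function Split-FrontMatter {
    param(
        [Parameter(Mandatory)]
        [string]$RawContent
    )

    # The front matter starts with '---' on the first line of the file and
    # ends at the next line that begins with '---'.
    $pattern = '(?s)\A---\r?\n(?<FrontMatter>.*?)\r?\n---\r?\n?(?<Body>.*)\z'
    $match = [regex]::Match($RawContent, $pattern)

    if ($match.Success) {
        [pscustomobject]@{
            FrontMatter = $match.Groups['FrontMatter'].Value
            Body        = $match.Groups['Body'].Value
        }
    } else {
        # No front matter found: treat the whole file as the body.
        [pscustomobject]@{
            FrontMatter = ''
            Body        = $RawContent
        }
    }
}

$raw = "---`ntitle: Changelog`nweight: 0`n---`n## Unreleased`n`n- Scaffolded initial project.`n"
$parts = Split-FrontMatter -RawContent $raw
$parts.FrontMatter   # The YAML text between the delimiters
$parts.Body          # The Markdown after the closing delimiter
```

The sketch returns the front matter as raw text. Turning it into key-value data like the FrontMatter property would need a YAML parser, which is left out here.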
Constructors
ParsedDocument()
- Initializes a new instance of the ParsedDocument class.
Methods
InlineLinks()
- Returns every inline (non-reference) link from the document.
ParsedLinks()
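- Returns the parsed links from the document, parsing the body first if the links haven’t been parsed yet. The ParsedLinks([bool]$Force) overload reparses the links when Force is true.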
ReferenceDefinitions()
- Returns every reference link definition from the document.
ReferenceLinks()
- Returns every reference link from the document.
ReferenceLinksAndDefinitions()
- Returns every reference link and reference link definition from the document.
ToDecoratedString()
- Returns the VT100-encoded string representing the rendered Markdown for the document.
UndefinedReferenceLinks()
- Returns every reference link in the document that doesn’t have a matching reference link definition.
UnusedReferenceLinkDefinitions()
- Returns every reference link definition in the document that doesn’t have at least one matching reference link.
ValidReferenceLinksAndDefinitions()
- Returns every reference link in the document that has a matching definition and every reference link definition that is used by at least one reference link.
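The relationship between undefined reference links, unused definitions, and valid pairs can be sketched with plain objects. This is an illustration of the matching logic only. The Kind and ReferenceID property names are hypothetical, because the DocumentLink class and its filter methods aren't shown on this page.

```powershell
# Hypothetical link records standing in for DocumentLink instances.
$links = @(
    [pscustomobject]@{ Kind = 'ReferenceLink';       ReferenceID = '01' }
    [pscustomobject]@{ Kind = 'ReferenceLink';       ReferenceID = '03' }
    [pscustomobject]@{ Kind = 'ReferenceDefinition'; ReferenceID = '01' }
    [pscustomobject]@{ Kind = 'ReferenceDefinition'; ReferenceID = '02' }
)

$references  = @($links | Where-Object Kind -EQ 'ReferenceLink')
$definitions = @($links | Where-Object Kind -EQ 'ReferenceDefinition')

# A reference link is undefined when no definition shares its ID.
$undefined = @(
    $references | Where-Object { $_.ReferenceID -notin $definitions.ReferenceID }
)

# A definition is unused when no reference link shares its ID.
$unused = @(
    $definitions | Where-Object { $_.ReferenceID -notin $references.ReferenceID }
)

# Valid items are the references with a definition and the definitions in use.
$valid = @(
    $references  | Where-Object { $_.ReferenceID -in $definitions.ReferenceID }
    $definitions | Where-Object { $_.ReferenceID -in $references.ReferenceID }
)

$undefined.ReferenceID   # 03
$unused.ReferenceID      # 02
$valid.ReferenceID       # 01, 01
```

The -in and -notin operators compare strings case-insensitively by default, which suits reference labels because Markdown matches them without regard to case.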
Properties
- Body
- The Body property contains the Markdown content of the document as a single string with the front matter removed.
- FileInfo
- The FileInfo property contains the document’s metadata from the file system.
- FrontMatter
- The FrontMatter property contains the key-value data from the document’s front matter. The data is stored as an ordered dictionary so it can be modified and written back to the file without changing the order of the keys in the document.
- Links
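- The Links property contains the list of all links discovered in the document’s Markdown content. The class populates it when the links are parsed, such as by calling ParsedLinks(), so it’s empty before then.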
- ParsedMarkdown
- The ParsedMarkdown property contains the abstract syntax tree (AST) representation of the document’s Markdown returned by Markdig.
- RawContent
- The RawContent property contains the document’s content as a single string, including the front matter and Markdown exactly as it existed in the file when it was parsed.
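The reason for storing FrontMatter as an ordered dictionary can be seen in a short, self-contained sketch. The keys and values below are sample data loosely modeled on the changelog example, not output from Documentarian.

```powershell
# [ordered] creates a System.Collections.Specialized.OrderedDictionary,
# the same type as the FrontMatter property.
$frontMatter = [ordered]@{
    title       = 'Changelog'
    weight      = 0
    description = 'All notable changes are documented in this file.'
}

$frontMatter['weight'] = 10      # Updating a value keeps the key's position
$frontMatter['draft']  = $true   # New keys are appended at the end

# The keys come back in insertion order: title, weight, description, draft
$frontMatter.Keys -join ', '
```

A plain hashtable (@{}) doesn't guarantee key order, so writing modified data back to a file could reshuffle the front matter.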
Last modified March 3, 2023: (MAINT) Rename Source folder to Projects (8b45aed)