Pandoc Backend

Overview

The pandoc path is **parse with pandoc, then apply snapper**. It does not guess at structure from markdown-style lines the way the built-in parsers do.

1. Pandoc reads the file (any format it supports) into its document AST.
2. Snapper walks that AST and reflows only prose-bearing nodes (Para / Plain); nodes pandoc already classified as structure (Header, CodeBlock, Table, …) are left alone.

That is how snapper can format everything pandoc can read (typst, asciidoc, docx, html, …) without inventing per-format line heuristics.
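The walk described above can be sketched in a few lines of Python over pandoc's JSON AST. This is an illustration only, not snapper's implementation: the names `prose_nodes` and `inline_text`, the choice of which container blocks to descend into, and the hand-written sample document are all assumptions made for the example.

```python
import json

PROSE = {"Para", "Plain"}
# Assumption: blocks that merely wrap other blocks are descended into.
CONTAINERS = {"BlockQuote", "BulletList", "OrderedList", "Div"}


def inline_text(inlines):
    """Flatten a small subset of inline nodes (Str, Space, SoftBreak) to text."""
    parts = []
    for node in inlines:
        if node["t"] == "Str":
            parts.append(node["c"])
        elif node["t"] in ("Space", "SoftBreak"):
            parts.append(" ")
    return "".join(parts)


def prose_nodes(value):
    """Yield Para/Plain nodes, descending only through container blocks."""
    if isinstance(value, list):
        for item in value:
            yield from prose_nodes(item)
    elif isinstance(value, dict):
        tag = value.get("t")
        if tag in PROSE:
            yield value
        elif tag in CONTAINERS:
            yield from prose_nodes(value.get("c"))
        # Anything else (Header, CodeBlock, Table, ...) is left alone.


# Hand-written sample in the shape that `pandoc -t json` emits.
SAMPLE = json.loads("""
{"pandoc-api-version": [1, 23], "meta": {}, "blocks": [
  {"t": "Header", "c": [1, ["title", [], []], [{"t": "Str", "c": "Title"}]]},
  {"t": "Para", "c": [{"t": "Str", "c": "Hello"}, {"t": "SoftBreak"},
                      {"t": "Str", "c": "world."}]},
  {"t": "CodeBlock", "c": [["", [], []], "x = 1"]},
  {"t": "BlockQuote", "c": [
    {"t": "Para", "c": [{"t": "Str", "c": "Quoted"}, {"t": "Space"},
                        {"t": "Str", "c": "text."}]}]}
]}
""")


def prose_texts(doc):
    """Return the text of every prose node that a reflow pass would touch."""
    return [inline_text(node["c"]) for node in prose_nodes(doc["blocks"])]
```

Calling `prose_texts(SAMPLE)` picks up the top-level paragraph and the paragraph inside the block quote, while the header and the code block never reach the reflow step.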

Step 1 (the pandoc parse) can run through any of three backends; the step-2 walker is the same in every case:

Auto (--pandoc-backend auto, the default): prefer in-process FFI when libsnapper_pandoc loads, otherwise fall back to the CLI.

The in-process path is the successor path for multi-format work: it amortizes GHC RTS startup across files and avoids spawning one process per file.

CLI (--pandoc-backend cli): run pandoc -t json, which gives the full reader set of the installed binary.
FFI (--pandoc-backend ffi): require libsnapper_pandoc, with an explicit error if it is missing.
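The selection rules for the three backend values can be written out as a small decision function. The sketch below is in Python purely for illustration; `choose_backend`, `BackendError`, and the wording of the CLI error are hypothetical, and only the "pandoc FFI library unavailable" message comes from this page.

```python
class BackendError(RuntimeError):
    """Raised when the requested backend cannot be used (fail-closed)."""


def choose_backend(requested, ffi_loaded, pandoc_on_path):
    """Resolve a --pandoc-backend value to the backend that would run.

    requested      -- "auto", "cli", or "ffi"
    ffi_loaded     -- True if libsnapper_pandoc could be loaded
    pandoc_on_path -- True if a pandoc executable was found on PATH
    """
    if requested not in ("auto", "cli", "ffi"):
        raise ValueError(f"unknown backend: {requested!r}")

    # auto and ffi both use the in-process library when it loads.
    if requested in ("auto", "ffi") and ffi_loaded:
        return "ffi"

    # ffi was demanded explicitly but the library is missing: no fallback.
    if requested == "ffi":
        raise BackendError("pandoc FFI library unavailable")

    # cli, or auto falling back to the CLI.
    if pandoc_on_path:
        return "cli"

    # Hypothetical message; the real wording is not given on this page.
    raise BackendError("pandoc not found on PATH")
```

The point of the shape is that no branch returns a "treat everything as prose" result: every unusable configuration ends in an error.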

Requirements

**CLI backend.** Pandoc 2.x or 3.x must be on PATH. With --use-pandoc, a missing or failing pandoc is an explicit error; snapper never silently falls back to treating the whole file as prose.
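As an illustration of the fail-closed CLI behaviour, here is a minimal Python sketch that shells out to pandoc -t json and raises on either failure mode. It is not snapper's code: `pandoc_ast`, `PandocError`, the error strings, and the injectable `which` / `run` parameters (present only so the function can be exercised without pandoc installed) are assumptions.

```python
import json
import shutil
import subprocess


class PandocError(RuntimeError):
    """Raised instead of silently degrading when pandoc cannot be used."""


def pandoc_ast(path, which=shutil.which, run=subprocess.run):
    """Return pandoc's JSON AST for `path`, or raise PandocError."""
    exe = which("pandoc")
    if exe is None:
        # Missing binary: explicit error, never an all-prose fallback.
        raise PandocError("pandoc not found on PATH")

    # pandoc infers the input format from the file extension.
    proc = run([exe, "-t", "json", path], capture_output=True, text=True)
    if proc.returncode != 0:
        # Failing binary: also an explicit error.
        raise PandocError(f"pandoc failed: {proc.stderr.strip()}")

    return json.loads(proc.stdout)
```

With pandoc installed, `pandoc_ast("paper.typ")` would return a dict whose "blocks" key holds the block list that the walker consumes.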

**FFI backend (dynamic).** Build native/snapper-pandoc, then set SNAPPER_PANDOC_LIB to the path of libsnapper_pandoc.so (or put the library on the search path). If the library is missing, --pandoc-backend ffi fails with the error "pandoc FFI library unavailable". This is fail-closed: snapper never silently treats the file as all prose.

**FFI backend (co-linked / one binary).** The pandoc-colink feature is a build-time path: produce libsnapper_pandoc.a with native/snapper-pandoc/build-static.sh (ghc -staticlib), then let Cargo link that archive into a single snapper executable (with --gc-sections). The resulting binary needs no SNAPPER_PANDOC_LIB discovery and carries no multi-hundred-MB graph of libHS* libraries on its RUNPATH.

cd native/snapper-pandoc && ./build-static.sh
export SNAPPER_PANDOC_LIB_DIR=$PWD/lib
cargo build --release --features "cli,pandoc,pandoc-colink"
snapper --use-pandoc --pandoc-backend ffi paper.md

See native/snapper-pandoc/README.md.

Usage

# Pandoc parses; snapper reflows prose from the AST (auto = FFI if available)
snapper --use-pandoc paper.typ
snapper --use-pandoc --pandoc-backend auto paper.md
snapper --use-pandoc --pandoc-backend cli guide.adoc
snapper --use-pandoc --pandoc-backend ffi paper.md

The native line parsers remain the default until the product default is switched. Use --use-pandoc when you need a format that has no built-in parser, or when you want structure taken from the AST; once libsnapper_pandoc is installed, prefer FFI for lower latency.

**Speed.** A warm in-process FFI call costs a few times what the native parsers do; a cold process has to reload the GHC RTS, which takes tens of milliseconds. Snapper also caches pandoc JSON ASTs by content hash, in memory and on disk under $XDG_CACHE_HOME/snapper/pandoc-ast, so re-formatting the same source skips pandoc entirely. Set SNAPPER_PANDOC_CACHE=0 to disable the cache, or SNAPPER_PANDOC_CACHE_DIR to override its directory.
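The caching behaviour can be sketched as follows. This is a Python illustration under stated assumptions, not snapper's code: the page does not say which hash is used or what goes into the key, so SHA-256 over the input format plus the source bytes is a guess, the `~/.cache` fallback is the XDG default rather than documented snapper behaviour, and the names `cache_dir`, `cache_key`, and `AstCache` are hypothetical. Only the environment variable names and the `snapper/pandoc-ast` subdirectory come from this page.

```python
import hashlib
from pathlib import Path


def cache_dir(env, home):
    """Resolve the on-disk cache directory, or None if caching is disabled."""
    if env.get("SNAPPER_PANDOC_CACHE") == "0":
        return None
    override = env.get("SNAPPER_PANDOC_CACHE_DIR")
    if override:
        return Path(override)
    # Assumption: fall back to the XDG default of ~/.cache when unset.
    base = env.get("XDG_CACHE_HOME") or str(Path(home) / ".cache")
    return Path(base) / "snapper" / "pandoc-ast"


def cache_key(source, input_format):
    """Content hash of the source bytes (hash choice is an assumption)."""
    digest = hashlib.sha256()
    digest.update(input_format.encode("utf-8"))
    digest.update(b"\0")  # separator so format and content cannot run together
    digest.update(source)
    return digest.hexdigest()


class AstCache:
    """In-memory layer only; a disk layer would store entries under cache_dir()."""

    def __init__(self):
        self._mem = {}

    def get_or_parse(self, source, input_format, parse):
        """Return the cached AST, calling `parse` only on a cache miss."""
        key = cache_key(source, input_format)
        if key not in self._mem:
            self._mem[key] = parse(source)
        return self._mem[key]
```

Because the key depends only on content and format, renaming or touching a file without changing its bytes still hits the cache in this sketch.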

When to use pandoc vs built-in parsers

**Built-in** (default): fast, source-oriented parsers for Org / LaTeX / Markdown / RST (e.g. a whole ATX heading line is treated as structure).
**Pandoc**: use it when you want structure from pandoc’s model and coverage of formats that have no built-in parser.

The two paths are different pipelines, not two ways to fake the same source lines.

Limitations

Pandoc strips comments, so snapper:off / snapper:on pragmas do not work on the pandoc path.
The round-trip goes through the AST, so it is not a perfect source-preserving rewrite of every markup character.
The FFI library is optional; wasm/editor bundles do not embed GHC/pandoc.