Supported Formats

snapper classifies text into prose regions (reflowed at sentence boundaries) and structure regions (passed through unchanged). The classification depends on the format.

Org-mode (--format org)

Structure regions (preserved)

  • #+BEGIN_*

#+END_* blocks (source, example, quote, etc.)

  • :PROPERTIES:

:END: drawers

  • #+KEYWORD: directives (TITLE, AUTHOR, DATE, OPTIONS, etc.)

  • Table rows (lines starting with |)

  • Comment lines (starting with # but not #+ )

  • Headline stars and TODO keywords

  • List item markers (-, +, 1.)

  • LaTeX environments (\begin{equation}\end{equation}, \begin{align}, etc.)

  • Display math (\[\])

  • Inline export snippets (@@latex:\newpage@@, @@html:<br>@@)

Prose regions (reflowed)

  • Paragraph text

  • Headline text (after the stars and keyword)

  • List item text (after the marker)

Inline tokens (kept atomic)

These tokens within prose are not split across lines:

  • Links: [[url][description]]

  • Inline code: ~code~, ==verbatim==

  • Inline export snippets: @@backend:value@@

  • URLs: https://... (trailing sentence punctuation not swallowed)

LaTeX (--format latex)

Structure regions (preserved)

  • Preamble (everything before \begin{document})

  • Non-prose environments: equation, align, figure, table, tabular, lstlisting, verbatim, minted, tikzpicture, and their starred variants

  • Display math: \[...\]

  • Comment lines (starting with %)

  • \end{document}

Prose regions (reflowed)

  • Body text between structural elements

Markdown (--format markdown)

Structure regions (preserved)

  • Fenced code blocks (``` or ~~~)

  • Front matter (--- or +++ delimited at file start)

  • Heading markers (#, ##, etc.)

  • List item markers (-, \*, +, 1.)

Prose regions (reflowed)

  • Paragraph text

  • Heading text (after the marker)

  • List item text (after the marker)

reStructuredText (--format rst)

Structure regions (preserved)

  • Directives (.. code-block::, .. math::, .. image::, etc.) and their indented bodies

  • Literal blocks (text after :: with indented content)

  • Section titles and underlines (===, -----, etc.)

  • Field lists (:Author:, :Date:, etc.)

  • Comments (.. without a directive)

  • Grid and simple tables (lines starting with | or +)

Prose regions (reflowed)

  • Paragraph text between structural elements

Auto-detection

Extensions: .rst, .rest

Plaintext (--format plaintext)

Everything is prose. Blank lines are preserved as paragraph separators.

Sentence Detection

snapper uses Unicode UAX #29 sentence boundary detection as a baseline, then merges false splits caused by known abbreviations:

Titles

Mr., Mrs., Ms., Dr., Prof., Sr., Jr., St., Rev., Gen., etc.

Academic

Fig., Figs., Eq., Eqs., Ref., Refs., Tab., Sec., Ch., Vol., No., Thm., Lem., Prop., Def., Cor., Rem., Ex.

Latin

e.g., i.e., et al., cf., etc., viz., ibid., ca., approx.

Single initials

A., B., C., … Z.

Date and time

Jan., Feb., …, Dec., Mon., Tue., …, Sun., a.m., p.m.

Quoted and parenthesized punctuation

Sentence punctuation inside quotes or parentheses does not trigger a false split when the next word starts lowercase. For example, He said "wow!" and left. stays on one line because "!" followed by lowercase and signals a continuation, not a new sentence. Patterns handled: !", ?", .", !), ?), .), and similar combinations with single quotes or brackets.