Releases: jgm/pandoc

pandoc 3.8.3

01 Dec 10:30
@jgm jgm

I'm pleased to announce the release of pandoc 3.8.3,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/3.8.3

Source & API documentation:
http://hackage.haskell.org/package/pandoc-3.8.3

This release adds three new input formats (asciidoc, pptx, and
xlsx) and one new output format (bbcode + variants).

It fixes a number of bugs (including some regressions in 3.8).
See the changelog for full details.

API changes:

  + New exported module Text.Pandoc.Readers.AsciiDoc,
    exporting readAsciiDoc.
  + New module Text.Pandoc.Readers.Pptx, exporting readPptx.
  + New module Text.Pandoc.Readers.Xlsx, exporting readXlsx.
  + Text.Pandoc.Readers: Export readAsciiDoc, readXlsx, readPptx.
  + New module Text.Pandoc.Writers.BBCode, exporting
    writeBBCode, writeBBCodeSteam, writeBBCodeFluxBB,
    writeBBCodePhpBB, writeBBCodeHubzilla, writeBBCodeXenforo.
  + Text.Pandoc.Writers: Export writeBBCode,
    writeBBCodeSteam, writeBBCodeFluxBB, writeBBCodePhpBB,
    writeBBCodeHubzilla, writeBBCodeXenforo.
  + Text.Pandoc.Writers.Shared: Export insertCurrentSpansAtColumn,
    takePreviousSpansAtColumn and decrementTrailingRowSpans.
  + Text.Pandoc.Shared: Export allRowsEmpty and
    tableBodiesToRows.

Thanks to all who contributed, especially new contributors
Anton Antich, Asliddinbek Azizovich, and James Barlow.

Changelog:
  • Add asciidoc as an input format (#1456).

  • Add xlsx (Microsoft Excel) as an input format (Anton Antich). Each worksheet turns into a section containing a table.

  • Add pptx (PowerPoint) as new input format (Anton Antich).

  • Add bbcode as a new output format (#11242, reptee). Several variants of BBCode are also supported: bbcode_fluxbb (FluxBB), bbcode_phpbb (phpBB), bbcode_steam (Steam), bbcode_hubzilla (Hubzilla), and bbcode_xenforo (xenForo).

  • New exported module Text.Pandoc.Readers.AsciiDoc, exporting readAsciiDoc [API change].

  • New module Text.Pandoc.Readers.Pptx, exporting readPptx (Anton Antich) [API change].

  • New module Text.Pandoc.Readers.Xlsx, exporting readXlsx (Anton Antich) [API change].

  • LaTeX reader:

    • Revert \linebreak as LineBreak (#11272). \linebreak is more of a hint; it shouldn’t produce a hard break.
    • Better handling of \makeatletter in parsing raw LaTeX (#11270).
    • Fix spurious paragraph breaks in math environments (#11265, Emmanuel Ferdman). Previously, a math environment with extra space before the \end would get rendered with a blank line, which LaTeX treats as a paragraph break.
    • Change type on rawLaTeXParser in Text.Pandoc.LaTeX.Parsing. The preparser doesn’t need to return a value.
    • Fix rawTeXParser (#11253). Make macro expansion in raw LaTeX depend on the setting of the latex_macros extension. Previously macros were always expanded, even in raw TeX in markdown. In addition, there was previously a bug that caused content to be garbled in certain cases.
    • Handle ifstrequal at a lower level, like the other if commands (#11253).
    • Move ifstrequal, iftoggle, etc., which were misplaced in environments, to blockCommands, so these commands work properly.
  • Docx reader:

    • Handle REF link instruction (#11296, Ezwal).
    • Check recursively for caption styles (Albert Krewinkel). The docx reader uses caption styles to identify figures and captioned tables. It now checks for known caption styles in the full styles hierarchy of a paragraph instead of just checking the style directly. This makes it possible to recognize caption styles that are built on top of the basic caption style, as is sometimes the case in sophisticated styles.
  • Markdown reader:

    • Fix performance issue in links with ' (#10880).
  • Typst reader:

    • Handle document metadata and #title (jgm/typst-hs#80). Note that previously, the typst reader never returned document metadata. Now it does, even if the typst document does not contain a #title function that would result in actually printing the title block.
  • Djot reader:

    • Add Space elements (#11250). Previously we just got big Str elements with spaces included. But many pandoc writers assume that breakable spaces will be Space elements, and this is also required for automatic wrapping.
  • RST reader:

    • Correctly handle intraword emphasis (#11309).
  • Text.Pandoc.Readers:

    • Export readAsciiDoc, readXlsx, readPptx [API change].
  • New module Text.Pandoc.Writers.BBCode, exporting writeBBCode, writeBBCodeSteam, writeBBCodeFluxBB, writeBBCodePhpBB, writeBBCodeHubzilla, writeBBCodeXenforo [API change].

  • LaTeX writer:

    • Make level 1-3 headings work inside blockquotes (#11281, James Barlow).
    • Remove split from list of math environments (#11274).
    • Improve handling of math environments in tex math (#11266).
  • HTML writer:

    • Add reveal.js scroll and scrollSnap options to writer and template (#10052, Asliddinbek Azizovich).
    • Use ‘defer’ when including mathjax script, as recommended in MathJax docs (#11292).
  • ANSI writer:

    • Apply row spans in tables (#10149, Tuong Nguyen Manh). The ANSI writer is now able to keep track of row spans and apply them in rows.
  • Pptx writer:

    • Handle reference doc without slides (#7536, Tuong Nguyen Manh).
  • AsciiDoc writer:

    • Add more table features (#11267, Tuong Nguyen Manh): Row span and column span, footer row, individual horizontal cell alignment.
  • Typst template:

    • Fix font for compatibility with typst 0.14, which doesn’t permit an empty array for font (#11238).
    • Re-add columns to typst template (#11259), fixing a pandoc 3.8 regression.
    • Fix syntax for bibliography inclusion (#11233, Mickaël Canouil). Previously the syntax was wrong when multiple bibliography files were specified. Typst expects an array.
  • Text.Pandoc.Writers:

    • Export writeBBCode, writeBBCodeSteam, writeBBCodeFluxBB, writeBBCodePhpBB, writeBBCodeHubzilla, writeBBCodeXenforo [API change].
  • Text.Pandoc.Writers.Shared:

    • Add functions insertCurrentSpansAtColumn, takePreviousSpansAtColumn and decrementTrailingRowSpans for applying and keeping track of RowSpans over multiple rows (#10149, Tuong Nguyen Manh). [API change]
  • Text.Pandoc.Logging:

    • Change message for missing HTML title warning (#11307). Suggest setting the pagetitle variable instead of setting title in metadata.
  • Lua subsystem:

    • Preserve common state of custom Lua readers (Albert Krewinkel). The common state is transferred to Lua when calling a custom Lua reader, and is now also transferred back after the reader has finished. This ensures that info messages, warnings, and mediabag entries are available to the main program and all subsequent processing steps.
  • Text.Pandoc.PDF:

    • Avoid converting SVG to PDF when non-TeX PDF engine is used (#11275). This fixes a 3.8 regression, which caused documents with SVGs to raise an error when converted to PDF using WeasyPrint.
    • Fix a 3.8 regression with typst and smart quotes (#11256). Before 3.8, the default behavior when producing a PDF with -t typst was to produce smart quotes according to typst’s defaults. (This could be defeated by specifying -t typst-smart.) This behavior broke in 3.8 because of a change to Text.Pandoc.PDF. This change caused smart to be disabled for all formats when producing PDFs, when before it was only disabled for TeX-based formats (to avoid bad ligatures). This commit restores the old behavior. Possibly the regression also affects other non-TeX formats, e.g. HTML.
  • Text.Pandoc.Shared:

    • Add functions allRowsEmpty and tableBodiesToRows from the RST writer for reuse in other writers. (Tuong Nguyen Manh) [API change].
  • Text.Pandoc.Citeproc:

    • Allow formatting in locator to be transmitted to citeproc. We do this indirectly, by rendering the formatting using the HTML tags that citeproc recognizes. Fixes jgm/citeproc#68 and jgm/citeproc#163. Note that formatting is only possible for locators given in the explicit form, surrounded by curly braces. It won’t work for implicit locators, since these expect number-like expressions.
  • New non-exported module Text.Pandoc.Readers.OOXML.Shared containing functions factored out from Text.Pandoc.Readers.Docx.Util (Anton Antich).

  • Tests: The common file nativeDiff has been extracted from the Docx and Pptx test files and put in Tests.Helpers.

  • Use asciidoc 0.1, djot 0.1.2.4, texmath 0.13.0.2, typst 0.8.1, citeproc 0.12.

  • MANUAL.txt:

    • Improve implicit_figure documentation (#11082).
    • Give both forms of options when referring to them (#11306).
  • Update INSTALL.md (#11271).
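
To illustrate the Djot reader change above (Space elements instead of big Str elements), here is a minimal Python sketch that splits one string into Str and Space inlines in the shape used by pandoc's JSON AST. The helper name `split_str` is hypothetical, and the real reader is written in Haskell and may treat other kinds of whitespace differently.

```python
import re


def split_str(text):
    """Split text into Str and Space inlines (pandoc JSON AST shapes).

    Each run of spaces becomes a single Space element, so writers that
    wrap lines at Space elements have somewhere to break.
    """
    inlines = []
    for token in re.findall(r" +|[^ ]+", text):
        if token.startswith(" "):
            inlines.append({"t": "Space"})
        else:
            inlines.append({"t": "Str", "c": token})
    return inlines
```

Calling `split_str("big Str element")` yields three Str elements separated by two Space elements, rather than one Str containing the spaces.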

pandoc 3.8.2.1

20 Oct 10:48
@jgm jgm

I'm pleased to announce the release of pandoc 3.8.2.1,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/3.8.2.1

Source & API documentation:
http://hackage.haskell.org/package/pandoc-3.8.2.1

This is primarily a bug-fix release. There are no API changes.

The main motivation for the release is to fix a serious performance
regression with --citeproc and the default CSL style,
chicago-author-date.csl. This style was recently revised in a way that
took much longer for citeproc to parse. This version of pandoc is
built with citeproc 0.11, which fixes the performance regression.

One new feature (added by Albert Krewinkel) is that the Org reader now
parses parameter lists on unknown blocks, and also supports dynamic
blocks.
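
The changelog entry for this feature describes a three-step fallback: lisp-style pairs (:key value), then python-style pairs (key=value), then the whole remaining line as a single parameter. A rough Python sketch of that order follows. The function name `parse_block_params` and the fallback key name `parameter` are assumptions made for illustration, quoted values are not handled, and the actual Org reader is implemented in Haskell.

```python
def parse_block_params(rest):
    """Parse the rest of a '#+begin_myblock ...' line into a dict.

    Tries lisp-style (:key value) first, then python-style (key=value),
    and finally keeps the whole line under one hypothetical key.
    """
    rest = rest.strip()
    if not rest:
        return {}
    tokens = rest.split()

    # 1. lisp-style: tokens alternate between ':key' and 'value'
    keys, values = tokens[0::2], tokens[1::2]
    if len(tokens) % 2 == 0 and all(k.startswith(":") and len(k) > 1 for k in keys):
        return {k[1:]: v for k, v in zip(keys, values)}

    # 2. python-style: every token has the form 'key=value'
    pairs = [t.partition("=") for t in tokens]
    if all(key and sep for key, sep, _ in pairs):
        return {key: value for key, _, value in pairs}

    # 3. fallback: the entire remaining line as a single parameter
    return {"parameter": rest}
```

Under these assumptions, `:width 50 :align left` and `width=50 align=left` produce the same dictionary, while free text such as `some free text` is kept whole.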

Thanks to all who contributed, especially new contributors
Emmanuel Ferdman, FoxChillz, mourino, and priiduonu.

Changelog:
  • HTML reader: allow blank space between open and close iframe.

  • RTF reader: improve hyperlink parsing (#11211).

  • Org reader:

    • Parse parameter lists on unknown blocks (#11188, Albert Krewinkel). The reader tries to parse the rest of the opening line of a block, e.g., #+begin_myblock …, as a parameters list. It first assumes that the parameters are in lisp-style (:key value), then alternatively tries to read python-style key-value pairs (key=value) and falls back to reading the entire remaining line as a single parameter attribute.
    • Add support for dynamic blocks.
  • Docx writer: properly handle nested comment spans (#8189, #6959, mourino).

  • RST writer: Don’t use simple tables with RowSpans (#11214, Tuong Nguyen Manh).

  • Typst writer: Escape open paren after non-space (#11210). This fixes an issue that occurs if an open paren comes right after e.g. #strong[test].

  • Typst template: ensure that title block is properly centered (#11221).

  • LaTeX writer/template: small fix for unnumbered tables for compatibility with older LaTeX installations (#11201). Thanks to @priiduonu for the solution.

  • MANUAL.txt: Fixed missing backtick (#11209, FoxChillz).

  • Correct anchor references to pandoc.text module documentation (#11111, Emmanuel Ferdman).

  • Fixed golden test regeneration in Docx reader test.

  • Allow unicode-data 0.8.

  • Use citeproc 0.11. This fixes a significant performance regression in pandoc 3.8, which was due to a rewrite of the default chicago-author-date.csl file. Performance with --citeproc is now on par with what we had in pandoc 3.7, even with the revised Chicago styles.

pandoc 3.8.2

05 Oct 21:26
@jgm jgm

I'm pleased to announce the release of pandoc 3.8.2,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/3.8.2

Source & API documentation:
http://hackage.haskell.org/package/pandoc-3.8.2

This release fixes a regression in the typst template (since 3.8), which caused links to be omitted.

It also adds a new default Markdown extension, `table_attributes`, which allows attributes to be added to tables by putting them after the caption.
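
As a rough illustration of the idea (not pandoc's actual parser), the following Python sketch splits a caption string into the caption text and a trailing attribute block. The function name `split_caption`, the sample caption in the usage note, and the simplified attribute handling (no quoted values containing spaces) are assumptions made for the example.

```python
import re

# A brace-delimited attribute block at the very end of the caption.
ATTR_BLOCK = re.compile(r"\s*\{([^{}]*)\}\s*$")


def split_caption(caption):
    """Return (caption_text, identifier, classes, key_values)."""
    match = ATTR_BLOCK.search(caption)
    if match is None:
        return caption.strip(), "", [], {}
    identifier, classes, key_values = "", [], {}
    for token in match.group(1).split():
        if token.startswith("#"):
            identifier = token[1:]          # '#id'
        elif token.startswith("."):
            classes.append(token[1:])       # '.class'
        elif "=" in token:
            key, _, value = token.partition("=")
            key_values[key] = value.strip('"')  # 'key=value'
    return caption[:match.start()].strip(), identifier, classes, key_values
```

For a caption such as `Quarterly results {#tbl:q3 .striped width=80%}`, the sketch returns the text `Quarterly results`, the identifier `tbl:q3`, the class `striped`, and the pair `width` / `80%`.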

API change: Add `Ext_table_attributes` constructor to Extension.

Thanks to all who contributed.

Changelog:
  • Markdown reader/writer: implement new table_attributes extension (#10884). When table_attributes is enabled (as it is by default for pandoc’s Markdown), attributes can be attached to a table by including them at the end of the caption. Previously the writer would emit an identifier in this position, but the reader didn’t handle it. Now arbitrary attributes are allowed, and they work in both the reader and writer.

  • Typst writer: don’t add superfluous semicolons (#11196). Previously we added semicolons after inline commands not followed by spaces, but mainly this was to deal with one issue: the presence of a semicolon after an inline command, which would be swallowed as a command separator (#9252). This commit adopts an approach that should avoid so many superfluous semicolons: it escapes semicolons that might come right after a command.

  • Typst template: fix 3.8 regression in which links disappear (#11194). A template change in 3.8 added a show rule for links which causes them to disappear except in special cases.

  • Text.Pandoc.Parsing: rewrite oneOfStrings more efficiently.

  • LaTeX writer: Fix strikeout in links (#11192, Tuong Nguyen Manh). As in #1294, \url and \href need to be protected inside an mbox for soul commands.

  • Text.Pandoc.Extensions: Add Ext_table_attributes constructor for Extension [API change].

  • Use released texmath 0.13.0.1.

  • Update FSF contact information in COPYING (#11183, Bensun Muite).

  • MANUAL.txt: remove some redundancy (#11178, Reuben Thomas).

pandoc 3.8.1

29 Sep 15:02
@jgm jgm

I'm pleased to announce the release of pandoc 3.8.1,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/3.8.1

Source & API documentation:
http://hackage.haskell.org/package/pandoc-3.8.1

This release introduces a few new features and fixes some bugs
and regressions. Of special note:

- New output format vimdoc (for Vim documentation).
- --syntax-highlighting now works as expected for typst output.
- New shorthands variable allows activating babel shorthands
  in LaTeX/PDF output.
- Removed coloring for links in epub.css, so that the reader's
  defaults will be used.

API changes:

- New module Text.Pandoc.Writers.Vimdoc, exporting writeVimdoc.
- Text.Pandoc.Parsing: new functions tableWithSpans,
  tableWithSpans', toTableComponentsWithSpans, and
  toTableComponentsWithSpans'.
- Text.Pandoc.Writers.Shared: new function removeLinks.
- Text.Pandoc.Highlighting: export functions formatTypstBlock,
  formatTypstInline, styleToTypst (from skylighting).
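
For removeLinks, which the changelog describes as converting links to spans, an analogous operation on pandoc's JSON AST can be sketched in Python. This is only an illustration of the transformation, not the Haskell function itself; the name `remove_links` and the sample data in the usage note are made up for the example.

```python
def remove_links(node):
    """Recursively replace Link elements with Span elements.

    In pandoc's JSON AST a Link is {"t": "Link", "c": [attr, inlines, target]}
    and a Span is {"t": "Span", "c": [attr, inlines]}; the target is dropped.
    """
    if isinstance(node, list):
        return [remove_links(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Link":
            attr, inlines, _target = node["c"]
            return {"t": "Span", "c": [attr, remove_links(inlines)]}
        return {key: remove_links(value) for key, value in node.items()}
    return node  # strings, numbers, None
```

Applied to a Para containing a Link, the result keeps the link text and attributes inside a Span and discards the URL, which is the kind of rewrite needed to avoid nested links.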

Thanks to all who contributed, especially Tuong Nguyen Manh (who
has been working hard to improve table support) and new contributors
Raymond Berger and reptee.

Changelog:
  • New output format vimdoc (Vim documentation format) (#11132, reptee).

    • [API change] Added module Text.Pandoc.Writers.Vimdoc, exporting writeVimdoc.
  • Markdown reader:

    • Improve superscript/subscript/inline note parsing (#8652). We do not allow inline notes to be followed by ( or [. Otherwise, we parse inline notes before superscripts. Also, the sub/superscript parsers have been adjusted so that they really exclude unescaped spaces (as they did not before, when the spaces occurred in nested inlines).
    • Fix simple table alignment (#11136, Tuong Nguyen Manh). Take wide characters into account when determining the alignment.
  • LaTeX reader:

    • Ignore \pandocbounded (#11140).
  • XML reader:

    • Parse <MetaString> (#11137, massifrg).
  • Typst reader:

    • Add support for reading typst pagebreak (#11101, Raymond Berger). The pagebreak is parsed as a HorizontalRule inside a wrapper Div with class page-break.
  • Docx reader:

    • Handle figures in indented paragraphs (#11028).
    • Change default for textwidth. This should only be used if sectPr is not found.
    • Properly calculate table column widths (#9837, #11147). Previously we assumed that every table took up the full text width. Now we read the text width from the document’s sectPr.
    • Use Tasty.Golden for Docx reader tests. This way we can update them with --accept.
  • RST reader:

    • Fix regression in simple table parsing (#11150).
    • SkippedContent warning if table directive contains non-tabular content.
    • Simple tables: leading space in a cell should not cause the contents to be parsed as a block quote (#11146).
    • Parse :alt: on figure (#11140). Also give a better default if alt is not specified, using the stringified caption rather than the filename.
    • Support col spans for simple tables (Tuong Nguyen Manh).
  • Markdown writer:

    • Improve handling of implicit figures (#11140). Allow implicit figures when alt text differs from caption (in this case, we use an image attribute to add the alt).
    • Use approximate pipe tables when it’s the only option (#11128). If we have a table with row/colspans that can be rendered as an approximate pipe table (without row/colspans), and no other table format is enabled that could render the table, we fall back to an “approximate” pipe table, with no row/colspans.
  • RST writer:

    • Ensure blank line before directives (#11162).
    • Add col spans for simple tables (#10127, Tuong Nguyen Manh).
  • OpenDocument writer:

    • Add missing table elements (#10002, Tuong Nguyen Manh). Add missing header rows after the first one, footer rows as well as TableBody header rows.
  • Docx writer:

    • Fix regression (from 3.8) in highlighted code (#11156).
  • Powerpoint writer:

    • Handle single column (Tuong Nguyen Manh).
  • Typst writer:

    • Fix syntax highlighting (#11171, completes #10525). Previously the native typst highlighting was always used, regardless of the setting of --syntax-highlighting. With this change, --syntax-highlighting=none and --syntax-highlighting=<stylename> (with skylighting style) will work.
  • LaTeX writer:

    • Make beamer footnotes compatible with pauses (#5954). Previously they would appear before the content to which the note was attached, when there were pauses in a slide.
    • Avoid \_ in bibliography variable (#11152).
    • Ensure that unlabelled tables don’t increment counter (#11141).
    • Protect VERB in caption (#11139, Tuong Nguyen Manh).
    • Don’t add links to TOC (#11124, Albert Krewinkel).
    • Fix strikeouts in beamer title (#11168, Tuong Nguyen Manh).
  • LaTeX template: Add shorthands variable for LaTeX output (#11160). If true, pandoc will allow language-specific shorthands when loading babel. (This is helpful, for example, in getting proper spacing around French punctuation.)

  • epub.css: Remove coloring for a, a:visited (#11174). This was causing links in the iOS Books app not to be distinguished in any way (since underlining is not used there).

  • Text.Pandoc.Parsing:

    • [API change] (Tuong Nguyen Manh). New functions tableWithSpans, tableWithSpans', toTableComponentsWithSpans and toTableComponentsWithSpans' take a list of lists of (Blocks, RowSpan, ColSpan) to parse a Table with different RowSpan and ColSpan values accordingly. New helper functions singleRowSpans and singleColumnSpans help set all RowSpans or ColSpans to be 1 in case the table format only allows setting one or the other.
  • Text.Pandoc.Class:

    • Let fetchItem fail if the HTTP request is not successful (Albert Krewinkel). HTTP requests that don’t return a 200 status code are now treated as an error. This ensures that a warning is triggered when using --embed-resources or --extract-media.
  • Text.Pandoc.Writers.Shared:

    • Add new function removeLinks [API change] (Albert Krewinkel). The function converts links to spans. It is used, for example, to avoid nested links. The HTML writer used to put the description of nested links into small caps, but uses a simple span now.
  • Text.Pandoc.Highlighting: export typst functions [API change]. New exported functions formatTypstBlock, formatTypstInline, styleToTypst.

  • Text.Pandoc.XML:

    • Add fetchpriority to list of HTML attributes (#11176).
  • Allow unicode-data 0.7.

  • Use released djot 0.1.2.3. Fixes a bug in which indentation was swallowed in a code block inside a blockquote.
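
The simple table alignment fix in the Markdown reader entry above (taking wide characters into account) can be illustrated with a small Python sketch. It measures display columns rather than character counts and then applies the usual simple-table rule that alignment follows the position of the header text relative to the dashed line. The function names are hypothetical, only a single column is handled, zero-width combining characters are ignored, and pandoc's own implementation is in Haskell.

```python
import unicodedata


def display_width(text):
    """Count terminal columns; East Asian wide/fullwidth chars take two."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in text)


def _span(line):
    """Display columns where the non-space content starts and ends."""
    indent = line[: len(line) - len(line.lstrip())]
    return display_width(indent), display_width(line.rstrip())


def column_alignment(header_line, dashes_line):
    """Alignment of a one-column simple table header vs. its dashed line."""
    h_start, h_end = _span(header_line)
    d_start, d_end = _span(dashes_line)
    flush_left, flush_right = h_start == d_start, h_end == d_end
    if flush_left and flush_right:
        return "default"
    if flush_left:
        return "left"
    if flush_right:
        return "right"
    return "center"
```

With a header of two spaces followed by two wide characters over six dashes, the header ends at display column 6 and is treated as right-aligned; counting characters instead of columns would place its end at 4 and misread it as centered.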

pandoc 3.8

06 Sep 22:24
@jgm jgm

I'm pleased to announce the release of pandoc 3.8,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/3.8

Source & API documentation:
http://hackage.haskell.org/package/pandoc-3.8

This release comes with many small improvements and a few larger ones.
Among the more visible changes:

+ A new input/output format xml, which exactly represents a pandoc
  AST in a more easily human-readable form than JSON. The format is
  documented in doc/xml.md, and schemas can be found in tools/pandoc-xml.*.

+ A new command line option --syntax-highlighting, which takes the
  values 'none', 'default', 'idiomatic', a style name, or a path to
  a theme file.  --no-highlighting and --highlight-style are deprecated.

+ New extensions smart_quotes and special_strings for org mode.
  These allow pandoc's parsing to more closely reproduce Emacs org-mode's
  behavior.

+ The old compact_definition_lists extension has been removed.

API changes:

+ New modules Text.Pandoc.Readers.XML (exporting readXML) and
  Text.Pandoc.Writers.XML (exporting writeXML).

+ Text.Pandoc.Extensions: added constructors Ext_smart_quotes,
  Ext_special_strings; removed Ext_compact_definition_lists.

+ Text.Pandoc.App now exports versionInfo, a function that takes
  three parameters that can be filled in by pandoc-cli.

+ Text.Pandoc.Parsing: tableWith and tableWith' now return
  a list of lists of Blocks, allowing for multiple header rows.

+ Text.Pandoc.ImageSize: Add Point and Pica as constructors of
  ImageSize.  Add Avif constructor of ImageType.

+ Text.Pandoc.Class: CommonState is now opaque and does not export
  its fields.  To compensate for this, we now export several
  new functions: getRequestHeaders, setRequestHeaders, getSourceURL,
  getTrace.

Thanks to all who contributed, especially new contributors
Christopher Kenny, Erik Post, Repetitive, Reuben Thomas, Ryan Gibb,
Sean Soon, and massifrg.

Changelog:
  • Add a new input and output format xml, exactly representing a Pandoc AST and isomorphic to the existing native and json formats (massifrg). XML schemas for validation can be found in tools/pandoc-xml.*. The format is documented in doc/xml.md. Pandoc now defaults to this reader and writer when the .xml extension is used.

    Two new exported modules are added [API change]: Text.Pandoc.Readers.XML, exporting readXML, and Text.Pandoc.Writers.XML, exporting writeXML. A new unexported module Text.Pandoc.XMLFormat is also added.

  • Add a new command line option --syntax-highlighting; this takes the values none, default, idiomatic, a style name, or a path to a theme file. It replaces the --no-highlighting, --highlight-style, and --listings options, which will still work but with a deprecation warning. (Albert Krewinkel)

  • Create directory of output file if it doesn’t exist (#11040).

  • Update --version copyright dates (#10961), and use a hardcoded string “pandoc” for the program name in --version, per GNU guidelines.

  • Add smart_quotes and special_strings extensions (Albert Krewinkel). Currently these only affect org. Org mode makes a distinction between smart parsing of quotes, and smart parsing of special strings like .... The finer grained control over these features is necessary to faithfully reproduce Emacs Org mode behavior. Special strings are enabled by default, while smart quotes are disabled.

  • Remove the old compact_definition_lists extension. This was needed to preserve backwards compatibility after pandoc 1.12 was released, but at this point we can get rid of it.

  • Make -t chunkedhtml -o - output to stdout (as documented), rather than creating a directory called - (#11068).

  • RST reader: Support multiple header rows (#10338, TuongNM).

  • LaTeX reader:

    • Support soft hyphens (Albert Krewinkel).
    • Parse \minisec as unlisted level 6 headings (#10635, Albert Krewinkel).
    • Support \ifmmode (#10915).
    • Change handling of math environments (#9711, #9296). Certain environments in LaTeX will trigger math mode and can’t occur within math mode: e.g., align or equation. Previously we “downshifted” these, parsing an align environment as a Math element with aligned, and an equation environment as a regular display math element. With this shift, we put these in Math inlines but retain the original environments. texmath and MathJax both handle these environments well.
  • Typst reader:

    • Fix addition of image path prefix to use posix separator.
    • Properly resolve image paths in included files (#11090).
    • Handle inline-level show rules on block content (#11017). Typst allows things like smallcaps to be applied to block-level content like headings. This produces a type mismatch in pandoc, so before processing the output of typst-hs, we transform it, pulling the block-level elements outside of the inline-level elements.
  • Org reader:

    • Improve sub- and superscript parsing (Albert Krewinkel). Sub- and superscript must be preceded by a string in Org mode. Some text preceded by space or at the start of a paragraph was previously parsed incorrectly as sub- or superscript.
    • Allow “greater block” names to contain any non-space char (#4287, Albert Krewinkel).
    • Accept quoted values as argument values (#8869, Albert Krewinkel).
    • Recognize “fast access” characters in TODO state definitions (#10990, Ryan Gibb).
    • Improve org-cite parsing: Handle global prefix and suffix properly. Use all and only the styles mentioned in oc-basic.el. Allow space after ;.
  • HTML reader:

    • Don’t drop the initial newline in a pre element (#11064).
  • DocBook reader:

    • Add rowspan support (#10981, Sean Soon).
    • Be sensitive to startingnumber attribute on ordered lists (#10912).
  • POD reader:

    • Fix named entity lookup (#11015, Evan Silberman).
  • Man reader:

    • Support header and footer reader (Sean Soon).
  • Markdown reader:

    • Don’t confuse a span after an author-in-text citation with a locator. E.g. @foo [test]{.bar}. See #9080 (comment).
    • Make definition lists behave like other lists (#10889). If the four_space_rule extension is not enabled, figure out the indentation needed for child blocks dynamically, by looking at the first nonspace content after the : marker. Previously the four-space rule was always obeyed.
    • Fix tight/loose detection for definition lists, to conform to the documentation.
  • ODT reader:

    • Support table-header-rows (Tuong Nguyen Manh).
  • Docx reader:

    • Don’t add highlighting if highlight color is “none” (#10900).
    • Handle strict OpenXML as well as transitional (#7691).
    • Fix stringToInteger (#9184). It previously converted things like 11ccc to an integer; now it requires that the whole string be parsable as an integer.
    • Improve handling of AlternateContent. This fixes handling of one representation of emojis in Word (#11113).
  • LaTeX writer:

    • Control figure placement with attribute (#10369, Sean Soon). If a latex-placement attribute is present on a figure, it will be used as the optional positioning hint in LaTeX (e.g. ht). With implicit figures, latex-placement will be added to the figure (and removed from the image) if it is present on the image.
    • Include cancel package only if there is math that contains \cancel, \bcancel, or \xcancel.
    • Add braces around comments in title-meta (#10501). This is needed to prevent PDFs from interpreting this as a sequence of titles.
    • Set pdf-trailer-id if SOURCE_DATE_EPOCH envvar is set (#6539, Albert Krewinkel). The SOURCE_DATE_EPOCH environment variable is used to trigger reproducible PDF compilation, i.e., PDFs that are identical down to the byte level for repeated runs.
    • Be more conservative about using \url (#8802). We only use it when the URL is all ASCII, since the \url macro causes problems when used with some non-ASCII characters.
    • Support soft hyphens (Albert Krewinkel).
    • Change handling of math environments (#9711, #9296). When certain math environments (e.g. align) are found in Math elements, we emit them “raw” instead of putting them in $..$.
  • Typst writer:

    • Check XID_Continue in identifiers (Tuong Nguyen Manh).
    • Add escapes to prevent inadvertent lists due to automatic wrapping (#10047). Also simplify existing code that was meant to do this.
    • Add parentheses around typst-native year-only citations (#11044).
    • Add native Typst support for nocite (#10680, Albert Krewinkel). The nocite metadata field can now be used to supply additional citations that don’t appear in the text, just as with citeproc and LaTeX’s bibtex and natbib.
    • Set lang attribute in Divs (#10965).
    • Rename numbering variable to section-numbering (Albert Krewinkel). This is the name expected by the default template.
    • Add support for custom and/or translated “Abstract” titles (Albert Krewinkel, #9724).
  • Org writer:

    • Don’t wrap link descriptions (#9000). Org doesn’t reliably display these as links if they have hard breaks.
    • Disable smart quotes by default (Albert Krewinkel).
  • Markdown writer:

    • Better handling of pandoc-generated code blocks (#10926). Omit the wrapper sourceCode divs added by pandoc around code blocks. More intelligently identify which class to use for the one class allowed in GFM code blocks. If there is a class of form language-X, use X; otherwise use the first class other than sourceCode.
    • Use fenced divs even with empty attributes (#10955, Carlos Scheidegger). Previously fenced divs were not used in this case, causing the writer to fall back to raw HTML.
    • Match indents in definition items (#10890, Albert Krewinkel). Previously, the first line of a defini...
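
The class-selection rule stated in the Markdown writer entry above (prefer a class of the form language-X, otherwise the first class other than sourceCode) is easy to express as a short Python sketch. The function name `fence_class` is hypothetical, and this only mirrors the rule as worded in the changelog, not the writer's Haskell code.

```python
def fence_class(classes):
    """Pick the single class to emit on a GFM fenced code block."""
    prefix = "language-"
    # Prefer 'language-X' and use X.
    for cls in classes:
        if cls.startswith(prefix) and len(cls) > len(prefix):
            return cls[len(prefix):]
    # Otherwise use the first class that is not the pandoc-added 'sourceCode'.
    for cls in classes:
        if cls != "sourceCode":
            return cls
    return None  # no usable class: emit a bare fence
```

For example, `["sourceCode", "language-python"]` gives `python`, and `["sourceCode", "haskell", "numberLines"]` gives `haskell`.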

pandoc 3.7.0.2

29 May 07:26
@jgm jgm

I'm pleased to announce the release of pandoc 3.7.0.2,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/3.7.0.2

Source & API documentation:
http://hackage.haskell.org/package/pandoc-3.7.0.2

This release fixes some regressions in grid table rendering introduced
in 3.7. There are a few other nice improvements as well; see the
changelog for details.

Thanks to all who contributed, especially new contributor GHyman83.

Changelog:
  • RST writer:

    • Don’t emit alignment markers in grid tables (#10857).
  • AsciiDoc writer:

    • Add support for sidebars (GHyman83).
  • LaTeX writer:

    • Include alt option in \includegraphics (#6095).
  • Markdown writer:

    • Preserve figure attributes (Nikolay Yakimov, #10867). Fixes a regression introduced by 0d2114e, which caused the Markdown writer to ignore attributes on the figure if it has class or key-value attributes set.
  • HTML writer:

    • Use the ID prefix in the ID for the footnotes section (Benjamin Esham).
  • Text.Pandoc.Writers.Shared:

    • gridTable: fix (3.7) regression with missing cell alignments (#10853).
    • gridTable: fix headings with colspans (#10855). If the heading contains a colspan, we still need to include information in the header line about the colspecs.
    • gridTable: fix headerless tables. The top line should encode colspan information.
  • Text.Pandoc.SelfContained:

    • Fix handling of empty script element (#10862). Previously in this case the closing tag was dropped.
    • Do not drop data- attributes in script tags (#10861).
  • Lua subsystem (Albert Krewinkel):

    • Add function pandoc.mediabag.make_data_uri (#10876). The function takes a MIME type and raw data from which it creates an RFC 2397 data URI.
  • tools/update-lua-module-docs: fix handling of wikilinks (Albert Krewinkel).

  • doc/lua-filters.md: add missing docs for pandoc.Caption (Albert Krewinkel).

  • Require texmath 0.12.10.3, typst 0.8.0.1
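
For readers unfamiliar with RFC 2397, the shape of the data URI mentioned under pandoc.mediabag.make_data_uri can be sketched as follows. This is a Python illustration of the general scheme, not pandoc's Lua or Haskell code; the function name only mirrors the Lua one, and always using base64 encoding is an assumption made for the example.

```python
import base64


def make_data_uri(mime_type: str, raw: bytes) -> str:
    """Build an RFC 2397 data URI with a base64-encoded payload.

    Illustrative only: mirrors the *idea* of pandoc.mediabag.make_data_uri,
    not its actual implementation.
    """
    payload = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


uri = make_data_uri("text/plain", b"Hello, pandoc!")
print(uri)  # a string of the form data:text/plain;base64,<payload>
```

Decoding the part after the first comma with base64 gives back the original bytes, which is what makes such URIs suitable for embedding resources inline.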

pandoc 3.7.0.1

17 May 20:04
@jgm jgm

I'm pleased to announce the release of pandoc 3.7.0.1,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/3.7.0.1

Source & API documentation:
http://hackage.haskell.org/package/pandoc-3.7.0.1

This release fixes some serious problems with the new grid table writer
introduced in 3.7. If you installed 3.7, I recommend you upgrade.

It also fixes tagging with -t context+tagging.

  • Text.Pandoc.Writers.Shared: Fix numerous problems with gridTable and add tests (#10848). These fixes affect the Markdown, RST, and Muse writers.

  • Fix context writer/template to produce tagged PDFs (#10846). As before, the tagging extension must be enabled. We now add the command that tells ConTeXt to start tagging.

pandoc 3.7

15 May 05:42
@jgm jgm

I'm pleased to announce the release of pandoc 3.7,
available in the usual places:

Binary packages & changelog:
https://github.com/jgm/pandoc/releases/tag/3.7

Source & API documentation:
http://hackage.haskell.org/package/pandoc-3.7

  • New command-line option --variable-json. This allows non-string
    values (such as booleans or maps) to be given to template variables
    on the command line.
  • --pdf-engine will now accept groff as a value.
  • Markdown and RST writers now allow row/colspans in grid tables.
    In addition, table column widths will expand if needed to
    contain text that can't be wrapped, avoiding the introduction
    of unwanted whitespace.
  • The four_space_rule extension now works for plain output.
  • Roff formats now use the most portable syntax possible.
  • Improved handling of inline TeX in Org-mode.
  • In Lua filters, pandoc.read can now be used in "sandboxed"
    mode, restricting file or network access, by passing in a
    list of accessible files as a fourth parameter.
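
To make the difference between string-valued and JSON-valued template variables concrete, here is a small Python sketch. It assumes the option's argument has the form KEY=JSON; the helper name parse_variable_json is hypothetical and this is not how pandoc itself parses the option.

```python
import json


def parse_variable_json(arg: str):
    """Split a 'KEY=JSON' argument and decode the JSON part.

    Hypothetical helper for illustration; the KEY=JSON form is an assumption.
    """
    key, sep, raw = arg.partition("=")
    if not sep or not key:
        raise ValueError(f"expected KEY=JSON, got {arg!r}")
    return key, json.loads(raw)


# A boolean stays a boolean instead of becoming the string "true".
flag = parse_variable_json("draft=true")

# Maps and lists keep their structure.
author = parse_variable_json('author={"name": "A. Writer", "tags": ["x", "y"]}')
```

With a plain string variable, a value such as true would reach the template as text, so a conditional could not distinguish it from any other non-empty string.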

API changes:

  • Text.Pandoc.Writers.Shared: new function delimited.
  • Text.Pandoc.Writers.Shared: new version of gridTable with
    changed parameters.
  • Text.Pandoc.Class: new exported function sandboxWithFileTree.

Thanks to all who contributed, especially new contributors
Manolis Stamatogiannakis, Mohamed Akram, and Niklas Eicker.

  • Add new command-line option --variable-json (#10341). This allows non-string values (booleans, lists, maps) to be given to template variables on the command line.

  • The --pdf-engine option can now take groff as a value.

  • Markdown writer:

    • Avoid spaces after/before open/close delimiters (#10696). E.g. instead of rendering x<em> space </em>y as x* space *y we render it as x *space* y.
    • Handle row/colspans in grid tables, and expand cells when it isn’t possible to lay them out without breaking a string of non-whitespace characters.
    • Render a figure with Para caption as implicit figure (#10755).
    • When falling back to a Div with class figure for a figure that can’t be represented any other way, include a Div with class caption containing the caption.
    • Improve use of implicit figures when possible (#10758). When the alt differs from the caption, but only as regards formatting, we still use an implicit figure.
    • Omit initial newlines in gfm math blocks to avoid an ugly blank line.
    • Support the four_space_rule extension for plain output (#10813, Manolis Stamatogiannakis).
  • RST writer:

    • Handle row/colspans in grid tables, and expand cells when it isn’t possible to lay them out without breaking a string of non-whitespace characters.
  • Muse writer:

    • Handle row/colspans in grid tables, and expand cells when it isn’t possible to lay them out without breaking a string of non-whitespace characters.
  • JATS writer:

    • Fix escaping for writing-review-editing role (#10744).
  • HTML writer:

    • Remove trailing slash from default revealjs URL (#8749). This avoids a double slash in the URL’s path component.
  • LaTeX writer:

    • Make alignment work within multirow in tables (#10772).
  • Typst writer:

    • Support mark class on spans (#10747).
    • Add equation label if math contains \label{..} (#10805).
  • Roff format writers (man, ms):

    • Use the most compatible form for roff escapes (#10716). For example, \(xy instead of \[xy]. This was the original AT&T troff form and is the most widely supported. The bracketed form causes problems for some tools, e.g. makewhatis on macOS. We also emit e followed by an escape for a Unicode combining accent rather than the form \[e aa], which works for groff but not e.g. on macOS’s man. This change affects Text.Pandoc.RoffChar, Text.Pandoc.Writers.Roff, and the Man and Ms writers.
  • Docx writer:

    • Ensure that figures and tables with custom styles are not dropped (#10705).
    • Preserve Relationships for images from reference docx (#10759). This should allow one to include an image in a reference.docx and reference it in an openxml template.
    • Don’t renumber rels (#10769). We used to renumber the Relationships so they didn’t conflict with the set of fixed Relationships we imposed. We are now preserving the ids from the reference doc’s document.xml.rels, so we shouldn’t renumber them; otherwise references introduced by the user (e.g. in a template) will fail.
  • Ms writer:

    • Improve PDF TOC labels. We now use the plain writer to render these, so that Greek characters etc. will show up properly.
    • When no pdf-engine variable is specified, do not use the .pdfhref macros at all (#10738). This gives better results for links in formats other than PDF, since the link text would simply disappear if it exists only in a .pdfhref macro. When a PDF engine is specified, escape the argument of .pdfhref O in a way that is appropriate.
  • OpenDocument writer:

    • Fix character styles in footnotes (#10791). Character styles governing the position of the footnote reference should not be imposed on the footnote text.
  • Powerpoint writer:

    • Use reference-doc font for captions (#9896, R. N. West).
  • DocBook writer:

    • Use literallayout element for LineBlock (#10825).
  • MediaWiki reader/writer:

    • Allow definition on same line as term (#10708).
  • LaTeX reader:

    • Skip at most one argument to LaTeX tabular newline (#7512, Evan Silberman).
    • Disable ligatures inside \texttt (#10781).
    • Support more symbol commands (#10782).
  • CommonMark reader:

    • Handle GFM math irregularity with braces (#10631). In GFM, you need to use \\{ rather than \{ for a literal brace.
  • DocBook reader:

    • Improve handling of literallayout (#10825). This is now only made a CodeBlock when there is a monospaced class. Otherwise it is made a LineBlock.
  • Org reader:

    • Add AVIF to Org Reader image extensions (#10736, Christian Christiansen).
    • Don’t include newlines in inline code/verbatim (#10730). Convert newlines to spaces as we do in other formats.
    • Change handling of inline TeX (#10836). Previously inline TeX was handled in a way that was different from org’s own export, and that could lead to information loss. This was particularly noticeable for inline math environments such as equation. Previously, an equation environment starting at the beginning of a line would create a raw block, splitting up the paragraph containing it (see #10836). On the other hand, an equation environment not at the beginning of a line would be turned into regular inline elements representing the math. (This would cause the equation number to go missing and in some cases degrade the math formatting.) Now, we parse all of these as raw “latex” inlines, which will be omitted when converting to formats other than LaTeX (and other formats like pandoc’s Markdown that allow raw LaTeX).
  • Beamer template: fix regression in 3.6.4, reverting the omission of \date when the document does not have a date. By default, beamer will display a date when no \date is present in the title block, so this was an unintended behavior change. The reverted change was motivated by the desire to include a custom \date in the frontmatter via header-includes. This can be achieved more simply by setting the date variable. In markdown you can even use date in metadata and put some raw LaTeX there.

  • Ms template:

    • Use T rather than P as default font family (#10738).
    • Put PDF-specific things under a conditional. Don’t include them if pdf-engine isn’t set.
  • Upgrade reveal.js URL to v5 (#10740, Kolen Cheung). v4 is no longer available on unpkg.com.

  • Text.Pandoc.PDF: Allow groff to be used as --pdf-engine with ms (#10738). When groff is used as a PDF engine, the groff extension to ms is automatically enabled. Limitations:

    • groff currently produces larger PDFs than pdfroff.
    • With groff, a table of contents produced with --table-of-contents/--toc will always be placed at the end of the document.
    • Certain characters (e.g. Greek characters) may be dropped in the PDF outline.
  • Text.Pandoc.Writers.Shared:

    • Export delimited [API change].
    • New version of gridTable (#6344) [API change]. This handles row and colspans. It also ensures that cells won’t wrap text in places where it wouldn’t normally wrap, even if this means making the cells wider than requested by the colspec (#9001, #7641). Because the parameters are different, this is a breaking API change.
  • Text.Pandoc.App: set pdf-engine variable. If --pdf-engine is specified or if a PDF is being produced, we set the pdf-engine variable. This allows writers and templates to behave differently depending on the PDF engine.

  • Text.Pandoc.Class and Text.Pandoc.URI:

    • Fix parsing of base64 data URIs to allow URI escapes and whitespace (which will be ignored) (#10704).
    • Handle percent encoding in pBase64URI instead of unescaping later, for efficiency (#10704).
  • Text.Pandoc.Citeproc.BibTeX:

    • Recognize en as a langid in biblatex bibliographies (#10764).
  • Text.Pandoc.MIME:

    • Add mime type and extension for avif (#10704).
    • Handle apng, avif, jxl (#10704).
  • Text.Pandoc.Readers.LaTeX.Math: export inlineEnvironmentNames. Internal module, not a change to the public API.

  • reference.docx (Andrew Dunning):

    • Remove extra spaces around text placeholders.
    • Add footnote block text sample.
  • Text.Pandoc.Class.Sandbox:

    • Add sandboxWithFileTree function [API change] (Albert Krewinkel).
  • Lua subsystem (Albert Krewinkel):

    • pandoc-lua-engine: add all test files to the cabal file.
    • Allow pandoc.read to be called in “sandbox” mode for added security (#10831). Readers running in a sandbox will not be able to access the network or file system. The sandbox is enabled if the fourth parameter ...

pandoc 3.6.4

16 Mar 19:05
@jgm jgm

  • Disable citations extension in writers if --citeproc is used (#10662). Otherwise we get undesirable results, as the format’s native citation mechanism is used instead of (or in addition to) the citeproc-generated citations.

  • Markdown reader:

    • Allow line break between URL and title of link (#10621).

    • Give better position information when YAML metadata parsing fails with a YAML exception (#10231).

    • Fixed escapedChar' parser (#10672). It should not accept escaped newlines.

    • Remove some misguided list fanciness (#9865, #7778, cf. #5628). Previously we tried to handle things like commented out list items:

      - one
      <!--
      - two
      -->
      - three
      

      and also things like:

      - one `and
      - two` and
      

      But the code we added to handle these cases caused problems with other, more straightforward things, like:

      - one
      - ```
        code
        ```
      - three
      

      So we are rolling back all the fanciness, so that the markdown parser now behaves more like the commonmark parser, in which indicators of block-level structure always take priority over indicators of inline structure.

  • HTML reader:

    • Skip MathJax-introduced cruft (#10673).
    • Ignore style tags in the body (#10643).
  • LaTeX reader:

    • Better handle comments/whitespace in option lists and includes (#10659).
    • Support \newline, \linebreak.
  • Docx reader/writer:

    • Revert commit adding row heads (cbe67b9) (#10627). Word sets w:firstColumn="1" by default for tables. You have to find the Table Design tab and explicitly uncheck “First Column” to make this go away. In most cases, I don’t think writers intend to designate the first column as a row head, so this commit is going to produce unexpected results. In addition, because of the table normalization done by pandoc-types’ tableWith, any table containing a colspanned cell in the left-hand column will get broken if the first column is designated a row head. For these reasons it seems best to revert this change, which was made in response to #9495.
  • LaTeX writer and template:

    • Remove selnolig-langs (#9863). We now specify the language as a global option again, so we no longer need to specify it when invoking selnolig.
    • Use babel options shorthands=off (#6817).
    • Use * for multirow width when no colwidth specified (#10685). Otherwise the multirow will be excessively wide.
    • Protect \phantomsection (#10688, etclub).
  • Markdown writer:

    • Omit extra space after bullets (#7172). Those who want the old behavior can obtain it by using -t markdown+four_space_rule.
    • Treat Emph [Emph ils] as ils (#10642). Otherwise we get **content**, which means strong emphasis.
  • EPUB writer:

    • Use a nonbreaking space after section number in nav.xhtml. This seems to be required for iOS books app to display the space.
  • Typst writer:

    • Better heuristics for escaping potential list markers (#10650).
    • Ensure that citation-style works as well as csl (#10661).
  • Powerpoint writer:

    • Avoid extra blank lines before author when there is no subtitle (#10619).
  • JATS template:

    • Fix typo in author prefix in article.jats_publishing template (#10622, Tiago-Manzato).
  • Text.Pandoc.Parsing:

    • Smart quote parsing: ignore curly quotes (#10610). Previously we tried to match curly quotes as well as straight quotes, producing Quoted inlines. But it seems better just to assume that those who use curly quotes want them passed through verbatim. This also fixes an (unintended) bug whereby curly single left quotes would sometimes be changed to single right quotes.
  • Text.Pandoc.Shared:

    • makeSections: put some attributes on section element only. Certain role and epub:type attributes should only be on the section (and indeed, many roles give a validation error if left on the heading element).
  • Text.Pandoc.Logging:

    • Change NoTitleElement from WARNING to INFO (#10671). Users commonly complain about the warning when producing HTML documents without an explicit title. It seems that an info message is more appropriate, since pandoc’s default here (using the input’s base name) ensures compliance with the standard and many users are happy with that default. Those who want to make sure the message is seen can use --verbose.
  • Beamer template: only emit \date if set (#10687, josch).

  • Fix invalid OOXML in definition_list.docx test (#10394).

  • MANUAL.txt:

    • Correct typo: ‘date’ for doubled ‘title’ (#10654, Olivier Dossmann).
    • Add note about template variable for typst.
    • Change maxwidth default in MANUAL.txt (#10683).
    • Improve EPUB metadata documentation.
    • In Security section, alert readers to a threat relating to iframe in HTML, and add LaTeX, Typst to the list of formats that have an include (#10682).
  • doc/lua-filters.md: Add missing html_math_method ‘katex’ (R. N. West).

  • Use texmath 0.12.9.

  • Use typst 0.7. Fixes an issue with package loading, a regression in pandoc 3.6.3.
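
The Markdown writer change above that treats Emph [Emph ils] as ils is easiest to see on a toy example. The following Python sketch uses a made-up miniature AST (strings and ("Emph", children) tuples); it is not pandoc's data model or writer code, only an illustration of why naive rendering of nested emphasis goes wrong.

```python
def render_naive(node):
    """Render every Emph by wrapping its children in single asterisks."""
    if isinstance(node, str):
        return node
    _tag, children = node
    return "*" + "".join(render_naive(c) for c in children) + "*"


def render(node):
    """Like render_naive, but collapse Emph [Emph ils] to plain ils."""
    if isinstance(node, str):
        return node
    _tag, children = node
    only_child = children[0] if len(children) == 1 else None
    if isinstance(only_child, tuple) and only_child[0] == "Emph":
        # Emphasis directly inside emphasis: emit the inner content unmarked.
        return "".join(render(c) for c in only_child[1])
    return "*" + "".join(render(c) for c in children) + "*"


nested = ("Emph", [("Emph", ["content"])])
print(render_naive(nested))  # **content**  (reads as strong emphasis)
print(render(nested))        # content
```

The naive output is indistinguishable from strong emphasis when read back, which is the problem the changelog entry describes.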

pandoc 3.6.3

09 Feb 22:11
@jgm jgm

  • Track wikilinks with a class instead of a title (Evan Silberman). Previously wikilinks were distinguished by giving them the title wikilink. Now that we have link attributes, it makes more sense to give them the class wikilink. This change affects all readers and writers that support wikilinks.

  • DocBook reader:

    • Handle title inside orderedlist (#10594). Also some other elements that allow title: blockquote, calloutlist, etc.
    • Better handle informalequation (#10592, tombolano). Include id attribute.
    • Better handle formalpara, example, and sidebar (#8666, tombolano). Include identifiers and titles in each case.
  • Markdown reader:

    • Simplify and fix normal citation parsing (#10584). This fixes a bug that causes some normal citations to be parsed as bracketed regular citations.
  • ODT reader:

    • Create Figure elements for images that are figures (#10567).
    • Avoid producing spurious blockquotes in list items (#9505).
    • Fix unwanted block quotes (#10575). Previously the reader created block quotes whenever a paragraph was marked indented (even though this just affects the first line). With this change we still generate block quotes for content that has an altered left margin, but not for indented paragraphs.
  • Docx reader:

    • Do not issue warning for comments with +styles (#10571, Stephen Reindl).
  • LaTeX reader:

    • Test {,re}newcommand arguments (#4470, Evan Silberman).
  • Pod reader:

    • Consume blanks after =encoding in pod reader (#10537, Evan Silberman).
  • JATS writer:

    • Add CRediT roles to JATS (Charles Tapley Hoyt and Jez Cope, #10152). Enable annotating author roles using the Contribution Role Taxonomy (CRediT) and export this information in conformant JATS.
  • LaTeX writer/templates:

    • Improve babel support (#8283). Previously we used the .ini files for every language, but for European languages these tend to provide inferior results to the .ldf files used by classic Babel. Currently Babel documentation recommends using the classic system for European languages written in Latin and Cyrillic scripts and Vietnamese. So the LaTeX writer and template now follow this guidance.

      Main languages in the list of languages with good “classic” support are added to global documentclass options and will be automatically handled by Babel using the .ldf files.

      If the main language is not in this list, the babeloptions variable will be set to provide=*, which will cause support to be loaded from the .ini file rather than an .ldf. So, for example, setting -V babeloptions='' with a polytonic Greek document will cause the .ldf support to be used instead of the .ini.

      The default setting of this variable can be overwritten, but in most cases the default should give good results.

    • Allow csquotesoptions to be specified.

    • Fix indentation bugs in font-settings.latex.

  • Docx writer:

    • Repeat reference doc’s sectPr for each new section (#10577). Previously we were only carrying over the reference doc’s sectPr at the end of the document, so it wouldn’t affect the intermediate sections that are now added if --top-level-division is chapter or part. This could lead to bad results (e.g. page numbering starting only on the last chapter).
    • Create section divisions with --top-level-division=part (#10576).
    • Improve title style in reference.docx; base Author and Date on Title; remove condensed spacing (Andrew Dunning, #10581).
  • Typst writer:

    • Brace tables with typst:no-figure and typst:text attributes (#10563, Gordon Woodhull).
  • Ms writer:

    • Fix escaping of - (#10536). - should now be escaped in man output but not in ms output (where \- is a Unicode minus sign).
  • HTML styles: fix style of hr so it works when printed (#10535, Hendrik Erz). Previously background-color was used to style the hr, but this gets ignored when printing. This commit uses border-top instead.

  • Text.Pandoc.Shared:

    • Handle <abbr> as a span-like inline in htmlSpanLikeElements (#5793, Evan Silberman).
  • Text.Pandoc.MediaBag:

    • Prefer MIME type when determining extensions for MediaBag items (#10557, Max Heller). This should give different results for remote images that are served at URLs that do not contain misleading extensions (e.g. shields.io).
  • Text.Pandoc.Citeproc:

    • Fix moving punctuation before citation notes. This previously worked with regular citations, but not author-in-text citations. Now it works with both.
  • doc/lua-filters.md:

    • Correct luacheck URL (#10589, R. N. West).
    • Add static analysis paragraph to debugging section (#10568, R. N. West).
    • Add note about extensions handling in read and write (Albert Krewinkel).
  • doc/extras.md:

    • Add entry for pandoc-subfigs (R. N. West).
    • Update diagram Lua filter URL and description (R. N. West).
  • MANUAL.txt:

    • Add note on using typst to produce pdf/a-2b.
    • Document top-level-division functionality with Docx (#10579, Andrew Dunning).
  • Raise xml-conduit upper bound.

  • Depend on latest commonmark-pandoc, commonmark-extensions, citeproc, typst.

  • Makefile: make make binpath quiet.
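
The Text.Pandoc.MediaBag entry above, about preferring the MIME type when choosing a file extension, can be illustrated with a short Python sketch. The lookup table, the function name choose_extension, and the example URLs are all hypothetical; pandoc's own logic lives in Haskell and may differ in detail.

```python
import posixpath
from urllib.parse import urlparse

# Tiny illustrative table; a real implementation would know many more types.
MIME_EXTENSIONS = {
    "image/png": ".png",
    "image/jpeg": ".jpg",
    "image/svg+xml": ".svg",
}


def choose_extension(url, mime_type=None):
    """Prefer the extension implied by the MIME type; fall back to the URL path."""
    if mime_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        base = mime_type.split(";", 1)[0].strip().lower()
        if base in MIME_EXTENSIONS:
            return MIME_EXTENSIONS[base]
    return posixpath.splitext(urlparse(url).path)[1]


# The URL path alone suggests the meaningless extension ".0" ...
from_url = choose_extension("https://example.com/badge/v1.0")
# ... while the served MIME type identifies the resource as SVG.
from_mime = choose_extension("https://example.com/badge/v1.0",
                             "image/svg+xml; charset=utf-8")
```

Falling back to the URL path keeps ordinary cases such as a link ending in .png working even when no MIME type is available.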