Simplify and reorganize Lua filter introduction #9106

Open: wants to merge 2 commits into base: main

71 changes: 35 additions & 36 deletions doc/lua-filters.md
@@ -6,34 +6,19 @@ date: 'January 10, 2020'
title: Pandoc Lua Filters
---

# Introduction
Create custom outputs with pandoc's embedded Lua engine.

Pandoc has long supported filters, which allow the pandoc
abstract syntax tree (AST) to be manipulated between the parsing
and the writing phase. [Traditional pandoc
filters](https://pandoc.org/filters.html) accept a JSON
representation of the pandoc AST and produce an altered JSON
representation of the AST. They may be written in any
programming language, and invoked from pandoc using the
`--filter` option.

Although traditional filters are very flexible, they have a
couple of disadvantages. First, there is some overhead in
writing JSON to stdout and reading it from stdin (twice, once on
each side of the filter). Second, whether a filter will work
will depend on details of the user's environment. A filter may
require an interpreter for a certain programming language to be
available, as well as a library for manipulating the pandoc AST
in JSON form. One cannot simply provide a filter that can be
used by anyone who has a certain version of the pandoc
executable.

Starting with version 2.0, pandoc makes it possible to write
filters in Lua without any external dependencies at all. A Lua
## Introduction

With Lua filters, you can write Pandoc filters without any
external dependencies. Besides the simpler set-up, Lua filters are
generally faster and can access utility functions to manipulate

Review comment (Owner):
faster than what? Since the text mentioning JSON filters was removed, this is no longer clear.

document elements.

Since Pandoc 2.0, the pandoc executable has a built-in Lua
interpreter (version 5.4) and a Lua library for creating pandoc
filters is built into the pandoc executable. Pandoc data types
are marshaled to Lua directly, avoiding the overhead of writing
JSON to stdout and reading it from stdin.
filters. Pandoc data types are marshaled to Lua directly, avoiding
the overhead of writing JSON to stdout and reading it from stdin.

Here is an example of a Lua filter that converts strong emphasis
to small caps:
@@ -62,17 +47,31 @@ replace it with a SmallCaps element with the same content.
To run it, save it in a file, say `smallcaps.lua`, and invoke
pandoc with `--lua-filter=smallcaps.lua`.

## Why Lua filters over JSON?

[JSON filters](https://pandoc.org/filters.html) accept a JSON
representation of the pandoc AST and produce an altered JSON
representation of the AST. They may be written in any programming
language, and invoked from pandoc using the `--filter` option.

However, JSON filters have limitations:

- Writing JSON to stdout and reading it from stdin (twice, once
on each side of the filter) is inefficient.

Review comment (Owner):
this line should be indented so it lines up to the list content.

Review comment (Owner):
also, I think the parenthetical comment could be removed

- External dependencies vary between users, and universal JSON
filters are not possible.

Review comment (Owner) on lines +61 to +62:
I don't think it will be clear to readers what is meant by "universal JSON filters" or why dependency variation is important. I think the original text on this was clearer.


Here's a quick performance comparison, converting the pandoc
manual (MANUAL.txt) to HTML, with versions of the same JSON
filter written in compiled Haskell (`smallcaps`) and interpreted
Python (`smallcaps.py`):

  Command                                 Time
  --------------------------------------- -------
  `pandoc`                                1.01s
  `pandoc --filter ./smallcaps`           1.36s
  `pandoc --filter ./smallcaps.py`        1.40s
  `pandoc --lua-filter ./smallcaps.lua`   1.03s
manual (MANUAL.txt) to HTML, with versions of the same JSON filter
written in compiled Haskell (`smallcaps`) and interpreted Python
(`smallcaps.py`):

Review comment (Owner):
space at end of line


    Command                                 Time
    --------------------------------------- -------
    `pandoc`                                1.01s
    `pandoc --filter ./smallcaps`           1.36s
    `pandoc --filter ./smallcaps.py`        1.40s
    `pandoc --lua-filter ./smallcaps.lua`   1.03s

Review comment (Owner) on lines +69 to 75:
It looks like this is indented 4 spaces and is thus a code block instead of a table. Why this change?

As you can see, the Lua filter avoids the substantial overhead
associated with marshaling to and from JSON over a pipe.
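
For readers unfamiliar with the JSON round trip this diff refers to, here is a minimal sketch of a traditional JSON filter in Python. It is not the `smallcaps.py` used in the timings above, whose source is not shown on this page. It assumes the usual pandoc JSON encoding, in which each element is an object with a type tag `t` and contents `c`, and the function name `strong_to_smallcaps` is made up for illustration.

```python
import json
import sys


def strong_to_smallcaps(node):
    """Return a copy of a pandoc JSON AST with Strong elements retagged as SmallCaps.

    Assumption: elements are encoded as {"t": <type>, "c": <contents>}.
    """
    if isinstance(node, list):
        return [strong_to_smallcaps(child) for child in node]
    if isinstance(node, dict):
        rewritten = {key: strong_to_smallcaps(value) for key, value in node.items()}
        # Strong and SmallCaps both wrap a list of inlines, so only the tag changes.
        if rewritten.get("t") == "Strong":
            rewritten["t"] = "SmallCaps"
        return rewritten
    return node  # strings, numbers, booleans, None pass through unchanged


def main():
    # pandoc writes the whole document as JSON to the filter's stdin ...
    raw = sys.stdin.read()
    if not raw.strip():
        return  # nothing was piped in
    document = json.loads(raw)
    # ... and parses the filter's stdout back into an AST.
    json.dump(strong_to_smallcaps(document), sys.stdout)


if __name__ == "__main__":
    main()
```

Saved under a hypothetical name such as `strong2sc.py` and made executable, a script like this could be passed to pandoc with `--filter ./strong2sc.py`. Each run serializes the whole AST to JSON and parses it back again, and it needs a Python interpreter on the user's machine, which are the two costs the Lua filter sidesteps.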