Parinfer


Let's simplify the way we write Lisp...

Parinfer's "Indent Mode" rearranges parens when you change indentation:
Insert or delete a line without rearranging parens:
Comment a line without rearranging parens:
Basic Paredit without hotkeys thanks to simple paren inference. (wrap, splice, slurp, barf)
Preserve indentation when editing in "Paren Mode":

Parinfer is a proof-of-concept editor mode for Lisp programming languages. It simplifies the way we write Lisp by auto-adjusting parens when indentation changes and vice versa. The hope is to make basic Lisp-editing easier for newcomers and experts alike, while still allowing existing plugins like Paredit to satisfy the need for more advanced operations.

Read on for a full exploration of the motivations, rules, and effects of Parinfer.

Introduction

Lisp is often dismissed as an undesirable programming syntax due to excessive parentheses. Those who have adopted Lisp have long recognized its amazing strengths, but there is still a widely held uncertainty among newcomers about how or even why we must manage so many parens. As a result, Lisp's unique power remains invisible to most under this guise of difficulty.

Newcomers aren't satisfied with the current tools designed for editing Lisp. Expert-level editors (Emacs, Vim) and advanced hotkeys (Paredit) are powerful but steepen the learning curve. And alternative syntaxes (Lisps without parens) have faltered since they sacrifice some of Lisp's power that seasoned users aren't willing to part with.

Parinfer is a new system that tries instead to fix this problem at its source. We formally define the relationship between Parens and Indentation. With it, we create an intuitive editor mode to make paren management fun and easy without sacrificing power. We demonstrate this through interactive proof-of-concept demos here, providing capabilities for:

  • making code auto-adhere to formatting conventions
  • influencing expression-nesting with indentation
  • maintaining indentation when expressions shift
  • allowing Paredit-like features without hotkeys

NOTE: When I say "parens" (parentheses), I also mean [square] or {curly} brackets. Some Lisps (e.g. Racket, Clojure) use these extra delimiters to help visually separate certain constructs.

Uncovering Lisp

Most programming languages have several syntax rules. Lisp has one: everything is a list. The first element is a function name, and the rest are its arguments. Thus, the language is simply a collection of compile- and run-time functions, trivially extensible.

C Style expressions:
Lisp Style expressions:

But parentheses in Lisp are infamous for bunching together at the end of long expressions. This indentation convention can be jarring at first if you are used to curly braces in other languages being on their own lines:

C Style indentation (unusual in Lisp):
Lisp Style indentation:

The idea behind this convention is to make every line carry content rather than just parens. A Python-like indentation style helps readability. This achieves a sort of balance: indentation allows you to skim, while the parens allow you to inspect:

Skim by focusing on indentation
Inspect parens with your cursor when needed

Though both perspectives are visible at once, we must focus on one at a time. A LEGO analogy helps here. Imagine each list in the previous example as a LEGO block stacked over its parent. Checking the sides to see the layers below is like checking the parens at the end of a line.

Indentation implies nesting.
Tilting clarifies nesting (close-parens shown).

This is a physical analog to the way we read Lisp code. Now let's look at the space of tooling solutions that we use for writing Lisp code and specifically how Parinfer can help.

Tools for Writing Lisp

In the previous section, we saw how Parens and Indentation provide two perspectives for reading Lisp code. Unfortunately, this incurs some redundancy when writing Lisp since we must edit both for every change, ensuring that one will correctly imply the other.

There are existing tools to help with this. To show how Parinfer compares to them, we depict this space of tooling visually below.

The black gear represents the current action we are taking.

The menial and default way to edit Lisp is to manually ensure our parens are balanced after inserting or removing them. Then, we adjust the indentation to match it in kind. This back-and-forth happens for most editing tasks.

Existing tools automate some of these editing tasks. For example, Paredit forces you to transform or add parens in a balanced way through special hotkeys. And Auto-indent allows you to auto-correct indentation of selected lines when desired. This automates the tasks, but the back-and-forth actions are still manually triggered.

Parinfer is a new tool to combine and simplify this type of automation by naturally keeping Parens and Indentation in lockstep. It formally infers changes to one based on the other. The back-and-forth actions have been reduced with special modes, which we will explore next.

Writing with Parinfer

As we saw with the previous visualization, Parinfer will infer some changes to keep Parens and Indentation in line with one another. To keep this inference simple and predictable, we have the user explicitly choose the degree of freedom that they want full control of, while relinquishing some control of the other to Parinfer.

Thus, Parinfer consists of two modes:

  1. Indent Mode gives you full control of indentation, while Parinfer corrects parens.
  2. Paren Mode gives you full control of parens, while Parinfer corrects indentation.

To make this easier, Indent Mode is the default. Paren Mode can be an advanced option.

Some noteworthy unintended use-cases which we will explore later:

  • Indent Mode allows Paredit-like features without hotkeys.
  • Paren Mode can be used to fix incorrectly indented files before using with Indent Mode.

Mathematical Foundation

Feel free to skip this if you have no interest in the math that makes Parinfer possible.

The foundation of Parinfer relies on a few somewhat formalized definitions and properties:

We wish to define properties that each Parinfer mode should exhibit.

With the following definitions:

  • $I$ = (Indent Mode) function accepting code text and returning new text.
  • $P$ = (Paren Mode) function accepting code text and returning new text.
  • $R$ = (Reader) function accepting code text and returning its AST data.

The following should be true:

  • let $y = P(x)$
    $\leftarrow$ run Paren-Mode on $x$ to get $y$
    • $\Longrightarrow R(x) = R(y)$
      $\leftarrow$ Paren-Mode should never change the AST
    • $\Longrightarrow P(y) = y$
      $\leftarrow$ Paren-Mode should be idempotent
    • $\Longrightarrow I(y) = y$
      $\leftarrow$ Indent-Mode should never change the result of Paren-Mode
  • let $z = I(x)$
    $\leftarrow$ run Indent-Mode on $x$ to get $z$
    • $\mathrel{\rlap{\hskip .5em/}}\Longrightarrow R(x) = R(z)$
      $\leftarrow$ Indent-Mode may change the AST (by design)
    • $\Longrightarrow I(z) = z$
      $\leftarrow$ Indent-Mode should be idempotent
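
The properties above can be turned into executable checks. The following is a minimal sketch, not part of Parinfer: `check_mode_properties` is a hypothetical helper that takes any three callables standing in for $I$, $P$ and $R$, and the toy stand-ins at the bottom exist only so that the example runs.

```python
def check_mode_properties(x, indent_mode, paren_mode, reader):
    """Return the properties (as strings) that fail for the input text x."""
    failures = []

    y = paren_mode(x)                 # let y = P(x)
    if reader(x) != reader(y):        # Paren Mode should never change the AST
        failures.append("R(x) = R(y)")
    if paren_mode(y) != y:            # Paren Mode should be idempotent
        failures.append("P(y) = y")
    if indent_mode(y) != y:           # Indent Mode should not change P's result
        failures.append("I(y) = y")

    z = indent_mode(x)                # let z = I(x)
    if indent_mode(z) != z:           # Indent Mode should be idempotent
        failures.append("I(z) = z")
    # R(x) = R(z) is deliberately not checked: Indent Mode may change the AST.
    return failures


# Toy stand-ins (assumptions for illustration only, NOT Parinfer).
identity = lambda text: text
append_paren = lambda text: text + ")"   # deliberately not idempotent
toy_reader = lambda text: text.split()   # crude substitute for a real reader

print(check_mode_properties("(a b)", identity, identity, toy_reader))
# → []
print(check_mode_properties("(a b)", append_paren, identity, toy_reader))
# → ['I(y) = y', 'I(z) = z']
```

A real test suite would run such checks over many input files with the actual mode functions in place of the stand-ins.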

The actual operations performed by the modes rely on a formal definition of what it means for Lisp code to be "correctly formatted". We establish an invariant— something that must be true for every line of code. From that, Parinfer corrects indentation or parens simply by choosing correct values for $i_n$ or $p_n$ (defined later) to satisfy this invariant:

  • Indent Mode forces correct values of $p_n$
  • Paren Mode forces correct values of $i_n$

The following is a concise reference (not a guide) to establishing this invariant that Parinfer's modes rely on.

We wish to define necessary conditions for determining if a given file of Lisp code is "correctly formatted".

We start with a clarification that we only consider non-empty lines (i.e. lines that have at least one non-whitespace, non-comment token). Thus, all following references to line number $n$ will refer to the $n$th non-empty line.

To proceed, we define the following (negative indices count from the end of a list):

  • $t_n$ = (tokens) indexed list of non-whitespace, non-comment tokens at line $n$
  • $i_n$ = (indentation) x-position of $t_n[0]$
  • $p_n$ = (parens) number of close-parens at the end of $t_n$
  • $s_n$ = (stack) indexed list of x-positions of all open-parens unclosed before $t_n[-p_n]$

Next, we define a function which determines if a line's indentation is valid:

  • $f(i,p,s) = ( x_\text{min} \lt i \leq x_\text{max} )$
    $\leftarrow$ verifies indentation is within a threshold defined below

where:

  • $x_\text{min} = \begin{cases} s[-p-1] & \text{if } |s| > p, \\ -1 & \text{otherwise} \end{cases}$
    $\leftarrow$ x-position of open-paren of parent list if it exists
  • $x_\text{max} = \begin{cases} s[-p] & \text{if } p \gt 0, \\ \infty & \text{otherwise} \end{cases}$
    $\leftarrow$ x-position of open-paren of previous sibling if it is a list
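
As a worked example, here is a direct transcription of $f$ into Python, a sketch that assumes the stack $s$ is an ordinary list of x-positions with the innermost open-paren last. The function names mirror the symbols above and are otherwise hypothetical.

```python
import math

def x_min(p, s):
    # x-position of the parent open-paren once p trailing close-parens are applied
    return s[-p - 1] if len(s) > p else -1

def x_max(p, s):
    # x-position of the open-paren closed by the last of the p trailing close-parens
    return s[-p] if p > 0 else math.inf

def f(i, p, s):
    return x_min(p, s) < i <= x_max(p, s)

# Previous line: "(let [a 1]"  ->  open-parens at x=0 and x=5, one trailing close-paren.
s, p = [0, 5], 1
print(f(2, p, s))   # → True   (inside `let`, after the binding vector)
print(f(6, p, s))   # → False  (would land inside the already-closed vector)
print(f(0, p, s))   # → False  (would land outside the still-open `let`)
```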

Finally, we define that a file of Lisp code is "correctly formatted" if:

  1. for every line $n > 0$:
    • $t_n[0] \ne $ ")", and
      $\leftarrow$ no line may start with a close-paren
    • $f(i_n, p_{n-1}, s_{n-1})$ is true
      $\leftarrow$ indentation threshold that we defined above
  2. and for the last line $N$
    • $|s_N| = p_N$
      $\leftarrow$ all open-parens must be closed at the last line
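
The two conditions above can be checked mechanically. The sketch below is one possible reading of them, under loudly stated simplifications: strings are assumed not to span lines, `;` starts a comment, character literals are not handled, and paren types are not matched against each other. All function names are hypothetical.

```python
import math

OPEN, CLOSE = "([{", ")]}"

def significant(line):
    """List (x, kind) for every non-whitespace, non-comment character."""
    out, in_string, escaped = [], False, False
    for x, ch in enumerate(line):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            out.append((x, "atom"))
        elif ch == ";":
            break                      # rest of the line is a comment
        elif ch == '"':
            in_string = True
            out.append((x, "atom"))
        elif ch in OPEN:
            out.append((x, "open"))
        elif ch in CLOSE:
            out.append((x, "close"))
        elif not ch.isspace():
            out.append((x, "atom"))
    return out

def f(i, p, s):
    x_min = s[-p - 1] if len(s) > p else -1
    x_max = s[-p] if p > 0 else math.inf
    return x_min < i <= x_max

def is_correctly_formatted(text):
    stack = []     # x-positions of open-parens that are still unclosed
    prev = None    # (p, s) of the previous non-empty line
    for line in text.splitlines():
        chars = significant(line)
        if not chars:
            continue                   # only non-empty lines are considered
        indent, first_kind = chars[0]
        # Rule 1: no leading close-paren, and indentation within the threshold.
        if prev is not None and (first_kind == "close" or not f(indent, *prev)):
            return False
        p = 0                          # number of trailing close-parens
        while p < len(chars) and chars[-1 - p][1] == "close":
            p += 1
        for x, kind in chars[: len(chars) - p]:
            if kind == "open":
                stack.append(x)
            elif kind == "close":
                if not stack:
                    return False       # unmatched close-paren
                stack.pop()
        if p > len(stack):
            return False               # more trailing closers than open-parens
        prev = (p, list(stack))        # s is the stack before the trailing closers
        del stack[len(stack) - p:]
    # Rule 2: every open-paren is closed on the last line.
    return prev is None or len(prev[1]) == prev[0]

print(is_correctly_formatted("(defn foo [a b]\n  (+ a b))"))   # → True
print(is_correctly_formatted("(defn foo [a b]\n(+ a b))"))     # → False
```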

These rules are necessary and sufficient for determining when indentation is what we consider "unambiguous", but they are not sufficient in determining if code is "pretty". For example:

  • single-line files are okay
  • "tab" lengths are flexible (one space, two spaces, etc)
  • sibling lines are not required to have aligned indentation

Formal descriptions of the actual operations performed by the modes are pending, but informal ones follow in their respective sections below.

Indent Mode

Indent Mode gives you full control of indentation, while Parinfer corrects or inserts close-parens where appropriate. Specifically, it only touches the groups of close-parens at the end of each line. As a visual cue, we slightly dim these parens to signify their inferred nature.

Indent to influence the structure of your code:
Indent further to reach different thresholds:
Indent multiple lines to see its effect:

You can select multiple lines and adjust their indentation the standard way using the controls below. If you are familiar with Paredit, these operations are roughly equivalent to those listed.

Controls    | Description    | Paredit equivalent
Tab         | indent line(s) | slurp line(s) down
Shift + Tab | dedent line(s) | barf line(s) down

Interesting Consequences

Insert or delete a line without rearranging parens:
Comment a line without rearranging parens:

How it works

We perform the following steps to rearrange close-parens based on indentation.
We will refer to these later as rules #1, #2, #3 and #4.

  1. remove all unmatched close-parens (for housekeeping)
  2. remove all close-parens at the start and end of each line
  3. keep all close-parens inside each line
  4. for every resulting unmatched open-paren:
    • insert a matching close-paren at the end of its line or its last non-empty indented line
Input: close-parens are removed or kept.
Output: close-parens are inserted.

This is the gist of what's happening. There are more steps performed, but we will just explore their effects in the next section.
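
To make rules #1 to #4 concrete, here is a much-simplified sketch in Python. It is not Parinfer's actual implementation: it assumes the input has no strings, comments, character literals or tabs, it ignores the cursor, and the name `indent_mode` is only illustrative.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = ")]}"

def indent_mode(text):
    out = []       # processed lines
    stack = []     # (x-position, open-paren) pairs still waiting for a closer
    last = None    # index in `out` of the last non-empty line
    for raw in text.split("\n"):
        indent = len(raw) - len(raw.lstrip(" "))
        # Rule 2: drop close-parens at the start and end of the line.
        body = raw.strip(CLOSERS + " ")
        if not body:
            out.append("")
            continue
        # Rule 4: an open-paren at or right of this indentation has no more
        # indented lines below it, so it is closed on the previous non-empty line.
        while stack and stack[-1][0] >= indent:
            out[last] += PAIRS[stack.pop()[1]]
        kept = []
        for ch in body:
            if ch in PAIRS:
                stack.append((indent + len(kept), ch))
            elif ch in CLOSERS:
                if not (stack and PAIRS[stack[-1][1]] == ch):
                    continue           # Rule 1: drop unmatched close-paren
                stack.pop()            # Rule 3: keep matched close-paren inside a line
            kept.append(ch)
        out.append(" " * indent + "".join(kept))
        last = len(out) - 1
    # Rule 4 for whatever is still open at the end of the file.
    while stack:
        out[last] += PAIRS[stack.pop()[1]]
    return "\n".join(out)

print(indent_mode("(defn foo [a b\n  (+ a b"))
# → (defn foo [a b]
#     (+ a b))
print(indent_mode("(foo\nbar)"))
# → (foo)
#   bar
```

In the second call, dedenting `bar` to column 0 is what pushes it out of the list: the indentation drives where the close-paren lands.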

Paredit emerges

You should be aware that the steps in the previous section have a side effect on what you type. Interestingly, these effects translate into four of the main Paredit operations.

Cause    | Effect | Description
Insert ( | Wrap   | inserts a matching ) as far as it can, i.e. "wraps" all possible elements to the right of your cursor
Insert ) | Barf   | removes the original ) when inserted inside a matching pair, i.e. the current list "barfs" out all elements to the right of your cursor
Delete ( | Splice | removes the matching ), i.e. "splices" the current list into its parent (or simply "unwraps" it)
Delete ) | Slurp  | inserts another ) as far as it can, i.e. the current list "slurps" all elements to the right of your cursor

We illustrate these operations in the following examples.

Inserting Parens

Wrap by inserting an open-paren. It will auto-close as far as it can, due to rule #4.
Barf by inserting a close-paren before another. Notice the original is removed, due to rule #1.
Why can't I insert a close-paren in certain places?
Its corresponding open-paren must be there first. (see rule #1)

Deleting Parens

Splice by removing an open-paren. Its corresponding close-paren is removed, due to rule #1.
Slurp by deleting a close-paren inside a line. It is replaced further down, due to rule #4.
Why can't I delete a close-paren in certain places?
You cannot delete an inferred close-paren. It is replaced as soon as you delete it. (see rule #4)

Knowing When Parens Move

As a courtesy, Indent Mode will not move your parens until you are done typing in front of them. Just move your cursor away when you're done. A helpful analogy might be to think of your cursor as a paperweight that keeps your parens from blowing away.

Paren displaced when your cursor moves to another line. (displaced due to indentation)
Paren not displaced since you were given the chance to block it. (paren not at end of line)

Inserting Quotes

Parinfer cannot infer anything about quote positions like it can with parens. So it doesn't try to do anything special with them, other than abandon processing if imbalanced quotes are detected.

Quote insertion allows temporary paren imbalances until quote is closed:
WARNING: Always make sure quotes are balanced when inside comments!
If there is an unclosed quote before a comment, which itself contains imbalanced quotes, they will balance each other out and fool Parinfer into thinking it is okay for processing.
BAD: An unclosed string in a comment can cause corrupted strings.
GOOD: Balance the quotes in the comment to prevent the problem.
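
The warning above can be reproduced with a small scanner. This is only a sketch of the idea, assuming double-quoted strings with backslash escapes and `;` line comments; `ends_inside_string` is a hypothetical name and not a Parinfer function.

```python
def ends_inside_string(text):
    """True if the text ends while a string is still open (imbalanced quotes)."""
    in_string = in_comment = escaped = False
    for ch in text:
        if in_comment:
            if ch == "\n":
                in_comment = False
        elif in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == ";":
            in_comment = True
    return in_string

unclosed = '(def s "abc)\n'
print(ends_inside_string(unclosed))                    # → True: processing is abandoned
# BAD: the lone quote in the comment closes the string opened on the first line.
print(ends_inside_string(unclosed + '; say "hi\n'))    # → False: the scanner is fooled
# GOOD: balanced quotes in the comment keep the imbalance detectable.
print(ends_inside_string(unclosed + '; say "hi"\n'))   # → True
```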

Paren Mode

Paren Mode gives you full control of parens, while Parinfer corrects indentation. You can still adjust indentation, but you won't be able to indent/dedent past certain boundaries set by parens on previous lines. As a courtesy, this mode also maintains relative indentation of child elements when their parent expressions shift.

Here are some things that cannot be done in Indent Mode:

Tune indentation without worrying about crossing a paren boundary:
Avoid fracturing a multi-line expression when pushing its open-paren forward:
Indentation is maintained when parens shift, ensuring reversible operations:
Nested expressions become automatically indented (temporary imbalances allowed):

How it works

Paren Mode performs the following steps:

  1. Move close-parens at the start of a line to the end of the previous non-empty line.
  2. Clamp the indentation of a line to the following range:
    • min (exclusive): x-position of the parent open-paren (if it exists)
    • max (inclusive): x-position of the open-paren belonging to the previous non-empty line's last close-paren
  3. Preserve the indentation of child elements relative to any expression that moves.
  4. Cancel processing if there are any unmatched parens.
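
The steps above can be sketched in a few dozen lines of Python. As before, this is a simplified illustration rather than Parinfer's real code: it assumes no strings, comments, character literals or tabs, it ignores the cursor, and `paren_mode` is a hypothetical name. Each open-paren remembers how far its line was shifted (`delta`) so that child lines can follow it.

```python
import math

PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = ")]}"

def paren_mode(text):
    out = []
    stack = []         # (output x-position, open-paren, delta) still unclosed
    last = None        # index in `out` of the last non-empty line
    x_max = math.inf   # open-paren x of the previous line's last close-paren
    for raw in text.split("\n"):
        rest = raw.rstrip(" ")
        body = rest.lstrip(" ")
        if not body:
            out.append(raw)
            continue
        # Step 1: move leading close-parens to the previous non-empty line.
        while body and body[0] in CLOSERS:
            if not stack or PAIRS[stack[-1][1]] != body[0]:
                return text                    # Step 4: unmatched, cancel
            x_max = stack.pop()[0]
            out[last] += body[0]
            body = body[1:].lstrip(" ")
        if not body:
            out.append("")
            continue
        x_in = len(rest) - len(body)
        x_min, parent_delta = (stack[-1][0], stack[-1][2]) if stack else (-1, 0)
        # Step 3 (follow the parent's shift), then Step 2 (clamp to the range).
        x_out = max(x_min + 1, min(x_in + parent_delta, x_max))
        delta = x_out - x_in
        for offset, ch in enumerate(body):
            if ch in PAIRS:
                stack.append((x_out + offset, ch, delta))
                x_max = math.inf
            elif ch in CLOSERS:
                if not stack or PAIRS[stack[-1][1]] != ch:
                    return text                # Step 4: unmatched, cancel
                x_max = stack.pop()[0]
            elif ch != " ":
                x_max = math.inf               # line does not end with this closer
        out.append(" " * x_out + body)
        last = len(out) - 1
    return text if stack else "\n".join(out)   # Step 4: unclosed open-paren

print(paren_mode("(foo\n(bar\n  baz))"))
# → (foo
#    (bar
#      baz))
```

Note that `baz` stays two columns to the right of `(bar` after `(bar` is pushed right by one column, which is the relative indentation that step 3 asks for.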

Switching Modes

If there are paren imbalances in Paren Mode, the code is not processed, and you are prevented from switching to Indent Mode. This safely quarantines the imbalances that could be misinterpreted in Indent Mode. Thus, Paren Mode gives you an environment to fix them, after which the code is automatically formatted and ready for Indent Mode should you choose to switch.

Fixing existing files

We must take parens literally when opening an existing file. Incidentally, Paren Mode is perfect for this job, so we preprocess existing files with it before allowing the switch to Indent Mode.

Input: Any existing file. Must be correctly balanced, but can be incorrectly formatted.
Output: Made ready for Parinfer by correcting indentation and moving close-parens.

Notice that this process is NOT an invasive pretty-printer. Newline characters are never added or removed. It preserves as much as it can of the original code, only moving close-parens and changing indentation.

Parinfer should remain in Paren Mode if the file cannot be processed, due to the reasons stated in the previous section.

Knowing When Parens Move

As a courtesy, Paren Mode will not move your parens while your cursor is behind them. Just move your cursor away when you're done. A helpful analogy might be to think of your cursor as a paperweight that keeps your parens from blowing away.

Parens displaced only after they are balanced.
Parens not displaced if the cursor is behind them (allowing you to type).
Parens displaced if the cursor is no longer behind them.

Conclusions

Benefits

Inferring parentheses based on indentation seems to lead to simpler editing mechanics for Lisp code. It leads to a system that keeps our code formatted well. And it allows us to use Paredit-like features without hotkeys.

I think the biggest win is its potential to quell fear of managing end-of-line parens by enforcing a direct driving relationship with indentation.

And just like Paredit, it maintains paren-balanced code.

Downsides

The rules for what happens when inserting/deleting parens must be learned. Also, the case necessitating a "Paren Mode" comes at the cost of forcing the user to understand when and how to switch editing modes.

Also, the preprocessor step performed when opening files will cause more formatting-related changes in your commit history when collaborating with others not using Parinfer.

Final Thoughts

Regardless of how we choose to edit our Lisp code, there seems to always be a balancing act between maintaining the simplicity of how we interact with the editor and accepting some editor complexity to gain automation over these powerful but numerous parens.

Building the interactive examples for this page has allowed me to explore how well Parinfer can play this balancing act, but only in a demo environment. The real test will come once it becomes available to major editors. See editor plugins for progress.

Appendix

Source Code

The text formatting code is intended to be editor-agnostic. It is implemented in straightforward, imperative JavaScript, optimized for speed and designed to be easy to port to other languages. There is a test suite which is also designed to be easy to port with all test cases represented in JSON files.

The editor demos on this site are created in CodeMirror with hooks to apply our formatters and update cursor position. Source code for both the library and the site is available on GitHub:

http://github.com/shaunlebron/parinfer

Editor Plugins

Parinfer is still in early development. Several people have started integrating it into code editors at various stages of development.

Works in Progress:

Parinfer is available for some REPL environments as well:

Let me know if you're working on a plugin, or check the wiki for extra guidance. Thanks!

Acknowledgements

  • adjust-parens is an earlier idea for adjusting close-parens based on indentation in Emacs.
  • aggressive-indent-mode re-indents code after every keystroke in Emacs, to remove the back-and-forth actions mentioned in the Tools for Writing Lisp section. I believe Parinfer's Paren Mode is less aggressive in that it keeps indentation within thresholds rather than hard limits (more on this in a future section).
  • Haml, Slim, Jade — comparing these indented-templating languages to Clojure's Hiccups is what made me realize Lisp close-parens could be inferred from indentation.
  • Thanks to Anders Hovmöller and Antonin Hildebrand for humoring me on an early crazy idea and connecting me to some ideas about indentation and structural editors.
  • sweet-expressions, i-expressions, indent-clj, et al explored how parentheses can be inferred (but hidden).
  • Haskell's $ operator infers a closing paren after all expressions to the right, even after subsequent indented expressions, which is basically what Indent Mode in Parinfer does.
  • Cirru Sepal in Clojure has an interesting approach to inferring Clojure parens from indentation and other syntax sugar— $, $ [], $ {} and ,.
  • It's worth mentioning Flense and Plastic as tools with ambitious structural editing concepts for Clojure text.
  • Frame-Based Editing and Scratch remove parens (or curly braces) using blocks (like the LEGO metaphor mentioned).
  • Paredit animations inspired the editor animations here.
  • The Clojure Style Guide helped me confirm some indentation conventions when formalizing Parinfer.
  • Bryan Maass explained to me how Lisp is like LEGO: everything is the same shape, so pieces can snap together any way you see fit, while other syntaxes have differently shaped pieces which only fit together in special ways.
  • Thanks to Chris Oakman for proof-reading this page, pushing me to make an efficient implementation of Indent Mode, and starting the atom-parinfer plugin.
  • Thanks to my friends at PROS for looking at an early draft of this page.
  • Thanks to the Clojure/ClojureScript community for making me fall in love with Lisp.
  • When John Carmack adopted Lisp (Racket) for Oculus Rift scripting, it gave me a little more momentum to finish writing this!
  • I used the really cool gears.d3.js library for the animated gears here.
  • I built Parinfer as an augmentation to the CodeMirror editor, which I really enjoyed working with and studying to learn about editors.