Version 0.6 of aretext has been released! Aretext is a minimalist, terminal-based text editor with vim-compatible key bindings. Here’s what’s new!
Aretext now supports syntax highlighting of markdown! It looks like this:
The current implementation supports most of the CommonMark 0.30 spec, including:
- Emphasis and strong emphasis (bold and italic)
- Bulleted and numbered lists
- Code blocks
The new markdown parser has been validated against the CommonMark 0.30 test suite and extensively fuzz tested.
In previous versions, aretext would soft-wrap lines at character boundaries. This caused words to be “split” between lines, which proved distracting for prose writing. Aretext 0.6 adds a new configuration option, lineWrap: "word", that enables the Unicode line breaking algorithm. The new line wrap mode ensures that line breaks do not occur within words, providing a much smoother prose writing experience.
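To make the difference between the two wrap modes concrete, here is a minimal Go sketch. It is not aretext's implementation, and it does not implement the Unicode line breaking algorithm: it only breaks between space-separated words. The function names wrapChars and wrapWords are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// wrapChars breaks text every `width` runes, ignoring word boundaries.
// Assumes width > 0.
func wrapChars(text string, width int) []string {
	runes := []rune(text)
	var lines []string
	for len(runes) > width {
		lines = append(lines, string(runes[:width]))
		runes = runes[width:]
	}
	return append(lines, string(runes))
}

// wrapWords breaks only between space-separated words.
// A word longer than width stays unbroken on its own line.
func wrapWords(text string, width int) []string {
	var lines []string
	cur := ""
	for _, word := range strings.Fields(text) {
		switch {
		case cur == "":
			cur = word
		case len([]rune(cur))+1+len([]rune(word)) <= width:
			cur += " " + word
		default:
			lines = append(lines, cur)
			cur = word
		}
	}
	if cur != "" {
		lines = append(lines, cur)
	}
	return lines
}

func main() {
	text := "soft wrapping prose"
	fmt.Printf("%q\n", wrapChars(text, 10)) // ["soft wrapp" "ing prose"]
	fmt.Printf("%q\n", wrapWords(text, 10)) // ["soft" "wrapping" "prose"]
}
```

With a width of 10, character wrapping splits “wrapping” across two lines, while word wrapping moves the whole word to the next line.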
I spent a few weeks benchmarking and optimizing the syntax highlighting parsers for Go, Rust, C, and Python. Tweaking the implementation reduced execution time by 26% to 39%. The chart below shows the improvement on benchmarks in the aretext test suite:
(Note that each benchmark uses a different test file as input, so this chart does not show that C and Rust are faster than Go and Python!)
Additionally, a new optimization to coalesce leaf nodes in the parse tree reduced memory usage in large documents. For example, opening this 54K-line file in the Kubernetes repository required 38 MiB in aretext v0.5, but only 11 MiB in aretext v0.6. That’s a 71% reduction in memory usage!
The end result is that large documents with syntax highlighting should load noticeably faster and use less memory.
While using aretext, I noticed an odd bug where the “Find files” fuzzy search would exclude specific files… but only sometimes. At first, I thought I was imagining it, but after many hours I was able to track it down to a subtle bug mutating a pointer to an element of a slice. The bug occurred only when certain file paths were added to the trie in a certain order, which would cause a slice to grow and reallocate, leaving the pointer referring to the slice’s previous backing array. Any mutation through that stale pointer was silently lost. Lesson learned: be careful mutating pointers to slice elements!
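This class of bug is easy to reproduce in isolation. The sketch below is not the aretext trie code; the node type and both functions are hypothetical and exist only to show why the failure depends on whether append has to reallocate.

```go
package main

import "fmt"

// node is a hypothetical stand-in for an element stored in a slice.
type node struct {
	matched bool
}

// staleUpdate takes a pointer to an element, then appends to the slice.
// If the append reallocates, the pointer still refers to the old backing
// array, so the write never reaches the element in the returned slice.
func staleUpdate(nodes []node) []node {
	p := &nodes[0]
	nodes = append(nodes, node{}) // may copy the elements to a new array
	p.matched = true              // lands in the old array if a copy happened
	return nodes
}

// indexUpdate indexes the slice after the append, so the write always
// goes to the current backing array.
func indexUpdate(nodes []node) []node {
	nodes = append(nodes, node{})
	nodes[0].matched = true
	return nodes
}

func main() {
	full := make([]node, 1, 1) // len == cap, so append must reallocate
	fmt.Println(staleUpdate(full)[0].matched) // false: the update was lost

	roomy := make([]node, 1, 2) // spare capacity, so append reuses the array
	fmt.Println(staleUpdate(roomy)[0].matched) // true: same code, different outcome

	fmt.Println(indexUpdate(make([]node, 1, 1))[0].matched) // true
}
```

The same function succeeds or fails depending only on the slice's spare capacity at the time of the call, which is why such a bug can look intermittent.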
Previously, aretext used a hand-written parser to interpret user input. For example, when a user typed “2dw” in normal mode, aretext parsed “2” as a count, then matched “dw” as the command “delete word”. This worked, but made it difficult to extend the parser to support every possible valid input sequence.
In aretext 0.6, input parsing has been completely reimplemented as a virtual machine capable of recognizing any regular language. Valid command sequences are encoded as regular expressions, which are compiled to bytecode and embedded in the aretext binary. The virtual machine then processes input keys according to the bytecode program, simulating a non-deterministic finite automaton.[1]
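The general technique can be sketched in a few dozen lines of Go. This is an illustration in the style of the virtual machine approach, not aretext's actual bytecode format: the opcodes, the inst type, and the hand-assembled deleteWord program (for the simplified expression [0-9]*dw) are all assumptions made for the example.

```go
package main

import "fmt"

type opcode int

const (
	opKey   opcode = iota // consume one key equal to inst.key
	opDigit               // consume one key in '0'..'9'
	opSplit               // continue at both inst.x and inst.y
	opJmp                 // continue at inst.x
	opMatch               // the keys so far form a complete command
)

type inst struct {
	op   opcode
	key  rune
	x, y int
}

// addThread adds pc to the thread list, following jumps and splits so
// the list only holds instructions that consume a key or report a match.
func addThread(prog []inst, list []int, seen []bool, pc int) []int {
	if seen[pc] {
		return list
	}
	seen[pc] = true
	switch prog[pc].op {
	case opJmp:
		return addThread(prog, list, seen, prog[pc].x)
	case opSplit:
		list = addThread(prog, list, seen, prog[pc].x)
		return addThread(prog, list, seen, prog[pc].y)
	}
	return append(list, pc)
}

// accepts reports whether the whole key sequence matches the program.
// All live threads advance in lockstep, one input key at a time.
func accepts(prog []inst, keys string) bool {
	threads := addThread(prog, nil, make([]bool, len(prog)), 0)
	for _, k := range keys {
		var next []int
		seen := make([]bool, len(prog))
		for _, pc := range threads {
			in := prog[pc]
			if (in.op == opKey && in.key == k) ||
				(in.op == opDigit && k >= '0' && k <= '9') {
				next = addThread(prog, next, seen, pc+1)
			}
		}
		threads = next
	}
	for _, pc := range threads {
		if prog[pc].op == opMatch {
			return true
		}
	}
	return false
}

// deleteWord is hand-assembled bytecode for [0-9]*dw.
var deleteWord = []inst{
	{op: opSplit, x: 1, y: 3}, // 0: either read a digit or skip to 'd'
	{op: opDigit},             // 1
	{op: opJmp, x: 0},         // 2: loop back for more digits
	{op: opKey, key: 'd'},     // 3
	{op: opKey, key: 'w'},     // 4
	{op: opMatch},             // 5
}

func main() {
	fmt.Println(accepts(deleteWord, "2dw")) // true
	fmt.Println(accepts(deleteWord, "dw"))  // true
	fmt.Println(accepts(deleteWord, "2d"))  // false: command is incomplete
	fmt.Println(accepts(deleteWord, "2dx")) // false
}
```

Because every thread is tracked at once, the running time is proportional to the program size times the number of keys, with no backtracking. An editor would additionally need to distinguish “incomplete” from “invalid” input and extract the count, which this sketch leaves out.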
While this isn’t a user-facing change, it greatly simplifies the code (all input processing goes through the input VM) and will allow aretext to process more complex input sequences in the future.
I’ve been using aretext as my primary editor for nearly 18 months, and I’ve received positive feedback from a few users. The core design and user interface have remained stable.
For this reason, I’ve (finally!) removed the “beta” designation from the aretext README.
If you’re interested in trying out the editor, please see the installation guide.
[1] The approach is heavily inspired by Russ Cox’s blog post “Regular Expression Matching: the Virtual Machine Approach”.