Prior art

There are many, many pieces of software that inspired the design of BABLR and Paneditor, most of them projects that are labors of love in their own right. Without them to guide the way, it would not have been possible to conceive of BABLR's design.

I, Conrad, am writing this page, and the opinions expressed here are my own. I have a great deal of respect for the people who have made these things, and yet it will likely be obvious to you from the fact of my having chosen to compete with their technologies that I have critical views about how these products should be built.

Aneditor

The most direct jolt of inspiration we ever got was from Gary Bernhardt in his talk A Whole New World, a science-fiction presentation about software tools. It is well worth the watch. Gary envisions an editing environment reinvented all the way down to the protocol layer so as to clear out decades of tech debt. CSTML is meant to be that protocol: to unlock Gary's vision of what he called "Aneditor" and we call "Paneditor".

Languages

JSON

As a machine data interchange format, JSON started out in a heated competition with XML. It largely won out over XML for common uses. For one thing, you could parse a JSON document without needing a schema file, which was huge. For another, JSON dispensed with XML's overcomplicated namespaces and just let things have named relationships to each other, where the name of the relationship tells you what sort of thing to expect.

CSTML has a version of JSON embedded within it for attributes, and uses JSON-style strings and escapes in its literal tags as well.

XML/HTML/SGML

These demonstrated to me the broad and continuing need for languages that allow metadata to be mixed together with text. HTML has the most immediately accessible notion of textual markup thanks to its node.innerText construct. The idea of angle brackets for tags and metadata as tags embedded in text originated from this family of languages. XML's distinction between SAX and DOM parsing drove agAST's relationship between tags and trees. We also preserved its distinction between tag-level validity and schema-level validity.
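The relationship between a SAX-style stream of tags and a DOM-style tree can be sketched in a few lines of TypeScript. This is only an illustration under assumptions, not agAST's actual data model: the `Tag` and `TreeNode` types and the `buildTree` function are hypothetical names invented for this example. It folds a flat stream of open, text, and close tags into a nested tree using a stack, and rejects streams whose tags do not balance, which is roughly what tag-level validity means here.

```typescript
// Hypothetical types for illustration only; they are not agAST's real data model.
type Tag =
  | { kind: "open"; name: string }
  | { kind: "text"; value: string }
  | { kind: "close"; name: string };

interface TreeNode {
  name: string;
  children: Array<TreeNode | string>;
}

// Fold a flat, SAX-style stream of tags into a nested, DOM-style tree.
function buildTree(tags: Tag[]): TreeNode {
  const root: TreeNode = { name: "#root", children: [] };
  const stack: TreeNode[] = [root];

  for (const tag of tags) {
    const current = stack[stack.length - 1];
    if (tag.kind === "open") {
      const node: TreeNode = { name: tag.name, children: [] };
      current.children.push(node);
      stack.push(node);
    } else if (tag.kind === "text") {
      current.children.push(tag.value);
    } else {
      // Tag-level validity: a close tag must match the innermost open tag.
      if (stack.length === 1 || current.name !== tag.name) {
        throw new Error(`Unexpected close tag: ${tag.name}`);
      }
      stack.pop();
    }
  }

  if (stack.length !== 1) {
    throw new Error(`Unclosed tag: ${stack[stack.length - 1].name}`);
  }
  return root;
}

const tags: Tag[] = [
  { kind: "open", name: "function" },
  { kind: "open", name: "name" },
  { kind: "text", value: "rotate" },
  { kind: "close", name: "name" },
  { kind: "close", name: "function" },
];

const tree = buildTree(tags);
console.log(JSON.stringify(tree, null, 2));
```

The same stream can be consumed incrementally (tag by tag) or materialized as a tree; the stack is the only state needed to move between the two views.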

SrcML

The most direct inspiration for CSTML was SrcML, which uses XML to embed parser markup into code documents. It was looking at these example documents that most directly convinced me that while a markup language for code was a great idea, XML was not a good candidate language. For one thing, SrcML documents can't be pretty-printed because all internal spacing is presumed to be the program's internal spacing. This makes the average line of SrcML code extremely long, so that a SrcML document consists almost entirely of lines which run off the right-hand side of the screen. The lack of named relationships between nodes walls SrcML documents off from serving as a standardized format for parse trees and leaves them scrambling to simulate the behavior: <function><type><name>void</name></type> <name>rotate</name> <parameter_list>()</parameter_list></function>.

Parsers

ANTLR

With inspiringly maniacal energy, ANTLR sets the gold standard for well-supported parser generator frameworks, integrating with nearly everything it can touch. BABLR hopes to grow to match ANTLR in this regard over time. Unlike ANTLR though, BABLR is not a parser generator! BABLR does not have a codegen step, instead running exactly the parser code you wrote. This means that you don't need specialized tools to work with our parsers -- not for debugging them, not for doing static analysis. Other points of pride BABLR has over ANTLR include our ability to do streaming parsing, our ability to power scripted refactoring, and our ability to load into a browser with an incredibly light footprint. ANTLR's web playground has to make a request to a server to run a parse!

Tree-sitter

The current go-to parser system for modern IDEs, Tree-sitter is also a parser generator. Like ANTLR it supports many host languages and many target languages. Tree-sitter's system of structural queries is a primary inspiration for BABLR's system. The skill curve for writing and reading Tree-sitter parsers is quite high, though, thanks to the codegen step and because most languages with a realistic level of complexity need a lexer hand-written in C. In general Tree-sitter is BABLR's closest competitor, yet its owners seem to regard it as a mature product without significant room for further growth. This is likely because they see Tree-sitter as only a third of an IDE, with the other two thirds being the text buffer and LSP. This stands in contrast with BABLR, which proposes to offer all three thirds together: everything you need to make an IDE, the whole state layer and the whole integration surface all in one place.

Parsec

Often referred to as the state of the art in parser combinator libraries, Parsec, like BABLR, is just function calling under the hood. While Parsec's native language Haskell is still largely an academic language, Parsec's approach to parsing has exploded beyond the boundaries of the Haskell community, spawning clones and derivatives in a great variety of programming languages, including several in JS alone. My first experiments into parsing were with combinators.
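To make "just function calling under the hood" concrete, here is a minimal combinator sketch in TypeScript. It is not Parsec's API and not BABLR's either: `Parser`, `Result`, `literal`, `seq`, `either`, and `many` are names made up for this illustration. Each parser is an ordinary function, and each combinator is an ordinary function that takes parsers and returns a new one.

```typescript
// A parser is just a function from (input, position) to a result or null.
// All names here are illustrative; none come from Parsec or BABLR.
type Result<T> = { value: T; pos: number } | null;
type Parser<T> = (input: string, pos: number) => Result<T>;

// Match an exact string at the current position.
function literal(text: string): Parser<string> {
  return (input, pos) =>
    input.slice(pos, pos + text.length) === text
      ? { value: text, pos: pos + text.length }
      : null;
}

// Run two parsers one after the other, pairing their values.
function seq<A, B>(a: Parser<A>, b: Parser<B>): Parser<[A, B]> {
  return (input, pos) => {
    const ra = a(input, pos);
    if (ra === null) return null;
    const rb = b(input, ra.pos);
    if (rb === null) return null;
    return { value: [ra.value, rb.value], pos: rb.pos };
  };
}

// Try the first parser; fall back to the second if it fails.
function either<T>(a: Parser<T>, b: Parser<T>): Parser<T> {
  return (input, pos) => {
    const ra = a(input, pos);
    return ra !== null ? ra : b(input, pos);
  };
}

// Apply a parser zero or more times, collecting the values.
function many<T>(p: Parser<T>): Parser<T[]> {
  return (input, pos) => {
    const values: T[] = [];
    let r = p(input, pos);
    // Stop on failure, or if the parser matched without consuming input.
    while (r !== null && r.pos > pos) {
      values.push(r.value);
      pos = r.pos;
      r = p(input, pos);
    }
    return { value: values, pos };
  };
}

// A tiny grammar for binary literals such as "0b1011".
const bit = either(literal("0"), literal("1"));
const binary = seq(literal("0b"), many(bit));

console.log(binary("0b1011", 0));
```

Because the grammar is built from plain function values, it can be stepped through with an ordinary debugger, with no generated code in between.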

IDEs

Atom/Pulsar

The O.G. "hackable" browser-based IDE. Atom is triply notable: for spawning the Electron project which went on to power a generation of popular applications, for its syntax engine Tree-sitter, and for the members of the Atom team who went on to found Zed. It is a testament to the unique vision of the Atom project that despite having lost its corporate backing, much of its founding team, and the majority of its market share, this project is still actively developed and maintained. It shows us that users want more meaningful ownership of their tools, and how hard they will fight to keep those tools when they do feel that ownership.

VSCode

VSCode is the current market force to be reckoned with. It runs in Electron, and can also run natively in a web browser. Even though VSCode is theoretically less hackable than Atom, the sheer number of VSCode users has led the VSCode plugin ecosystem to dominance. Many of the other editors that have been in the news lately, like Cursor, Windsurf, Kiro, and Antigravity, are forks of VSCode. While this may seem a high compliment, it's also a sign that the market has hit a major plateau. Even projects that aren't forks of VSCode, like Zed, feel kind of like forks of VSCode because of their reliance on Language Server Protocol. These would-be competitors are de-fanged because, having built on top of VSCode's platform, they end up inheriting its weaknesses as their own while making it easier for VSCode to take their strengths for its own. Paneditor (with BABLR as its syntax engine) can threaten the dominant position of VSCode precisely because we aren't forced to invest in their ecosystem when we invest in ours.

Zed

Founded by Atom's dev team, Zed was the rewrite that Atom always wanted to be able to do but couldn't when Microsoft bought GitHub and made the executive decision to kill a product it might otherwise have had to compete with. Unfortunately the Zed team has a bit of a track record of learning the wrong lessons from things. VSCode taught them that Electron could be an incredibly powerful and popular platform for an IDE, so they abandoned it. Tree-sitter is their single most compelling piece of technology, but it sits largely unloved. They built live collaborative editing, but then neglected to offer a hosted environment. While they like to brag about their product in gaming terms (it's 120 frames per second), the gaming analog of such a thing would be shipping an open-world multiplayer game without any kind of dedicated server, so that the only way that everyone can continue to play is if one of the players leaves their computer open with the game running on it 24/7. Was this a priority to fix? No, they hard-pivoted to AI, and now they get to deal with an endless stream of dissatisfied customers with feedback like, "need help right away how did all my tokens get used up in one session" -_-

Emacs

I've never really used it myself, but I understand that it is considered the gold standard of hackable editors. Fun fact, after a ski accident I type with 9 fingers as I don't have enough feeling or fine motor control in my left pinky. So yeah, not a big Emacs user. I understand that there's a veritable gold mine here in terms of things we can learn from though! We want to put an editor experience that feels as powerful as Emacs into the web browser.

JavaScript Tooling

Babel

Babel was the first hugely successful JavaScript tool for code transformation. It made compiler-focused tradeoffs, discarding whitespace and comments. In return it gave you direct control of the program's semantic structure: it made building up a program from its intended semantics as easy as composing a tree of syntax node objects. Babel was the most direct inspiration for BABLR, both in name and in technology. BABLR primarily seeks to expand on Babel in two ways: by making it usable to transform any language instead of just JavaScript, and by keeping every character of the original syntax on one tree node or another so that all trees can be printed with a single algorithm.
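The "single printing algorithm" idea can be sketched as follows. The node shape below is a simplified assumption made up for this example, not BABLR's actual tree format: `CstNode` and `printTree` are hypothetical names. The point is only that when every character of the source, including whitespace and comments, lives on some node, printing needs no knowledge of the language at all.

```typescript
// Hypothetical CST shape for illustration; not BABLR's actual node format.
interface CstNode {
  type: string;
  // Children are either literal source text or nested nodes, in source order.
  children: Array<string | CstNode>;
}

// One language-agnostic printer: concatenate every piece of text in order.
function printTree(node: CstNode): string {
  let out = "";
  for (const child of node.children) {
    out += typeof child === "string" ? child : printTree(child);
  }
  return out;
}

// Note the double space and the comment: both survive in the tree.
const source = "let  x = 1; // answer";

const program: CstNode = {
  type: "Program",
  children: [
    {
      type: "VariableDeclaration",
      children: [
        "let",
        "  ",
        { type: "Identifier", children: ["x"] },
        " ",
        "=",
        " ",
        { type: "NumericLiteral", children: ["1"] },
        ";",
      ],
    },
    " ",
    { type: "Comment", children: ["// answer"] },
  ],
};

// Printing the tree reproduces the original text exactly.
console.log(printTree(program) === source);
```

A tree that has discarded whitespace and comments, as a compiler-oriented AST does, cannot be printed this way; it needs a language-specific code generator that decides the formatting itself.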

ESLint

BABLR has a bit of a history of spats with ESLint. Our technologies, if successful, would completely transform their project by melting the boundaries between the four API-driven syntax modification engines: ESLint, Prettier, Babel, and Recast. ESLint's ESTree standard for syntax trees is (loosely) followed by each of these tools, yet the presence of subtle tool-by-tool differences has made it impossible to consolidate infrastructure. ESTree's biggest structural deficiency is the presence of node.parent references in its trees, which change the data structure for a parse result from being a tree into being a graph with cycles. This in turn ensures that the parse-graph data structure cannot be made immutable, which then makes it impossible to safely share trees between plugins. This particular deficiency is shared by another major piece of JS tooling infrastructure: TypeScript.
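A small sketch shows how a parent reference turns a tree into a cyclic graph. The `AstNode` shape here is a deliberately simplified assumption for illustration; real nodes in these tools carry many more fields. The example also shows why sharing is awkward: a child reused in two versions of a tree can only name one of them as its parent.

```typescript
// Hypothetical, simplified node shape; real syntax tree nodes carry more fields.
interface AstNode {
  type: string;
  children: AstNode[];
  parent?: AstNode;
}

const leaf: AstNode = { type: "Identifier", children: [] };
const root: AstNode = { type: "Program", children: [leaf] };

// Without parent links the structure is a tree, so it serializes cleanly.
const asTree = JSON.stringify(root);

// Adding a parent link creates the cycle root -> leaf -> root.
leaf.parent = root;

let cyclic = false;
try {
  JSON.stringify(root);
} catch (error) {
  // JSON.stringify throws a TypeError on circular structures.
  cyclic = error instanceof TypeError;
}

// A second version of the tree that reuses `leaf` unchanged.
const rootV2: AstNode = {
  type: "Program",
  children: [leaf, { type: "Literal", children: [] }],
};

// `leaf` is shared by both versions, yet leaf.parent can name only one of them.
const sharedButOneParent =
  rootV2.children[0] === leaf && leaf.parent !== rootV2;

console.log(asTree);
console.log(cyclic, sharedButOneParent);
```

Without the parent link, building `rootV2` is a cheap, safe operation that leaves `root` untouched. With it, either `leaf` must be copied or one of the two trees ends up with a child pointing at the wrong parent.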

Biome

Biome is the continuation of the work of the Rome project, where Rome was the original Babel author's attempt to design an ecosystem in which the calcified, arbitrary boundaries that had developed between tools were broken down. Biome and BABLR are both built using Concrete Syntax Trees as their core data structure. But these projects are quite different in terms of the practical choices made. Biome focuses on making CI servers smarter about code while BABLR focuses on making web browsers smarter about code. Biome is "batteries included" with support for the JS ecosystem out of the box, like a game console. BABLR has no language support in its core so that its core stays usage-agnostic, like an x86 CPU. Biome has blazingly fast throughput but it also comes in a honkingly large platform-specific npm package. BABLR has low throughput but is lightweight, highly responsive, and platform-agnostic. It is the combination of these similarities and differences which makes potential collaboration between these two projects a fascinating possibility (to me).

VoidZero

This newest funded devtools startup brings together several existing projects, particularly Vite, Rolldown, and Oxc. While the possibility for technical collaboration is interesting with a pool of talented contributors, the certainty that our tools will soon be able to provide a categorically better free product than their shiniest proposed paid product makes us less-than-optimistic that they'll be around long enough for a collaboration to bear fruit. Both projects are labors of love that have consumed so many resources as to force their authors to seek economic returns. But compared to our business model (eat GitHub) theirs is extremely fuzzy.

Semantic editors

Hazel

An editor -- an editor with gaps! Still seems to see itself first as a text editor, and can enter some very odd and unintuitive states because of it. Best-in-class aids to syntax comprehension help you understand exactly what syntax node is selected and what it does. Wants to become a live notebook system.

Ki

Hey, want to be friends? I'll help you put the whole thing in the browser.

Cursorless

One of the more wildly imaginative semantic editors, Cursorless lets you write code fast just by speaking commands in a well-designed syntax.

Scratch

A purely block-based programming environment. Has embedding gaps, like all block-based editors. Multilingual!! Suffers from the usual problem that the code editor and the programming language are one closely-linked thing, severely limiting the potential to invest in the editor.

Pantograph

A textual-semantic code editor with embedding gaps and no support for invalid syntax. Users directly instantiate the typed representation of the code being edited. Heading towards multi-language support. Likely the current overall leader in editor UX innovation in my view, though the product is not yet market-ready.

JetBrains MPS

Another one I've never used. Products for extending the syntax of programming languages are few and far between so I feel like I should check it out, but also it's limited to their editors and their UI just doesn't look compelling to me from screenshots.

Extra

Even this 3Blue1Brown video about Fourier Series proved helpful in guiding my thoughts about what it means to control the movement of a fixed armature. Many programmers will happily tell you that HTML is not a programming language, but if it were it would be a language used to control a robot arm with only a very few degrees of freedom in its range of motion, a language that defines document trees by tracing their outline.