Why LSP?

 2 years ago
source link: https://matklad.github.io//2022/04/25/why-lsp.html

Alternative Theory

I would say that the reason for such poor IDE support in the days of yore is different. Rather than M * N being too big, it was too small, because N was zero and M just slightly more than that.

I’d start with N, the number of language servers, since this is the side I am relatively familiar with. Before LSP, there simply weren’t a lot of working language-server-shaped things. The main reason for that is that building a language server is hard.

The essential complexity for a server is pretty high. It is known that compilers are complicated, and a language server is a compiler and then some.

First, like a compiler, a language server needs to fully understand the language; it needs to be able to distinguish between valid and invalid programs. However, while for invalid programs a batch compiler is allowed to emit an error message and exit promptly, a language server must analyze any invalid program as best as it can. Working with incomplete and invalid programs is the first complication of a language server in comparison to a compiler.
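As a minimal sketch of what "analyze an invalid program as best as it can" might look like, consider a hypothetical toy language whose programs are sequences of `let <name> = <number>;` statements. The grammar, the `parseResilient` and `definedNames` functions, and the node shapes are all assumptions made for illustration; a real language server would use a far more elaborate recovery strategy.

```typescript
// Hypothetical toy language: a program is a sequence of
// `let <name> = <number>;` statements.
interface LetStmt { kind: "let"; name: string; value: number }
interface ErrorNode { kind: "error"; text: string; message: string }
type Stmt = LetStmt | ErrorNode;

// A batch compiler could stop at the first malformed statement.
// This parser instead records an error node for it and resumes
// at the next `;`, so the rest of the file is still analyzed.
function parseResilient(source: string): Stmt[] {
    const stmts: Stmt[] = [];
    for (const raw of source.split(";")) {
        const text = raw.trim();
        if (text === "") continue; // nothing between two separators
        const m = /^let\s+([A-Za-z_]\w*)\s*=\s*(\d+)$/.exec(text);
        if (m) {
            stmts.push({ kind: "let", name: m[1]!, value: Number(m[2]!) });
        } else {
            stmts.push({
                kind: "error",
                text,
                message: "expected `let <name> = <number>`",
            });
        }
    }
    return stmts;
}

// Names that remain usable (say, for completion) despite syntax errors.
function definedNames(stmts: Stmt[]): string[] {
    return stmts
        .filter((s): s is LetStmt => s.kind === "let")
        .map((s) => s.name);
}
```

For a source like `let a = 1; let = ; let b = 2;`, the broken middle statement becomes an error node, while `a` and `b` are still reported by `definedNames`.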

Second, while a batch compiler is a pure function which transforms source text into machine code, a language server has to work with a code base which is constantly being modified by the user. It is a compiler with a time dimension, and evolution of state over time is one of the hardest problems in programming.
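To make the "time dimension" concrete, here is a small sketch of the kind of state a server has to carry that a batch compiler does not: the current text of an open file plus a version counter that changes with every edit. The `OpenDocument` class, the `TextEdit` shape, and the version check are hypothetical names chosen for this example, not part of any particular protocol or library.

```typescript
// Hypothetical edit shape: replace the half-open range [start, end)
// of the current text with `newText`.
interface TextEdit { start: number; end: number; newText: string }

// Unlike a pure `compile(source)` function, this object changes over
// time, and every query has to be answered against some version of it.
class OpenDocument {
    private text: string;
    private version = 0;

    constructor(initialText: string) {
        this.text = initialText;
    }

    // Applies an edit that was computed against `baseVersion`.
    // Returns false (and changes nothing) if the edit is stale or
    // its range does not fit the current text.
    applyEdit(baseVersion: number, edit: TextEdit): boolean {
        if (baseVersion !== this.version) return false;
        const inRange =
            edit.start >= 0 &&
            edit.start <= edit.end &&
            edit.end <= this.text.length;
        if (!inRange) return false;
        this.text =
            this.text.slice(0, edit.start) +
            edit.newText +
            this.text.slice(edit.end);
        this.version += 1;
        return true;
    }

    // An immutable view that analysis code can safely hold on to.
    snapshot(): { version: number; text: string } {
        return { version: this.version, text: this.text };
    }
}
```

Even in this toy form, the server has to decide what to do with an edit computed against an older version; here it is simply rejected, which is one possible policy among several.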

Third, a batch compiler is optimized for maximum throughput, while a language server aims to minimize latency (while not completely forgoing throughput). Adding a latency requirement doesn’t mean that you need to optimize harder. Rather, it means that you generally need to turn the architecture on its head to have an acceptable latency at all.
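One way to picture the architectural inversion is demand-driven analysis: instead of processing every file up front, the server computes results only for what a request touches and reuses them until the input changes. The sketch below is an assumption-laden illustration of that idea, not a description of any specific server; `LazyAnalysis`, its methods, and the `let`-based symbol extraction are all hypothetical.

```typescript
// Demand-driven, memoized analysis over a set of files.
class LazyAnalysis {
    private files = new Map<string, string>();
    private cache = new Map<string, string[]>();
    // Counts how many times a file was actually analyzed.
    analyzedCount = 0;

    // Storing new text only invalidates that file's cached result;
    // no analysis happens here.
    setFile(path: string, text: string): void {
        this.files.set(path, text);
        this.cache.delete(path);
    }

    // Names declared with `let` in one file, computed on first
    // request and served from the cache afterwards.
    symbols(path: string): string[] {
        const cached = this.cache.get(path);
        if (cached !== undefined) return cached;

        this.analyzedCount += 1;
        const text = this.files.get(path) ?? "";
        const names: string[] = [];
        const re = /\blet\s+([A-Za-z_]\w*)/g;
        let m: RegExpExecArray | null;
        while ((m = re.exec(text)) !== null) {
            names.push(m[1]!);
        }
        this.cache.set(path, names);
        return names;
    }
}
```

With two files loaded, asking for the symbols of one leaves the other untouched, and repeating the question costs a map lookup. A throughput-oriented batch pipeline would typically do the opposite and process everything once, in order.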

And this brings us to a related cluster of accidental complexity surrounding language servers. It is well understood how to write a batch compiler. It’s common knowledge. While not everyone has read the dragon book (I didn’t meaningfully get past the parsing chapters), everyone knows that that book contains all the answers. So most existing compilers end up looking like a typical compiler. And, when compiler authors start thinking about IDE support, the first thought is “well, IDE is kinda a compiler, and we have a compiler, so problem solved, right?”. This is quite wrong: internally, an IDE is very different from a compiler, but until very recently this wasn’t common knowledge.

Language servers are a counterexample to the “never rewrite” rule. The majority of well-regarded language servers are rewrites or alternative implementations of batch compilers.

Both IntelliJ and Eclipse wrote their own compilers rather than re-using javac inside an IDE. To provide adequate IDE support for C#, Microsoft rewrote their batch compiler, which was implemented in C++, into an interactive self-hosted one (project Roslyn). Dart, despite being a from-scratch, relatively modern language, ended up with three implementations (host AOT compiler, host IDE compiler (dart-analyzer), on-device JIT compiler). Rust tried both: incremental evolution of rustc (RLS) and a from-scratch implementation (rust-analyzer), and rust-analyzer decisively won.

The two exceptions I know are C++ and OCaml. Curiously, both require forward declarations and header files, and I don’t think this is a coincidence. See the Three Architectures for a Responsive IDE post for details.

To sum up, on the language server’s side things were in a bad equilibrium. It was totally possible to implement language servers, but that required a bit of an iconoclastic approach, and it’s hard to be a pioneering iconoclast.

I am less certain what was happening on the editor’s side. Still, I do want to claim that we had no editors capable of being an IDE.

IDE experience consists of a host of semantic features. The most notable example is, of course, completion. If one wants to implement custom completion for VS Code, one needs to implement the CompletionItemProvider interface:

interface CompletionItemProvider {
    provideCompletionItems(
        document: TextDocument,
        position: Position,
    ): CompletionItem[]
}

This means that, in VS Code, code completion (as well as dozens of other IDE-related features) is a first-class concept of the editor, with a uniform UI and developer API.
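For illustration, a provider for the simplified interface shown above might look roughly like this. The types are hypothetical stand-ins declared locally so the sketch is self-contained; in particular, `lineText` and the keyword list are assumptions, and the real `vscode` API is richer (its `provideCompletionItems` also receives a cancellation token and a completion context).

```typescript
// Simplified, hypothetical stand-ins for the editor's types.
interface Position { line: number; character: number }
interface TextDocument { lineText(line: number): string }
interface CompletionItem { label: string }
interface CompletionItemProvider {
    provideCompletionItems(
        document: TextDocument,
        position: Position,
    ): CompletionItem[]
}

// Keywords of an imaginary language, used only for this example.
const KEYWORDS = ["let", "loop", "return"];

// Offers every keyword that starts with the identifier fragment
// immediately to the left of the cursor.
class KeywordCompletionProvider implements CompletionItemProvider {
    provideCompletionItems(
        document: TextDocument,
        position: Position,
    ): CompletionItem[] {
        const beforeCursor = document
            .lineText(position.line)
            .slice(0, position.character);
        const m = /[A-Za-z_]\w*$/.exec(beforeCursor);
        const prefix = m ? m[0] : "";
        return KEYWORDS
            .filter((keyword) => keyword.startsWith(prefix))
            .map((label) => ({ label }));
    }
}
```

The provider only decides which items to offer. How the list is displayed, filtered as the user types, and inserted is left to the editor, which is what makes the extension point uniform.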

Contrast this with Emacs and Vim. They just don’t have proper completion as an editor extension point. Rather, they expose a low-level cursor and screen manipulation API, and then people implement competing completion frameworks on top of that!

And that’s just code completion! What about parameter info, inlay hints, breadcrumbs, extend selection, assists, symbol search, find usages (I’ll stop here :) )?

To sum up the above succinctly, the problem with decent IDE support was not one of N * M, but rather of an inadequate equilibrium in a two-sided market.

Language vendors were reluctant to create language servers, because it was hard, the demand was low (= no competition from other languages), and, even if one creates a language server, one would find a dozen editors absolutely unprepared to serve as a host for a smart server.

On the editor’s side, there was little incentive for adding high-level APIs needed for IDEs, because there were no potential providers for those APIs.

