Language Server Protocol (LSP)

The Problem with Domain-Specific Language Design

In the context of creating DSLs for specific business domains, Markus Voelter in his book DSL Engineering writes:

 You could argue that this whole business about DSLs is nothing new. It has long been possible to build custom languages using parser generators such as lex/yacc, ANTLR or JavaCC. And of course you would be right.



 However, I feel that language workbenches, which are tools to efficiently create, integrate and use sets of DSLs in powerful IDEs, make a qualitative difference. DSL developers, as well as the people who use the DSLs, are used to powerful, feature-rich IDEs and tools in general.

 If you want to establish the use of DSLs and you suggest that your users use vi or notepad.exe, you won’t get very far with most people. This is why I emphasize IDE development just as much as language development.

Language Server Protocol (LSP) Introduction

From Wikipedia:

The Language Server Protocol (LSP) is an open, JSON-RPC-based protocol for use between source code editors or integrated development environments (IDEs) and servers that provide “language intelligence tools”: programming language-specific features like code completion, syntax highlighting and marking of warnings and errors, as well as refactoring routines.

The goal of the protocol is to allow programming language support, for instance syntax highlighting, error messages highlighting, completion, etc. to be implemented and distributed independently of any given editor or IDE.

In the early 2020s, LSP quickly became a “norm” for language intelligence tools providers.

The Language Server Protocol allows for decoupling language services from the editor so that the services may be contained within a general-purpose language server. Any editor can inherit sophisticated support for many different languages by making use of existing language servers. Similarly, a programmer involved with the development of a new programming language can make services for that language available to existing editing tools.
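
To make the "JSON-RPC-based" part concrete, here is a minimal sketch of how one LSP message is framed on the wire: a `Content-Length` header, a blank line, then a JSON-RPC body. The helper names `encodeMessage` and `decodeMessage` and the `file:///example.dsl` URI are assumptions made for this illustration. The sketch handles a single complete message only, not streaming or partial reads.

```typescript
// Shape of a JSON-RPC 2.0 request as used by LSP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Prefix the JSON body with a Content-Length header.
// The length counts UTF-8 bytes, not characters.
function encodeMessage(message: JsonRpcRequest): Uint8Array {
  const body = encoder.encode(JSON.stringify(message));
  const header = encoder.encode(`Content-Length: ${body.length}\r\n\r\n`);
  const framed = new Uint8Array(header.length + body.length);
  framed.set(header, 0);
  framed.set(body, header.length);
  return framed;
}

// Decode exactly one complete framed message.
function decodeMessage(framed: Uint8Array): JsonRpcRequest {
  // The header part is ASCII, so character offsets equal byte offsets there.
  const text = decoder.decode(framed);
  const headerEnd = text.indexOf("\r\n\r\n");
  if (headerEnd < 0) throw new Error("missing header terminator");
  const match = /Content-Length: (\d+)/i.exec(text.slice(0, headerEnd));
  if (!match) throw new Error("missing Content-Length header");
  const length = Number(match[1]);
  const bodyStart = headerEnd + 4;
  const body = framed.subarray(bodyStart, bodyStart + length);
  return JSON.parse(decoder.decode(body)) as JsonRpcRequest;
}

// A completion request as an editor might send it (hypothetical document URI).
const request: JsonRpcRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "textDocument/completion",
  params: {
    textDocument: { uri: "file:///example.dsl" },
    position: { line: 0, character: 3 },
  },
};

const framed = encodeMessage(request);
const roundTripped = decodeMessage(framed);
```

Because the framing is plain text plus JSON, any language that can read and write a byte stream can take part on either side of the protocol.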

How it works

For instance, in VS Code, a language server has two parts:

  • Language Client: A normal VS Code extension written in JavaScript / TypeScript. This extension has access to the entire VS Code Namespace API.
  • Language Server: A language analysis tool running in a separate process.

Running the Language Server in a separate process has several benefits:

  1. The language analysis is implemented once and reused by every editor that speaks LSP, instead of writing one language extension per IDE.
  2. The analysis tool can be implemented in any language, as long as it can communicate with the Language Client following the Language Server Protocol.
  3. As language analysis tools are often heavy on CPU and memory usage, running them in a separate process keeps that cost out of the editor's own process.
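
The server side of this split can be sketched as a function that routes incoming JSON-RPC requests by method name. The method names `initialize` and `textDocument/completion`, the `completionProvider` capability, and the error code `-32601` come from LSP and JSON-RPC. The `KEYWORDS` list, the `handlers` table, and `handleRequest` are hypothetical names for an imaginary DSL. A real server would normally build on an existing LSP library rather than dispatch by hand.

```typescript
interface RequestMessage {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface ResponseMessage {
  jsonrpc: "2.0";
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
}

type Handler = (params: unknown) => unknown;

// Hypothetical keyword list for an imaginary DSL.
const KEYWORDS = ["entity", "enum", "extends"];

const handlers: Record<string, Handler> = {
  // Announce which features this server supports.
  initialize: () => ({ capabilities: { completionProvider: {} } }),
  // Offer every keyword as a completion item. A real server would
  // inspect the document and cursor position passed in params.
  "textDocument/completion": () => KEYWORDS.map((label) => ({ label })),
};

function handleRequest(request: RequestMessage): ResponseMessage {
  const handler = handlers[request.method];
  if (!handler) {
    // -32601 is the JSON-RPC "Method not found" error code.
    return {
      jsonrpc: "2.0",
      id: request.id,
      error: { code: -32601, message: `Method not found: ${request.method}` },
    };
  }
  return { jsonrpc: "2.0", id: request.id, result: handler(request.params) };
}

const initResponse = handleRequest({ jsonrpc: "2.0", id: 1, method: "initialize" });
const completionResponse = handleRequest({
  jsonrpc: "2.0",
  id: 2,
  method: "textDocument/completion",
});
```

Nothing in `handleRequest` refers to an editor API, which is why the same server process can serve any client that sends these messages.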

VS Code Language Server Extension Guide

Here is an illustration of VS Code running two Language Server extensions:

language-server-extension-guide/lsp-illustration.png

The HTML Language Client and PHP Language Client on the left are normal VS Code extensions written in TypeScript.

Each of them instantiates a corresponding Language Server and communicates with it through LSP.

Although the PHP Language Server on the right is written in PHP, it can still communicate with the PHP Language Client through LSP.

Tools for implementing LSP

  • Langium is an open source language engineering tool with first-class support for the Language Server Protocol, written in TypeScript and running in Node.js.

Other DSL-Oriented LSP Tools with an IDE Focus

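Tool                 | Language   | IDEs Supported         | Strength             | Maturity
---------------------|------------|------------------------|----------------------|---------
JetBrains MPS        | Java       | JetBrains IDEs         | Projectional editing | High
Xtext                | Java       | Eclipse, VS Code (LSP) | Rich tooling         | High
Langium              | TypeScript | VS Code, Theia         | Modern LSP-first     | Medium
Rascal               | Java       | VS Code (partial)      | Analysis + Transform | Medium
Spoofax              | Java       | Eclipse, LSP           | Formal tools         | Medium
Custom (ANTLR + LSP) | Any        | Any (via LSP)          | Full control         | High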

Language Workbenches

A language workbench is a meta-tool or development environment specifically designed for creating, implementing, and maintaining domain-specific languages (DSLs) and programming languages.

A language workbench differs from a parser generator in that it usually provides:

  • Language definition capabilities: Tools to specify syntax, semantics, and type systems
  • Editor generation: Automatic creation of language-aware editors
  • Tool integration: Built-in support for debugging, refactoring, and other IDE features
  • Code generation: Mechanisms to translate from your DSL to target languages/platforms

Instead of hand-coding parsers and compilers, you declaratively specify:

  • Grammar and syntax rules
  • Type systems and constraints
  • Semantic behaviors
  • Editor appearance and interactions

And the workbench automatically generates:

  • Syntax highlighting
  • Code completion
  • Error checking
  • Refactoring tools
  • Debuggers
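
The "specify declaratively, get tooling for free" idea can be illustrated with a toy sketch in which one small language definition drives both code completion and error checking. Everything here (`LanguageSpec`, `complete`, `validate`, and the two-keyword language itself) is invented for the illustration. Real workbenches use far richer grammar and type-system definitions.

```typescript
// The only thing the language author writes by hand: a declarative spec.
// Each statement of this toy language has the form: <keyword> <name>
interface LanguageSpec {
  keywords: string[];
  identifier: RegExp;
}

const spec: LanguageSpec = {
  keywords: ["entity", "enum"],
  identifier: /^[A-Za-z_]\w*$/,
};

interface Diagnostic {
  line: number; // zero-based line index
  message: string;
}

// Code completion derived from the spec: keywords matching the typed prefix.
function complete(language: LanguageSpec, prefix: string): string[] {
  return language.keywords.filter((keyword) => keyword.startsWith(prefix));
}

// Error checking derived from the same spec.
function validate(language: LanguageSpec, source: string): Diagnostic[] {
  const diagnostics: Diagnostic[] = [];
  source.split("\n").forEach((text, line) => {
    const trimmed = text.trim();
    if (trimmed === "") return; // blank lines are allowed
    const parts = trimmed.split(/\s+/);
    const keyword = parts[0] ?? "";
    const name = parts[1] ?? "";
    if (!language.keywords.includes(keyword)) {
      diagnostics.push({ line, message: `Unknown keyword "${keyword}"` });
    } else if (!language.identifier.test(name)) {
      diagnostics.push({ line, message: `Expected a name after "${keyword}"` });
    }
  });
  return diagnostics;
}

const suggestions = complete(spec, "en");
const problems = validate(spec, "entity Customer\nclass Foo\nenum 9x");
```

Adding a keyword to `spec` changes completion and validation together, which is the kind of consistency a workbench provides at much larger scale.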

Language workbenches can be projectional or textual:

  • Textual: Traditional text-based language editing (like Xtext)
  • Projectional: Direct manipulation of abstract syntax trees, visual representation (like JetBrains MPS)
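
The difference between the two styles can be sketched on a tiny expression language. On the textual route the user edits characters and a parser recovers the tree. On the projectional route an edit is a tree operation and the text is only a rendering. The names `Expr`, `parse`, `render`, and `replaceRight` are assumptions made for this example and do not reflect how Xtext or MPS are implemented.

```typescript
// A tiny AST for sums such as "1 + 2".
type Expr =
  | { kind: "num"; value: number }
  | { kind: "add"; left: Expr; right: Expr };

// Textual route: parse characters into a tree. Invalid text is possible.
function parse(text: string): Expr {
  const nums = text.split("+").map((part) => {
    const trimmed = part.trim();
    const value = Number(trimmed);
    if (trimmed === "" || Number.isNaN(value)) {
      throw new Error(`Not a number: "${trimmed}"`);
    }
    return { kind: "num", value } as Expr;
  });
  // Fold the operands into a left-nested chain of additions.
  return nums.reduce((left, right) => ({ kind: "add", left, right }));
}

// Projection: render the tree for display.
function render(expr: Expr): string {
  return expr.kind === "num"
    ? String(expr.value)
    : `${render(expr.left)} + ${render(expr.right)}`;
}

// Projectional route: the edit operates on the tree itself, so no parser
// is involved and the result is always a well-formed tree.
function replaceRight(expr: Expr, replacement: Expr): Expr {
  return expr.kind === "add" ? { ...expr, right: replacement } : replacement;
}

const fromText = parse("1 + 2");
const edited = replaceRight(fromText, { kind: "num", value: 40 });
const shown = render(edited); // "1 + 40"
```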

Examples of language workbenches are:

  • JetBrains MPS: Projectional editing, Java-based
  • Eclipse Xtext: Textual DSLs, grammar-based
  • Spoofax: Research-oriented, multiple paradigms
  • Rascal: Meta-programming language for language engineering
  • Langium: Textual, TypeScript-based, VS Code integration

Among the benefits are:

  • Rapid prototyping of new languages
  • Consistent tooling across different DSLs
  • Reduced development time compared to building languages from scratch
  • Domain expert accessibility: non-programmers can work with well-designed DSLs

Langium Tutorial