Langium Simple Example: Creating a Basic DSL

Let’s say you want to create a simple DSL for defining greetings. Your DSL will allow users to write statements like:

greet "Alice"
greet "Bob"

See the repo ULL-ESIT-PL/learning-langium, which was used while taking the course Getting started with ‘Langium’.

Step 1: Define the Grammar

First, you define the grammar for your language using Langium’s grammar language. This grammar describes the structure of valid statements in your DSL.

grammar Greetings
 
entry Model:
    greetings+=Greeting*;
 
Greeting:
    'greet' name=STRING;
 
hidden terminal WS: /\s+/;
terminal STRING: /"[^"]*"/;

  • Model is the root rule, which contains a list of Greeting statements.
  • Greeting consists of the keyword greet followed by a string (the name).
  • STRING is a terminal rule for matching quoted strings.
  • WS is a hidden rule for whitespace, which is ignored during parsing.
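
To see what the two terminal patterns accept, they can be tried out as ordinary JavaScript regular expressions. The following is a minimal sketch in TypeScript; the helper matchesWhole is a hypothetical name introduced only for this illustration and is not part of Langium.

```typescript
// The two terminal patterns from the grammar, copied as regular expressions.
const STRING = /"[^"]*"/;
const WS = /\s+/;

// Hypothetical helper: anchor a pattern so it must match the whole text.
function matchesWhole(pattern: RegExp, text: string): boolean {
    return new RegExp(`^(?:${pattern.source})$`).test(text);
}

const stringAccepts = matchesWhole(STRING, '"Alice"'); // a quoted name
const stringRejects = matchesWhole(STRING, 'Alice');   // quotes are missing
const emptyAccepted = matchesWhole(STRING, '""');      // [^"]* allows zero characters
const wsAccepts = matchesWhole(WS, ' \n\t');           // spaces, newlines, tabs

console.log(stringAccepts, stringRejects, emptyAccepted, wsAccepts);
// prints: true false true true
```

Note that the STRING pattern has no escape mechanism: a double quote cannot appear inside a name.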

The meaning of greetings+=Greeting*;

Let us break down the line greetings+=Greeting*; in detail:

greetings

This is the name of a property in the abstract syntax tree (AST). The AST is a tree representation of the structure of your program or DSL. Each parser rule in the grammar corresponds to a type of node in the AST, and properties like greetings define the relationships between nodes.

In this case, greetings is a property of the Model rule (the root rule in the grammar). It will hold a list of Greeting nodes.

+=

The += operator is used to append elements to a list. In Langium, this means that every time the parser encounters a Greeting statement, it will add it to the greetings list in the Model node.

For example, if your DSL input is:

greet "Alice"
greet "Bob"

The greetings property in the Model node will contain a list with two Greeting nodes: one for "Alice" and one for "Bob".

Greeting

This refers to another rule in the grammar, specifically the Greeting rule. The Greeting rule defines the structure of a greeting statement in your DSL. In this example, the Greeting rule is defined as:

Greeting:
    'greet' name=STRING;

This means a Greeting consists of the keyword greet followed by a string (the name).

The * symbol

The * symbol is a cardinality operator. It means “zero or more.” In this context, it indicates that the Model can contain zero or more Greeting statements.

For example:

  • If the input is empty, greetings will be an empty list.
  • If the input contains multiple greet statements, greetings will contain all of them.

Putting It All Together

The full line greetings+=Greeting*; means:

  1. The Model node has a property called greetings.
  2. This property is a list that can hold zero or more Greeting nodes.
  3. Every time the parser encounters a Greeting statement, it will append it to the greetings list.
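
The three points above can be mimicked with a small hand-written parser. This is only a sketch of the idea in TypeScript, not the parser Langium generates: the interfaces are simplified stand-ins for the AST node shapes, and parseModel is a hypothetical function written for this illustration.

```typescript
// Simplified stand-ins for the AST node shapes (not Langium's generated types).
interface Greeting { $type: 'Greeting'; name: string; }
interface Model { $type: 'Model'; greetings: Greeting[]; }

// Hypothetical hand-written parser for the Greetings grammar.
function parseModel(input: string): Model {
    const model: Model = { $type: 'Model', greetings: [] };
    // One Greeting: skipped whitespace, the keyword 'greet', then a STRING.
    // The sticky flag 'y' makes each match start exactly where the last one ended.
    const greeting = /\s*greet\s*("[^"]*")/y;
    let pos = 0;
    let match: RegExpExecArray | null;
    // '*': repeat zero or more times, as long as another Greeting matches.
    while ((match = greeting.exec(input)) !== null) {
        pos = greeting.lastIndex;
        // '+=': append the new node to the list-valued property.
        // slice(1, -1) drops the surrounding quotes from the matched STRING.
        model.greetings.push({ $type: 'Greeting', name: match[1].slice(1, -1) });
    }
    // Anything left over besides whitespace is not valid input.
    if (input.slice(pos).trim() !== '') {
        throw new Error(`Unexpected input at offset ${pos}`);
    }
    return model;
}

const twoGreetings = parseModel('greet "Alice"\ngreet "Bob"');
const noGreetings = parseModel('');
console.log(twoGreetings.greetings.length, noGreetings.greetings.length);
// prints: 2 0
```

The while loop plays the role of the * operator, and the push call plays the role of +=. An empty input simply leaves greetings as an empty list.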

Example AST Representation

If your DSL input is:

greet "Alice"
greet "Bob"

The resulting AST might look like this:

{
    "$type": "Model",
    "greetings": [
        {
            "$type": "Greeting",
            "name": "Alice"
        },
        {
            "$type": "Greeting",
            "name": "Bob"
        }
    ]
}

  • The Model node has a greetings property.
  • The greetings property is a list containing two Greeting nodes.
  • Each Greeting node has a name property with the value "Alice" or "Bob".
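
Code that consumes such an AST usually just walks the greetings list. Below is a minimal TypeScript sketch of that idea. The interfaces are simplified stand-ins that mirror the JSON above (real Langium nodes carry additional bookkeeping properties that are omitted here), and greetingNames is a hypothetical helper name.

```typescript
// Simplified stand-ins mirroring the JSON shape shown above.
interface Greeting { readonly $type: 'Greeting'; readonly name: string; }
interface Model { readonly $type: 'Model'; readonly greetings: readonly Greeting[]; }

// The example AST, written out as a typed object literal.
const ast: Model = {
    $type: 'Model',
    greetings: [
        { $type: 'Greeting', name: 'Alice' },
        { $type: 'Greeting', name: 'Bob' },
    ],
};

// Hypothetical helper: collect the name of every Greeting, in source order.
function greetingNames(model: Model): string[] {
    return model.greetings.map((greeting) => greeting.name);
}

const names = greetingNames(ast);
console.log(names.join(', ')); // prints: Alice, Bob
```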

Summary

  • greetings is a property of the Model node.
  • += appends elements to the greetings list.
  • Greeting refers to the rule defining a greeting statement.
  • * means “zero or more” occurrences of Greeting.

With this single line of grammar, a Model can hold any number of Greeting statements, collected in source order in one list.

Step 2: Generate the Language Server

Langium provides tools to generate a language server from your grammar. The language server handles parsing, validation, and other language features.

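  1. Install Langium and its command-line tool (the langium command is provided by the langium-cli package):

    npm install langium
    npm install --save-dev langium-cli
  2. Use the Langium CLI to generate the language server:

    npx langium generate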

This generates TypeScript code for your DSL, including the AST types (such as Greeting) that the validation code in the next step imports.

Step 3: Implement Custom Validation (Optional)

You can add custom validation logic. For example, let’s ensure that names in greetings are capitalized:

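import type { Greeting } from './generated/ast'; // Generated AST types
import type { ValidationAcceptor } from 'langium';

// Warn when the greeted name does not start with an uppercase letter.
export function validateGreeting(greeting: Greeting, accept: ValidationAcceptor): void {
    // name is a plain string, because the rule assigns it from the STRING terminal.
    const first = greeting.name.charAt(0);
    // Skip empty names, which have no first character to check.
    if (first !== '' && first !== first.toUpperCase()) {
        accept('warning', 'Name should be capitalized.', { node: greeting, property: 'name' });
    }
}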
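
To try the capitalization rule without a Langium project, the same logic can be run against stand-in types. The sketch below is an assumption-laden illustration: Greeting, Severity, Acceptor, Diagnostic, and runChecks are all hypothetical local definitions that only imitate the shape of the generated AST type and of Langium's acceptor callback.

```typescript
// Stand-in types for illustration only; in a real project they come from
// the generated AST module and from the 'langium' package.
interface Greeting { $type: 'Greeting'; name: string; }
type Severity = 'error' | 'warning' | 'info' | 'hint';
type Acceptor = (
    severity: Severity,
    message: string,
    info: { node: Greeting; property: 'name' }
) => void;
interface Diagnostic { severity: Severity; message: string; name: string; }

// Same rule as above: warn when the name does not start with an uppercase letter.
function validateGreeting(greeting: Greeting, accept: Acceptor): void {
    const first = greeting.name.charAt(0);
    if (first !== '' && first !== first.toUpperCase()) {
        accept('warning', 'Name should be capitalized.', { node: greeting, property: 'name' });
    }
}

// Hypothetical driver: run the check on each greeting and collect what it reports.
function runChecks(greetings: Greeting[]): Diagnostic[] {
    const diagnostics: Diagnostic[] = [];
    for (const greeting of greetings) {
        validateGreeting(greeting, (severity, message, info) => {
            diagnostics.push({ severity, message, name: info.node.name });
        });
    }
    return diagnostics;
}

const diagnostics = runChecks([
    { $type: 'Greeting', name: 'alice' },
    { $type: 'Greeting', name: 'Bob' },
]);
console.log(diagnostics.length, diagnostics[0].name); // prints: 1 alice
```

Only "alice" is reported, since "Bob" already starts with an uppercase letter.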

Step 4: Integrate with an Editor

Langium integrates with VS Code via the Language Server Protocol (LSP). You can package your language server as a VS Code extension to provide features like syntax highlighting, code completion, and validation.

  1. Create a VS Code extension with the Yeoman generator for VS Code (generator-code):

    npx --package yo --package generator-code -- yo code
  2. Configure the extension to use your Langium language server.

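  3. Add syntax highlighting by defining a TextMate grammar (a tmLanguage.json file) for your DSL, and describe editor behavior such as bracket matching and auto-closing pairs in a language-configuration.json file.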

Step 5: Test Your DSL

Once everything is set up, you can open a .greet file in VS Code and write:

greet "alice"
greet "Bob"

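Provided the validation check is registered with the language's validation registry, the editor will highlight syntax errors and show a warning on "alice" because the name is not capitalized, while "Bob" passes without complaint.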