How to make a templating language (Part 1)
source link: https://dorianmarie.fr/template/1.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
How to make a templating language (Part 1)
A simple template could be:
Hello {{user.first_name}}
And would be rendered like this:
template = "Hello {{user.first_name}}"
Template.new(template).render(
locals: { user: { first_name: "Dorian" } }
)
# => "Hello Dorian"
A template could be more complex like:
{define render(item)}
<p>{{link_to(item.user.name, item.user.url)}}</p>
{{item.content | markdown}}
<p>{{link_to(item.created_at.to_formatted_s(:long), item.url)}}</p>
{if item.parent}
{{render(item.parent)}}
{end}
{end}
{{render(item)}}
We will need a parser and an interpreter. I prefer to write both in ruby.
(edit: syntax changed as I wrote the language)
Parser
I will use parslet for the parser.
Let’s start with something simple:
require 'parslet'
class Template
class Parser < Parslet::Parser
rule(:empty) { str('') }
rule(:template) do
any.repeat(1).as(:text).repeat(1) |
empty.as(:text).repeat(1, 1)
end
root(:template)
end
end
And two simple tests:
require_relative "../../template/parser"
RSpec.describe Template::Parser do
let!(:template) { "" }
subject { described_class.new.parse(template) }
it { is_expected.to eq([{ text: "" }]) }
context "with only text" do
let!(:template) { "Hello" }
it { is_expected.to eq([{ text: "Hello" }]) }
end
end
any
is the same as match('.')
, e.g. it matches any character
then repeat(1)
tells the parser the character can be repeated 1 or more times.
as(:text)
captures the text under the text
key, e.g. { text: "Hello" }
I add another repeat(1)
because I will want to have an array of matches at the end, e.g. [text, interpolation, expression, text, expression, etc.]
empty.as(:text).repeat(1, 1)
is just so that it gives the same format for the output when an empty string is passed.
Interpolations with {{}}
I would like to be able to do Hello {{""}}
(which would render Hello
) and Hello \{\{World\}\}
which would render Hello {{World}}
) (e.g. escaping the interpolation).
As tests, I write:
context "with an interpolation" do
let!(:template) { "Hello {{user.first_name}}" }
it {
is_expected.to eq([
{ text: "Hello " },
{ interpolation: "user.first_name" }
])
}
end
context "with an escaped interpolation" do
let!(:template) { "Hello \\{\\{user.first_name\\}\\}" }
it {
is_expected.to eq([
{ text: "Hello {{user.first_name}}" },
])
}
end
The parser is getting a bit more complex:
require 'parslet'
class Template
class Parser < Parslet::Parser
rule(:empty) { str('') }
rule(:backslash) { str('\\') }
rule(:opening_bracket) { str('{') }
rule(:closing_bracket) { str('}') }
# text
rule(:unescaped_text_character) do
opening_bracket.absent? >> closing_bracket.absent? >> any
end
rule(:escaped_text_character) { backslash.ignore >> any }
rule(:text_character) { escaped_text_character | unescaped_text_character }
rule(:text) { text_character.repeat(1) }
# interpolation
rule(:opening_interpolation) { opening_bracket >> opening_bracket }
rule(:closing_interpolation) { closing_bracket >> closing_bracket }
rule(:interpolation) do
opening_interpolation.ignore >> expression >> closing_interpolation.ignore
end
# expression
rule(:expression_character) do
opening_interpolation.absent? >> closing_interpolation.absent? >> any
end
rule(:expression) { expression_character.repeat(0) }
rule(:template) do
(
interpolation.as(:interpolation) |
text.as(:text)
).repeat(1) |
empty.as(:text).repeat(1, 1)
end
root(:template)
end
end
For now the expression in the interpolation can be any character except {{
or }}
but we are going to change that later.
Expressions with {}
Without parsing the inside of the interpolations or the expressions:
context "with an expression" do
let!(:template) { "{if item.parent}{{markdown(item.parent.content)}}{end}" }
it {
is_expected.to eq([
{ expression: "if item.parent" },
{ interpolation: "markdown(item.parent.content)" },
{ expression: "end" },
])
}
end
rule(:interpolation) do
opening_interpolation.ignore >> statement >> closing_interpolation.ignore
end
# expression
rule(:opening_expression) { opening_bracket }
rule(:closing_expression) { closing_bracket }
rule(:expression) do
opening_expression.ignore >> statement >> closing_expression.ignore
end
# statement
rule(:statement_character) do
opening_expression.absent? >>
closing_expression.absent? >>
opening_interpolation.absent? >>
closing_interpolation.absent? >>
any
end
rule(:statement) { statement_character.repeat(0) }
rule(:template) do
(
interpolation.as(:interpolation) |
expression.as(:expression) |
text.as(:text)
).repeat(1) |
empty.as(:text).repeat(1, 1)
end
Calls
If I take a ruby program as:
user.first_name
The syntax tree is:
(program
(statements
(call
(vcall
(ident "user")
)
(period ".")
(ident "first_name")
)
)
)
Using stree ast
from syntax_tree
.
So I’m going to do something similar, e.g.:
context "with an interpolation" do
let!(:template) { "Hello {{user.first_name}}" }
it {
is_expected.to eq([
{ text: "Hello " },
{
interpolation: [
{
call: {
value: { name: "user" },
operator: ".",
call: { value: { name: "first_name" } }
}
}
]
}
])
}
end
And with the parser as:
# statement
rule(:statement) { call.as(:call) }
rule(:statements) { statement.repeat(1) }
# call
rule(:call) do
(value.as(:value) >> operator.as(:operator) >> call.as(:call)) |
value.as(:value)
end
# value
rule(:value) { name.as(:name) }
# name
rule(:name_character) do
opening_expression.absent? >>
closing_expression.absent? >>
opening_interpolation.absent? >>
closing_interpolation.absent? >>
operator.absent? >>
any
end
rule(:name) { name_character.repeat(1) }
# operator
rule(:operator_character) do
dot
end
rule(:operator) { operator_character.repeat(1) }
Values
Strings
I would like to be able to write "Dorian"
or 'Dorian'
or 'Dorian\'s'
.
And as a bonus I would like to be able to write "\n"
, "\t"
, "\\"
The parser:
# string
rule(:escaped_string_character) do
(backslash >> (n | t)) | (backslash.ignore >> any)
end
rule(:double_quote_string_character) do
escaped_string_character | (double_quote.absent? >> any)
end
rule(:single_quote_string_character) do
escaped_string_character | (single_quote.absent? >> any)
end
rule(:double_quote_string) do
double_quote.ignore >>
double_quote_string_character.repeat(0) >>
double_quote.ignore
end
rule(:single_quote_string) do
single_quote.ignore >>
single_quote_string_character.repeat(0) >>
single_quote.ignore
end
rule(:string) do
double_quote_string | single_quote_string
end
And one of the tests:
context "escaped characters" do
let!(:template) { "{'Dorian\\'s\\nOr not'}" }
it {
is_expected.to eq([
{
expression: [
{
value: { string: "Dorian's\\nOr not" },
}
]
}
])
}
end
I will deal with the escaped characters (\n
and \t
) in the interpreter.
String interpolations
Because I’m also having fun making this language :).
So, for instance {{"Hello {{user}}"}}
The parser (I’m reusing the interpolation
and expression
rules):
# string
rule(:string_character) do
opening_expression.absent? >>
opening_interpolation.absent? >>
any
end
rule(:escaped_string_character) do
(backslash >> (n | t)) | (backslash.ignore >> any)
end
rule(:double_quote_string_character) do
escaped_string_character | (double_quote.absent? >> string_character)
end
rule(:single_quote_string_character) do
escaped_string_character | (single_quote.absent? >> string_character)
end
rule(:double_quote_string) do
double_quote.ignore >>
(
(
interpolation.as(:interpolation) |
expression.as(:expression) |
double_quote_string_character.repeat(1).as(:text)
).repeat(0)
) >> double_quote.ignore
end
rule(:single_quote_string) do
single_quote.ignore >>
(
(
interpolation.as(:interpolation) |
expression.as(:expression) |
single_quote_string_character.repeat(1).as(:text)
).repeat(0)
) >> single_quote.ignore
end
rule(:string) { double_quote_string | single_quote_string }
And a test:
context 'interpolation' do
let!(:template) { '{{"Hello {{user}}"}}' }
it do
is_expected.to eq(
[
{
interpolation: [
{
value: {
string: [
{ text: 'Hello ' },
{ interpolation: [{ value: { name: 'user' } }] }
]
}
}
]
}
]
)
end
end
A string can now have many parts, text
parts, interpolation
parts and expression
parts kinda like a template itself :)
Numbers
Because I’m doing this for fun, I’m going to do base 2 (binary), base 8 (octal), base 10 (decimal), and base 16 (hexadecimal)
Base 10
Examples of base 10 numbers:
0
12
13.2
1_000
1_000.59
1e10
-10
- `1 234 567.50”
1,450.12
I’m gonna write a separate parser for numbers:
require 'parslet'
class Template
class Number
# 12,735.23 is parsed as { base_10: { whole: '12735', decimal: '23' } }
class Parser < Parslet::Parser
rule(:minus) { str('-') }
rule(:plus) { str('+') }
rule(:dot) { str('.') }
rule(:underscore) { str('_') }
rule(:comma) { str(',') }
rule(:e) { str('e') }
rule(:zero) { str('0') }
rule(:one) { str('1') }
rule(:two) { str('2') }
rule(:three) { str('3') }
rule(:four) { str('4') }
rule(:five) { str('5') }
rule(:six) { str('6') }
rule(:seven) { str('7') }
rule(:eight) { str('8') }
rule(:nine) { str('9') }
# space
rule(:space) { match('\s') }
# infinity
rule(:infinity) { str('Infinity') }
# base 10
rule(:base_10_digit) do
zero |
one |
two |
three |
four |
five |
six |
seven |
eight |
nine
end
rule(:base_10_number) do
(
base_10_digit |
space.ignore |
underscore.ignore |
comma.ignore
).repeat(1).as(:whole) >> (
dot.ignore >> (
base_10_digit |
space.ignore |
underscore.ignore
).repeat(1)
).as(:decimal).maybe >> (
e.ignore >> base_10_number
).as(:exponent).maybe
end
rule(:number) do
(minus | plus).as(:sign).maybe >> (
infinity.as(:infinity) |
base_10_number.as(:base_10)
)
end
root(:number)
end
end
end
Then I just need to import the number parser in the template parser:
rule(:number) { Template::Number::Parser.new }
Base 16, 8 and 2
Pretty simple to parse, it’s just 0xFF
, 0o17
or 0b10
for instance
# base 16
rule(:base_16_digit) do
zero | one | two | three | four | five | six | seven | eight | nine |
a | b | c | d | e | f
end
rule(:base_16_number) do
zero.ignore >> x.ignore >> base_16_digit.repeat(1)
end
# base 8
rule(:base_8_digit) do
zero | one | two | three | four | five | six | seven
end
rule(:base_8_number) do
zero.ignore >> o.ignore >> base_8_digit.repeat(1)
end
# base 2
rule(:base_2_digit) do
zero | one
end
rule(:base_2_number) do
zero.ignore >> b.ignore >> base_2_digit.repeat(1)
end
Boolean
Easy, true
or false
# boolean
rule(:boolean) { str('true') | str('false') }
# value
rule(:value) do
boolean.as(:boolean) | number.as(:number) | string.as(:string) |
name.as(:name)
end
nothing
I think it’s more explicit than nil
, null
, undefined
, void
, etc.
# nothing
rule(:nothing) { str('nothing') }
# value
rule(:value) do
nothing.as(:nothing) | boolean.as(:boolean) | number.as(:number) |
string.as(:string) | name.as(:name)
end
And now onto the meat:
Lists
Can be either an explicit list e.g. [1, 2, 3]
or an implicit list, e.g. 1, 2, 3
Only the root list can be implicit, e.g. 1, [2, 3], 4
can’t be written as 1, 2, 3, 4
# list
rule(:implicit_list) do
value.as(:first) >> spaces? >> comma >>
(
spaces? >> value.as(:second) >>
(comma >> spaces? >> value).repeat(0).as(:others)
).maybe
end
rule(:list) do
left_square_bracket >> spaces? >> value.as(:first) >>
(comma >> spaces? >> value).repeat(0).as(:others) >> spaces? >>
right_square_bracket
end
# value
rule(:value) do
list.as(:list) | nothing.as(:nothing) | boolean.as(:boolean) |
number.as(:number) | string.as(:string) | name.as(:name)
end
# statement
rule(:statement) do
(value >> operator.as(:operator) >> statement.as(:statement)) |
(implicit_list.as(:list) | value)
end
Dictionnaries
Can be either an explicit dictionnary e.g. {id: 1, name: "Dorian"}
, or implicit, e.g.
id: 1, name: "Dorian"
Implicit dictionnaries can be included in lists, e.g. 3, 2, 1, do: "Party"
# dictionnary
rule(:short_key) { name.as(:string) >> colon }
rule(:long_key) { value >> spaces? >> (colon | (equal >> greater_than)) }
rule(:key_value) do
(short_key | long_key).as(:key) >> spaces? >> value.as(:value)
end
rule(:implicit_dictionnary) do
key_value.as(:first) >>
(spaces? >> comma >> spaces? >> key_value).repeat(0).as(:others)
end
rule(:dictionnary) do
left_curly_bracket >> spaces? >> key_value.as(:first) >>
(comma >> spaces? >> key_value).repeat(0).as(:others) >> spaces? >>
right_curly_bracket
end
That’s it for Part 1
Next we will finish the parser with blocks: functions, conditions, loops, etc. (Part 2)
Part 3 will be a basic interpreter with all kinds of values.
Part 4 will be finishing the interpreter with blocks.
Part 5 will be nice finishing touches like making a website to try the language etc.
If you enjoyed (or not), go to my Twitter @dorianmariefr and/or send me an email at [email protected].
You can find the code on github.com/dorianmariefr/template.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK