Mythical Macros
source link: https://soegaard.github.io/mythical-macros/
This is a work in progress.
Thanks to Michael Rauh, Laurent Orseau, G. Knauth and Any for suggestions, encouragement and corrections.
We will in due course look at all of these.
Let’s say I discover this pattern in my code:
That is, for some reason, I often have the need to declare and set two variables to the same value. Following the DRY principle (“Don’t repeat yourself”), I would like to write this instead:
(define2 x y 42)
(define2 a b 43)
To do this I need to bind define2 to a syntax transformer (macro), a normal function from syntax objects to syntax objects.
The input to the syntax transformer is a piece of syntax representing a use of the macro, such as (define2 x y 42).
The output is a newly constructed piece of syntax representing the rewritten code.
(define-syntax define2          ; bind define2 to
  (lambda (stx)                 ; this syntax transformer
    <analyze stx and return a new syntax object>))
Given, say, (define2 x y 42) as input, the syntax transformer needs to analyze the input to find out which variables were used. This is often done via pattern matching with the help of syntax-parse. In its simplest form syntax-parse looks like this:
(syntax-parse stx
  [pattern1 expr1]
  [pattern2 expr2]
  [pattern3 expr3])
It will match the syntax object against the patterns pattern1, pattern2, pattern3. When it finds a pattern that matches, the corresponding expression is evaluated. In our example we only need one pattern:
(_define2 var1 var2 expr)
When this pattern is matched against (define2 x y 42), the pattern variables var1, var2 and expr will be bound to pieces of syntax representing x, y and 42 respectively. This is what we have so far:
(define-syntax define2
  (lambda (stx)
    (syntax-parse stx
      [(_define2 var1 var2 expr)          ; pattern
       <construct the output syntax>])))  ; expression
In order to construct the output we will use the form syntax.
(syntax template)
The form syntax constructs a new syntax object from a given template. It takes the template and replaces identifiers that are bound as pattern variables with their corresponding values. In the macro's template, begin will splice the forms into the enclosing context.
The resulting macro becomes:
(define-syntax define2
  (lambda (stx)
    (syntax-parse stx
      [(_define2 var1 var2 expr)
       (syntax
        (begin
          (define var1 expr)
          (define var2 var1)))])))
A complete, runnable program with a few tests:
#lang racket
(require (for-syntax syntax/parse))
(define-syntax define2              ; bind define2 to a
  (lambda (stx)                     ; syntax transformer that
    (syntax-parse stx               ; matches stx against
      [(_define2 var1 var2 expr)    ; this pattern, and produces
       (syntax                      ; a new syntax object
        (begin                      ; following this template
          (define var1 expr)
          (define var2 var1)))])))
(define2 x y 42)
(define2 a b 43)
(list x y a b) ; (42 42 43 43)
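One detail of the template is easy to miss: the second variable is defined in terms of the first, so the expression is evaluated only once per use of define2. The sketch below makes that visible with a side-effecting expression. The names calls, next!, p, q and observed are hypothetical and only serve the illustration.

```racket
#lang racket
(require (for-syntax racket/base syntax/parse))

;; Same macro as in the article.
(define-syntax define2
  (lambda (stx)
    (syntax-parse stx
      [(_define2 var1 var2 expr)
       (syntax (begin (define var1 expr)
                      (define var2 var1)))])))

;; Hypothetical helper: counts how many times it has been called.
(define calls 0)
(define (next!)
  (set! calls (add1 calls))
  calls)

;; Expands to (define p (next!)) followed by (define q p),
;; so next! runs exactly once.
(define2 p q (next!))

(define observed (list p q calls))
```

Had the template been written with (define var2 expr) instead, the expression would be duplicated in the output and evaluated twice.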
define3
.instead of the one used above?
- A read pass turns a character stream into a syntax object.
- An expand pass transforms the syntax object into a fully parsed one.
- Finally, compile and/or eval to get the job done.
Usually the last step, evaluation, involves compiling into machine code.
The fact that macros are called only during expansion allows us to conclude:
- Use a reader if you need to change the lexical structure. Macros can’t be used to change the lexical structure (we can’t change the syntax of numbers with macros).
- Macros are a compile time concept.
More on runtime vs compile time later. Macro expansion happens once - not each time the program is run.
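A small sketch of what "compile time concept" means in practice, assuming the module is compiled in a single pass as with racket file.rkt. The names expansions, expansion-number, f and results are hypothetical. The transformer bumps a compile-time counter each time it is invoked, and the macro use expands into that number as a constant.

```racket
#lang racket
(require (for-syntax racket/base))

;; A counter that lives at compile time (the transformer phase).
(begin-for-syntax
  (define expansions 0))

;; Each *expansion* of (expansion-number) bumps the counter and
;; rewrites the use into the current count as a literal.
(define-syntax (expansion-number stx)
  (set! expansions (add1 expansions))
  (with-syntax ([n expansions])
    #'(quote n)))

;; The macro use appears once in the source, so the transformer runs once,
;; no matter how often f is called at run time.
(define (f) (expansion-number))

(define results (list (f) (f) (f)))
```

All three calls return the same number, since by the time f runs the macro use has already been replaced by a constant.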
The following example shows the results of the various passes during the evaluation of a small program fragment. This section is about the big picture, so there are details of the example program that we haven’t yet discussed (such as the use of the namespace). However, the program gives you something to try out in the repl.
#lang racket
(require racket/format)
(define source "(begin (define (fact n) (cond [(= n 0) 1] [else (* n (fact (- n 1)))])) (fact 5))")
(define ns (make-base-namespace))
; expand and eval
(parameterize ([current-namespace ns]       ; need a namespace
               [read-accept-compiled #t])
  (define input    (open-input-string source))
  (define before   (read input))
  (define after    (expand before))
  (define compiled (compile after))
  (define result   (eval compiled))
  (define out (list "-- before expansion --"    before
                    "-- after expansion --"     after
                    "-- compiled expression --" (substring (~a compiled) 0 20)
                    "-- result --"              result))
  (for-each displayln out))
The output shown below has been slightly altered in order to make it easier to read.
-- before expansion --
(begin (define (fact n) (cond ((= n 0) 1) (else (* n (fact (- n 1)))))) (fact 5))
-- after expansion --
#<syntax (begin (define-values (fact) (lambda (n) (if (#%app = n (quote 0)) (let-values () (quote 1)) (let-values () (#%app * n (#%app (#%top . fact) (#%app - n (quote 1)))))))) (#%app (#%top . fact) (quote 5)))>
-- compiled expression --
#~ 7.5racket...bytes omitted...
-- result --
120
"fact.rkt"
as a dummy source name.(namespace-syntax-introduce before ns)
to “enrichen” the syntax object before expansion.source-location information
Let’s produce a syntax object to experiment with:
> (define (to-syntax s [source-name #f])
    (read-syntax source-name (open-input-string s)))
> (to-syntax "(foo bar baz)")
#<syntax:string::1 (foo bar baz)>
The output #<syntax:string::1 (foo bar baz)> doesn’t reveal much about the extra information stored in the syntax object. It looks almost like a plain S-expression.
'string
.
We also see that (foo bar baz) is read from position 1 (remember that file positions count from 1). The span is 13, which means that the number of characters in (foo bar baz) was 13. Since the position of the open parenthesis ‘(’ is 1, the close parenthesis ‘)’ is at position 13, and position 14 is the first position after the form.
Figure 1: Focus on the syntax info.
Figure 2: Focus on the line and column numbers.
Note that the line and column numbers are missing in figure 1. The port returned by open-input-string doesn’t track line and column numbers by default, but it’s easy to enable them with port-count-lines!.
> (define (to-syntax s [source-name #f])
    (define input (open-input-string s))
    (port-count-lines! input)
    (read-syntax source-name input))
> (to-syntax "(foo bar baz)")
#<syntax:string:1:0 (foo bar baz)>
Figure 3: Focus on bar.
Finally, let’s click on bar to check that source-location information is tracked not only for the whole list, but for all elements. Notice that now only bar is green.
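The same source-location information can be read off programmatically instead of through the syntax browser. The sketch below uses the standard accessors syntax-line, syntax-column, syntax-position and syntax-span. The helper location and the variable names are hypothetical.

```racket
#lang racket

;; Read a string into a syntax object, with line/column counting enabled.
(define (to-syntax s [source-name #f])
  (define input (open-input-string s))
  (port-count-lines! input)
  (read-syntax source-name input))

;; Hypothetical helper: collect (line column position span) for a syntax object.
(define (location stx)
  (list (syntax-line stx)
        (syntax-column stx)
        (syntax-position stx)
        (syntax-span stx)))

(define whole (to-syntax "(foo bar baz)"))
(define bar   (second (syntax->list whole)))   ; the element bar

(define whole-location (location whole))  ; line 1, column 0, position 1, span 13
(define bar-location   (location bar))    ; line 1, column 5, position 6, span 3
```

Lines and positions count from 1 while columns count from 0, which is why bar sits at column 5 but position 6.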
In previous sections, we have seen how to turn a file into a syntax object with the help of read-syntax. In this section, we will look at template based construction instead.
The form syntax constructs a syntax object based on a template.
The grammar below describes how templates are written. I have simplified the grammar of syntax slightly. The full grammar also allows boxes and prefab structures, ellipsis quoting and the operators ~@ and ~?. Later, we will return to the operators.
(syntax template)
; a string constant
> (syntax "Hello World")
#<syntax:eval:1:0 "Hello World">
; a number constant
> (syntax 42)
#<syntax:eval:2:0 42>
; a character constant
> (syntax #\a)
#<syntax:eval:3:0 #\a>
; an identifier
> (syntax fish)
#<syntax:eval:4:0 fish>
(head-template ...)
This will construct a syntax object representing a list. The elements of the list have the same shape as head-template.
The ellipsis, ..., after head-template means that a list template consists of 0 or more subtemplates (all head templates).
(head-template ...+ . template)
This constructs a syntax object representing an improper list (unless the last template produces a list).
The ...+ means that the template needs at least 1 subtemplate before the dot.
#(head-template ...)
This constructs a syntax object representing a vector.
; a list
> (syntax (foo "bar" 42))
#<syntax:eval:5:0 (foo "bar" 42)>
; an improper list
> (syntax (foo "bar" . 42))
#<syntax:eval:6:0 (foo "bar" . 42)>
; a vector
> (syntax #(foo "bar" 42))
#<syntax:eval:7:0 #(foo "bar" 42)>
> '(foo "bar" 42)
'(foo "bar" 42)
> '(foo "bar" . 42)
'(foo "bar" . 42)
> '#(foo "bar" 42)
'#(foo "bar" 42)
parse-quote
.
> #'(foo "bar" 42)
#<syntax:eval:11:0 (foo "bar" 42)>
> #'(foo "bar" . 42)
#<syntax:eval:12:0 (foo "bar" . 42)>
> #'#(foo "bar" 42)
#<syntax:eval:13:0 #(foo "bar" 42)>
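To convince yourself that (syntax ...) and #' build the same kind of object, it can help to strip the syntax information again with syntax->datum and compare. The sketch below also uses syntax->list, which unpacks one level of a list-shaped syntax object. All variable names are hypothetical.

```racket
#lang racket

(define long-form  (syntax (foo "bar" 42)))
(define short-form #'(foo "bar" 42))

;; Both forms carry the same underlying datum.
(define same-datum?
  (equal? (syntax->datum long-form)
          (syntax->datum short-form)))

;; syntax->list gives a list of syntax objects, one per element.
(define elements (map syntax->datum (syntax->list short-form)))

;; For an improper list there is no list to return, so the result is #f.
(define improper (syntax->list #'(foo "bar" . 42)))
```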
A quick example in which an instance of the Pythagorean theorem is constructed.
#<syntax:eval:14:0 (= (sqr 5) (+ (sqr 3) (sqr 4)))>
syntax-parse
are used to make an identifier into a pattern variable. If a pattern variable (identifier) appears in a template, syntax won’t insert the identifier, but rather the syntax object associated with the pattern variable. That is, we can also write:
> (with-syntax ([a 3] [b 4] [c 5]) (syntax (= (sqr c) (+ (sqr a) (sqr b)))))
#<syntax:eval:15:0 (= (sqr 5) (+ (sqr 3) (sqr 4)))>
(with-syntax ([pattern stx-expr] ...) body ...+)
The form with-syntax wraps a body in a series of clauses consisting of patterns and expressions.
The expressions stx-expr ... are evaluated in order. If a stx-expr doesn’t produce a syntax object, it is converted to one using datum->syntax.
Each syntax object is then matched with the corresponding pattern. An identifier in the pattern will be bound as a pattern variable to the part of the syntax object it matched.
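A short sketch of both behaviours: the conversion of plain values into syntax objects, and the binding of pattern variables by matching. The variable names stx, all-syntax? and datum are hypothetical.

```racket
#lang racket

;; Neither right-hand side is a syntax object: one is a number, one a list.
;; with-syntax converts both before matching them against the patterns.
(define stx
  (with-syntax ([n (+ 1 2)]
                [(item ...) (list "a" 'b)])
    (syntax (n item ...))))

;; Every element of the result is a syntax object ...
(define all-syntax? (andmap syntax? (syntax->list stx)))

;; ... and stripping the syntax information gives back the plain data.
(define datum (syntax->datum stx))
```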
are:
- If an identifier is not bound as a pattern variable, we get the actual identifier.
- If an identifier is bound as a pattern variable, then the identifier as a template inserts the value to which it is bound.
- If an identifier id is bound as a pattern variable in a pattern of the type (id ...), then id ... as a template will insert (splice) the elements of the list.
We have seen examples of the first two rules already. Let’s examine the third rule.
> (with-syntax ([x '(1 2 3 4 5)]) (syntax (list "first" x "last")))
#<syntax:eval:16:0 (list "first" (1 2 3 4 5) "last")>
> (with-syntax ([(y ...) '(a b c d e)]) (syntax (list "first" y ... "last")))
#<syntax:eval:17:0 (list "first" a b c d e "last")>
> (with-syntax ([x (list 1 2 3 4 5)]) (syntax (list "first" x ... "last")))
eval:18:0: syntax: too many ellipses in template
at: ...
in: (syntax (list "first" x ... "last"))
> (with-syntax ([(x ...) (list 1 2 3 4 5)]) (syntax (list "first" x ... "last")))
#<syntax:eval:19:0 (list "first" 1 2 3 4 5 "last")>
The reason is that the position after the dot in the template for an improper list is special. Only one literal/identifier can occur after the dot, which means that we are forced to have two productions (rules) in the grammar: one that allows ellipsis and one that doesn’t.
A head template in turn is either a template or a head template followed by one more ellipses (i.e. one more triple dot).
Since all templates are head templates, templates of the form
that will print the variable and value pairs before evaluating the body.
(let/verbose ([a 11] [b 12]) (list a b))
should be transformed into
(let ([a 11] [b 12]) (displayln (list "let: " (list 'a a) (list 'b b))) (list a b))
When run, we will see:
(let: (a 11) (b 12))
'(11 12)
The first job is to write a pattern that matches the input. As an input example, think of:
(let/verbose ([a 11] [b 12]) (list a b))
The example has two binding clauses, but the actual number varies. We need to use an ellipsis in the pattern:
(define (let/verbose-transformer stx)
  (with-syntax ([(_let/verbose ([x expr] ...) body) stx])
    <produce-the-output>))
In order to test that our pattern works as expected, we can temporarily insert displays to show what was matched:
> (define (let/verbose-transformer stx)
    (with-syntax ([(_let/verbose ([x expr] ...) body) stx])
      (displayln #'(x ...))
      (displayln #'(expr ...))
      (displayln #'body)
      #'we-still-need-to-produce-some-output))
> (let/verbose-transformer #'(let/verbose ([a 11] [b 12]) (list a b)))
#<syntax:eval:1:0 (a b)>
#<syntax:eval:1:0 (11 12)>
#<syntax:eval:2:0 (list a b)>
#<syntax:eval:1:0 we-still-need-to-produce-some-output>
Given these pieces we want to produce the following output:
(let ([a 11] [b 12]) (displayln (list "let: " (list 'a a) (list 'b b))) (list a b))
To do that, we need to write a template:
> (define (let/verbose-transformer stx)
    (with-syntax ([(_let/verbose ([x expr] ...) body) stx])
      (syntax
       (let ([x expr] ...)
         (displayln (list "let: " (list 'x x) ...))
         body))))
> (let/verbose-transformer (syntax (let/verbose ([a 11] [b 12]) (list a b))))
#<syntax:eval:3:0 (let ((a 11) (b 12)) (displayln (list "let: " (list (quote a) a) (list (quote b) b))) (list a b))>
> (let ((a 11) (b 12)) (displayln (list "let: " (list 'a a) (list 'b b))) (list a b))
(let: (a 11) (b 12))
'(11 12)
Instead of
(let: (a 11) (b 12))
we want
(let: a 11 b 12)
.
Writing 'x x ... in the template gives the error "syntax: missing ellipsis with pattern variable in template at: x". The problem is that the ellipsis in our template only applies to the x before it, not to the x in 'x. Simply adding an extra ellipsis, as in 'x ... x ..., does result in a program that compiles, but now the order is incorrect. The names are printed first, followed by the values.
> (define (let/verbose-transformer stx)
    (with-syntax ([(_let/verbose ([x expr] ...) body) stx])
      (syntax
       (let ([x expr] ...)
         (displayln (list "let: " (~@ 'x x) ...))
         body))))
> (let/verbose-transformer (syntax (let/verbose ([a 11] [b 12]) (list a b))))
#<syntax:eval:6:0 (let ((a 11) (b 12)) (displayln (list "let: " (quote a) a (quote b) b)) (list a b))>
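So far the transformer has only been called by hand on a syntax object. As a sketch of how the same template behaves once it is installed as an actual macro, the program below binds let/verbose with define-syntax and captures what it prints. The pattern uses _ where the article writes _let/verbose, and result and printed are hypothetical names.

```racket
#lang racket
(require (for-syntax racket/base))

;; The ~@ version of the template, bound as a macro.
(define-syntax (let/verbose stx)
  (with-syntax ([(_ ([x expr] ...) body) stx])
    #'(let ([x expr] ...)
        ;; (~@ 'x x) ... splices each name/value pair into the list
        (displayln (list "let: " (~@ 'x x) ...))
        body)))

;; Capture the printed line and the value of the body separately.
(define result #f)
(define printed
  (with-output-to-string
    (lambda ()
      (set! result (let/verbose ([a 11] [b 12]) (list a b))))))
```

Here printed holds the line written by displayln, with the names and values interleaved, and result holds the value of the body.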
#<syntax:eval:8:0 (list (list a 11) (list b 22) (list c 33))>
The outer
is used to extract the strings. The inner then converts the strings into numbers, which appears in the output.Note, that we can’t write:
eval:9:0: syntax: no pattern variables before ellipsis in
template
at: str
in: (syntax (str ...))
> (require racket/syntax)
> (with-syntax* ([((sym str) ...) (syntax ((a "11") (b "22") (c "33")))]
                 [(num ...) (map string->number (syntax->datum (syntax (str ...))))])
    (syntax (list (list sym num) ...)))
#<syntax:eval:11:0 (list (list a 11) (list b 22) (list c 33))>
The form of with-syntax* is as follows.
(with-syntax* ([pattern stx-expr] ...) body ...+)
racket/syntax exports with-syntax*. Similar to with-syntax, but the pattern variables of each pattern are not only bound in the bodys, but also in the stx-exprs of subsequent clauses. Also, the patterns no longer need to bind distinct pattern variables; later bindings shadow earlier bindings.
Fill in ???-pattern and ???-template in the following snippets in order to make all snippets evaluate to true.
(define result0
  (with-syntax ([(num ...) (syntax (1 2 3))])
    (syntax ???-template)))
(equal? (syntax->datum result0) (list 1 2 3))
(define result2
  (with-syntax ([(str ...) (syntax ("a" "b" "c"))]
                [(num ...) (syntax ( 1   2   3))])
    (syntax ???-template)))
(equal? (syntax->datum result2) '(("a" 1) ("b" 2) ("c" 3)))
(define result3
  (with-syntax ([(str ...) (syntax ("a" "b" "c"))]
                [(num ...) (syntax ( 1   2   3))])
    (syntax ???-template)))
(equal? (syntax->datum result3) '("a" 1 "b" 2 "c" 3))
(define result4
  (with-syntax ([(str ...) (syntax ("a" "b" "c"))]
                [(num ...) (syntax ( 1   2   3))])
    (syntax ???-template)))
(equal? (syntax->datum result4) '("a" "b" "c" 1 2 3))
(define result5
  (with-syntax ([(num ...) (syntax (1 2 3))])
    (with-syntax ([(str ...) ???-expr])
      (syntax ???-template))))
(equal? (syntax->datum result5) '("1" "2" "3"))
(define result6
  (with-syntax ([???-pattern (syntax ((a (1 "alpha") (2 "beta"))
                                      (b (3 "gamma") (4 "delta"))))])
    (syntax ???-template)))
(equal? (syntax->datum result6)
        '((a b) (1 2 3 4) ("alpha" "beta" "gamma" "delta")))
– Bob Hope
But first, a small reminder. The idea is to write one thing, A, in our source code, but the compiler must pretend we wrote another, B.
The reader turns A into a syntax object (buried in a very large syntax object representing the entire program). If the expander can tell that A is a macro call, it will call the associated syntax transformer. The return value B will then be used in place of A.
backwards
that allows us to write A:
but will compile as if we had written B:
Let’s write our transformer first.
The general form of the input A is:
(backwards form ...)
And we want to produce
(begin rev-form ...)
where the rev-form forms appear in the reverse order of the original forms.
> (define (backwards-transformer stx)
    ; First use a pattern to get parts of the input
    (with-syntax ([(backwards form ...) stx])
      ; Compute auxiliary data needed in output
      (define rev-forms (reverse (syntax->list (syntax (form ...)))))
      ; Use a template to produce the output
      (with-syntax ([(rev-form ...) rev-forms])
        (syntax (begin rev-form ...)))))
We can test the transformer in the repl:
> (backwards-transformer (syntax (backwards (display 1) (display 2) (display 3))))
#<syntax:eval:1:0 (begin (display 3) (display 2) (display 1))>
We get the expected result, so we are ready to tell the expander that backwards must invoke the transformer backwards-transformer.
Our first attempt is:
> (define-syntax backwards backwards-transformer)
eval:3:0: backwards-transformer: unbound identifier;
also, no #%top syntax transformer is bound in the
transformer phase
in: backwards-transformer
backwards-transformer is defined as a normal definition, which means it isn’t available until the final program runs.
One option is to put the definition in a separate file and import it with (require (for-syntax "helper.rkt")). However, we will instead mark our definition of backwards-transformer, so the system knows it is supposed to run at compile time. The form begin-for-syntax is what we need.
> (begin-for-syntax
    (define (backwards-transformer stx)
      (with-syntax ([(backwards form ...) stx])
        (define rev-forms (reverse (syntax->list (syntax (form ...)))))
        (with-syntax ([(rev-form ...) rev-forms])
          (syntax (begin rev-form ...))))))
> (define-syntax backwards backwards-transformer)
Above, we bound backwards and defined the transformer backwards-transformer separately to make the concepts clear. However, in practice the common case is to define both at the same time.
> (require (for-syntax racket))
> (define-syntax (backwards stx)
    (with-syntax ([(backwards form ...) stx])
      (define rev-forms (reverse (syntax->list (syntax (form ...)))))
      (with-syntax ([(rev-form ...) rev-forms])
        (syntax (begin rev-form ...)))))
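A minimal usage sketch of the finished macro, written as a module rather than as repl input. The transformer here computes the reversed list directly in the with-syntax clause, which is a small variation on the article's version. The names printed and value are hypothetical.

```racket
#lang racket
(require (for-syntax racket/base))

;; backwards: evaluate the forms in reverse order.
(define-syntax (backwards stx)
  (with-syntax ([(_ form ...) stx])
    (with-syntax ([(rev-form ...) (reverse (syntax->list #'(form ...)))])
      #'(begin rev-form ...))))

;; The displays run last-to-first.
(define printed
  (with-output-to-string
    (lambda ()
      (backwards (display 1) (display 2) (display 3)))))

;; The value of the whole form is the value of the *first* form written,
;; since that one ends up last in the generated begin.
(define value (backwards 'written-first 'written-second 'written-third))
```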
(backwards) expands to (begin), which produces a void value. No problem there.
In our code we have:
(with-syntax ([(backwards form ...) stx]) <more>)
If the input syntax in stx doesn’t match the pattern (backwards form ...), the user will get an error. Two examples illustrate the problem. The second shows backwards being used as an identifier macro.
> (backwards 1 2 . 3)
eval:2:0: with-syntax: binding match failed
in: (backwards form ...)
> (list 1 2 backwards 3 4)
eval:2:0: with-syntax: binding match failed
in: (backwards form ...)
From a user perspective “with-syntax: binding match failed” is a somewhat confusing error: nowhere in the user code is with-syntax used. This can be especially confusing if the person who uses the macro and the author of the macro aren’t the same. Error messages should always be worded in terms of the macro itself, not in terms of some arbitrary part implementing the macro.
The fix is to use syntax-parse instead of with-syntax to pick the input apart. Note that raise-syntax-error will highlight the offending user code correctly in DrRacket/racket-mode.
Note that we have added syntax/parse to the require form to make syntax/parse available in the transformer phase.
> (require (for-syntax racket syntax/parse))
> (define-syntax (backwards stx)
    (syntax-parse stx
      [(backwards form ...)
       (define rev-forms (reverse (syntax->list (syntax (form ...)))))
       (with-syntax ([(rev-form ...) rev-forms])
         (syntax (begin rev-form ...)))]
      [_ (raise-syntax-error 'backwards "expected (backwards form ...), got: " stx)]))
> (backwards 1 2 . 3)
eval:4:0: backwards: expected (backwards form ...), got:
in: (backwards 1 2 . 3)
> (list 1 2 backwards 3 4)
eval:5:0: backwards: expected (backwards form ...), got:
in: backwards
Our macro now works both for well-formed and malformed input.
The steps we followed in the preceding section can be thought of as a recipe for writing simple macros.
1. Decide the input form of a macro
2. Experiment with concrete examples to find a general transformation
3. Turn the examples into tests.
4. Write the macro by filling in:
(define-syntax (name stx)
  ; Use syntax-parse to destructure the input syntax.
  (syntax-parse stx
    [(_name form ...)
     ; Compute auxiliary information.
     ; Use with-syntax to bind the information
     ; to pattern variables.
     (with-syntax ([...])
       ; Construct output using a template.
       (syntax (begin rev-form ...)))]
    ; Repeat if several input patterns are needed.
    ; Finally use _ to catch all erroneous uses.
    [_ (raise-syntax-error 'name "expected (name form ...), got: " stx)]))
5. Run tests.
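Steps 3 and 5 can be sketched for the backwards macro as follows. Testing the malformed case is the tricky part, because a syntax error normally stops compilation of the whole test file. One way around that, used here as an assumption rather than as the article's method, is convert-syntax-error from syntax/macro-testing, which delays the error until run time. The names printed and error-message are hypothetical.

```racket
#lang racket
(require (for-syntax racket/base syntax/parse)
         syntax/macro-testing)

(define-syntax (backwards stx)
  (syntax-parse stx
    [(_ form ...)
     (with-syntax ([(rev-form ...) (reverse (syntax->list #'(form ...)))])
       #'(begin rev-form ...))]
    [_ (raise-syntax-error 'backwards "expected (backwards form ...)" stx)]))

;; Test 1: well-formed input runs the forms in reverse order.
(define printed
  (with-output-to-string
    (lambda () (backwards (display 1) (display 2) (display 3)))))

;; Test 2: malformed input reports an error worded in terms of backwards.
;; convert-syntax-error turns the compile-time error into a run-time
;; exception, so it can be caught like any other.
(define error-message
  (with-handlers ([exn:fail? exn-message])
    (convert-syntax-error (backwards 1 2 . 3))))
```

In a real test suite the same thunk could be handed to a test framework such as rackunit instead of being caught by hand.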
mrlib/syntax-browser contains a function render-syntax/window that can be used to explore a syntax object.
#lang racket
(require mrlib/syntax-browser)
(render-syntax/window #'(foo bar baz))
The main resource when learning the art of macros is the chapter Macros in the Racket Reference. The second and third section of the chapter Macros are also relevant.
If you decide to dig into the papers on macros, take a look at the bibliography formerly on readscheme.org: Macros.
Some highlights:
R. Kent Dybvig. Syntactic abstraction: the syntax-case expander (pdf)
R. Kent Dybvig. Writing Hygienic Macros in Scheme with Syntax-Case
Matthew Flatt. Composable and Compilable Macros: You Want it When?
Matthew Flatt. Binding as Sets of Scopes
Finally, if you are looking for some concrete mini projects to practise on, check out Racket Macro Exercises by Alexis King.