5

De Bruijn index: why and how

 2 years ago
source link: https://dannypsnl.github.io/blog/2020/05/16/cs/de-bruijn-index/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

De Bruijn index: why and how

At the beginning of learning PLT, everyone could be confused by program that didn't have a variable! Will, I mean they didn't use String, Text or something like that to define a variable. A direct mapping definition of lambda calculus(a.k.a. LC, if you haven't been familiar with LC, you can read this article first) usually look like this:

#lang typed/racket

(define-type term (U t:var t:λ (app term)))
(define-type t:var Symbol)
(struct t:λ
  ([x : t:var]
   [m : term])
  #:property prop:custom-write
  (λ (v port mode)
    (fprintf port "λ~a.~a"
             (t:λ-x v)
             (t:λ-m v))))
(struct (t) app
  ([t1 : t]
   [t2 : t])
  #:property prop:custom-write
  (λ (v port mode)
    (fprintf port "(~a ~a)"
             (app-t1 v)
             (app-t2 v))))

However, we may find some definitions look like this:

#lang typed/racket

(define-type bterm (U b:var b:λ (app bterm)))
(define-type b:var Integer)
(struct b:λ
  ([m : bterm])
  #:property prop:custom-write
  (λ (v port mode)
    (fprintf port "λ~a" (b:λ-m v))))

There has two significant differences:

  1. variable is an Integer.
  2. lambda does not contain x(which means parameter in high-level languages' concept).

This is De Bruijn index(a.k.a. DBI). We can write it out, for example, id function λx.x can be rewritten with λ0; Y combinator λf.(λx.f (x x)) (λx.f (x x)) can be rewritten with λ(λ0 (1 1))(λ0 (1 1)). To understand why using DBI, we need to know what was the problem in LC.

α\alphaα-conversion

Usually, two id functions considered the same function. However, if we encode LC in the first way, we will get into trouble. We may say λx.x is not the same function as λy.y, when they are the same. To solve this problem, we develop a conversion called α\alphaα-conversion(a.k.a. α\alphaα-renaming), which renaming λx.x and λy.y to λa.a let's say.

Looks good, any problem else? Emm...yes. As we know, the real world never make thing easier, but that also means a challenge is coming, hope we all love this challenge! When λy.λx.x be renamed to λa.λa.a is fine, because of all of us work with variable-shadowing for a long-long time. However, there has a possible dangerous conversion is the renamed variable existed! For example, λx.λa.x should not simply be rewritten with λa.λa.a, because later when we rename a, we would get λa.λb.b, oops. λx.λa.x definitely is not λa.λb.b. To correct α\alphaα-conversion is really hard, that's a good point to introduce the De Bruijn index.

De Bruijn Index

We already seem some examples, but why it resolves the problem we mentioned in the previous section? We need to know those rules used by the conversion process:

  1. remember the level of λ, every time we found a λ when converting recursively, it should increase(or decrease, it depends on index counting order) this level value.
  2. when found a λ, replace it's variable with the De Bruijn index form, the value of the index is the current level.

Let's manually do this conversion:

λx.λy.λz.z
-> λλy.λz.z // x = cur_level = 0, cur_level+1
-> λλλz.z   // y = cur_level = 1, cur_level+1
-> λλλ2 // z = cur_level = 2

Notice that since new form of abstraction(a.k.a lambda) only needs M part(a.k.a. body). Another important thing is some De Bruijn index use reverse order than we show at here, so would be λλλ0, not λλλ2.

Implementation

Now, it's time for the program:

(: convert (->* (term)
                ((Immutable-HashTable Symbol Integer))
                bterm))
(define (convert t [name->index (make-immutable-hash '())])
  (match t
    ;; bind parameter name to an index and keep conversion
    [(t:λ p b) (b:λ (convert b (hash-set name->index p (hash-count name->index))))]
    [(app t1 t2) (app (convert t1 name->index) (convert t2 name->index))]
    ;; get index from environment
    [name (hash-ref name->index name)]))

->* is a type constructor in Racket for optional parameters, should be read like (->* normal-parameter-types optional-parameter-types return-type). I use optional parameters to help users don't need to remember they must provide an empty hash table. A tricky thing is I didn't record level, at least, not directly. Here I use an immutable hash table to remember level, since how many variables should be renamed exactly is level value. Then variable only need to replace its name with the index.

Congratulation, now you know everything about DBI!? No, not yet, there still one thing you need to know.

β\betaβ-reduction

β\betaβ-reduction? You might think how can such basic things make things go wrong. However, a naive implementation of β\betaβ-reduction can break structural equivalence of DBI form, which can make an annoying bug in those systems based on LC. The problem is lack-lifting. For example, a normal implementation of β\betaβ-reduction would simply make λ(λλ2 0) become λλ2. However, the same form directly converted from λx.λz.z would become λλ1, and λλ2 will be considered as different value as λλ1 since 1 is not 2. We can introduce another renaming for these, but if we can fix it in β\betaβ-reduction, why need another phase?

Implementation

(: β-reduction (->* (bterm)
                    (Integer (Immutable-HashTable Integer bterm))
                    bterm))
(define (β-reduction t [de-bruijn-level 0] [subst (make-immutable-hash '())])
  (match t
    [(b:λ body) (b:λ (β-reduction body (add1 de-bruijn-level) subst))]
    [(app t1 t2)
     (match t1
       [(b:λ body)
        (let ([reduced-term (β-reduction body (add1 de-bruijn-level)
                                         (hash-set subst de-bruijn-level t2))])
          ;;; dbi lifting by replace reduced-term (+ 1 dbi) with (var dbi)
          (β-reduction reduced-term de-bruijn-level
                       (hash-set subst (add1 de-bruijn-level) de-bruijn-level)))]
       [_ (raise "cannot do application on non-lambda term")])]
    [i (hash-ref subst i (λ () t))]))

We have to record level independently since it has no relation with the substitution map this time. For variables, all need to do is apply substitution map to get value, if not, use origin form as a result. For the lambda, increase level is the only thing. For application, it's complicated. We need to be more careful with it. It contains three major parts:

  1. check t1 is an abstraction(a.k.a lambda)
  2. do β\betaβ-reduction by add stuff into substitution map, substitution map will be used for index substitution
  3. DBI lifting(for example, λλ2 should become λλ1 after expanded at caller part)

Examples

Now you can play around above program

(convert (t:λ 'x (t:λ 'y (t:λ 'z 'x))))
(convert (t:λ 'x (t:λ 'y (t:λ 'z 'y))))
(convert (t:λ 'x (t:λ 'y (t:λ 'z 'z))))
(convert (t:λ 'f
              (app
               (t:λ 'x (app 'f (app 'x 'x)))
               (t:λ 'x (app 'f (app 'x 'x))))))
(convert (app (t:λ 'x 'x) 'y) (make-immutable-hash '((y . 0))))
(β-reduction (convert (app (t:λ 'x 'x) 'y) (make-immutable-hash '((y . 0)))))

Conclusion

DBI is a quite useful technology when implementing complicated AST conversion. It's not just easier to avoid rename conflicting, but also a less memory required form for implementations. I hope you enjoy the article and have a nice day, if this even really helps you in a real task, would be awesome!


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK