5

A brief look at the internal structure of Clojure Zippers

 2 years ago
source link: https://blog.jakubholy.net/2019/briefly-internal-structure-of-clojure-zippers/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

A brief look at the internal structure of Clojure Zippers

December 24, 2019

Clojure Zippers is a library for navigating and modifying tree data structures. While refactoring Cryogen, I needed an operation not supported out of the box (the removal of all nodes to the right of the current one) and thus had to learn a bit about the internal structure of zippers. I record it here for posterity.

A zipper - called loc throughout the code - is simply a tuple of [current-node path-and-state-info] with some metadata. You create a zipper/loc using one of the built-in builders such as xml-zip (if inner node = map, children at :content), vector-zip, seq-zip or using zipper providing your own branch?, children, and make-node functions. And you get the tree back via (node (root loc)).

Remember the distinction between a node and a loc, it will be important to know which one you are working with and different functions return either the one or the other. (You can get the node from a loc using (node loc).) Occasionally I will also use path to refer to the second member of loc, i.e. the path-and-state-info above.

Assume (require '[clojure.zip :refer :all]) when reading code snippets here.

In a loc, current-node is the actual node in the original tree (e.g. a map in a xml-zip tree) and path-and-state-info is a map containing among others:

  • :l - a vector of nodes to the left (leftmost = first; conj to add to end)

  • :r - a vector (or, sometimes, seq?) of nodes to the right (closest = first, rightmost = last)

  • :pnodes (available via (path loc) - a seq of nodes leading to this loc from the root

  • :ppath - parent’s path (= (second (up loc)))

  • changed? - whether the node has been changed (e.g. by insert-left) - used during (up), see below

Let’s see how these are used in a few key functions.

Code examples

up

up (used e.g. by root) is crucial because that is the place where nodes are updated to reflect any changes made to the tree - the parent loc and node are replaced using make-node based on the (possibly modified) :l, node, and :r to reflect any changes to its children:

(defn up
  "Returns the loc of the parent of the node at this loc, or nil if at
  the top"
  {:added "1.0"}
  [loc]
    (let [[node {l :l, ppath :ppath, pnodes :pnodes r :r, changed? :changed?, :as path}] loc]
      (when pnodes
        (let [pnode (peek pnodes)]
          (with-meta (if changed?
                       [(make-node loc pnode (concat l (cons node r)))
                        (and ppath (assoc ppath :changed? true))]
                       [pnode ppath])
                     (meta loc))))))

xml-zip

Create a zipper from a tree with map nodes and children being a vector under :content:

(defn xml-zip
  "Returns a zipper for xml elements (as from xml/parse),
  given a root element"
  {:added "1.0"}
  [root]
    (zipper (complement string?) ; branch?
            (comp seq :content)  ; children
            (fn [node children]  ; make-node
              (assoc node :content (and children (apply vector children))))
            root))

Notes

Functions that change the tree operate mostly on the loc and change its :l, :r etc. as appropriate; the change gets reflected to the node only during up - e.g. insert-left/right. Functions that operate on the children such as insert-child change the parent node (and loc) at once.

See also

  • tupelo.forest is a library for searching & manipulating tree-like data structures (examples)

  • nathanmarz/specter - querying and manipulation of complex, nested data structures - including trees - made easy


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK