GitHub - ubolonton/emacs-tree-sitter at 0.4.0
source link: https://github.com/ubolonton/emacs-tree-sitter/tree/0.4.0#installation
emacs-tree-sitter
This is an Emacs Lisp binding for tree-sitter, an incremental parsing library. It requires Emacs 25.1 or above, built with dynamic module support.
It aims to be the foundation for a new breed of Emacs packages that understand code structurally. For example:
- Faster, fine-grained code highlighting.
- More flexible code folding.
- Structural editing (like Paredit, or even better) for non-Lisp code.
- More informative indexing for imenu.
The author of tree-sitter articulated its merits a lot better in this Strange Loop talk.
Installation
At this stage of the project, there are few end-user-visible features, but you can already install it to play around with the APIs.
- Check that Emacs was built with module support:

      (functionp 'module-load)

- Add the tree-sitter ELPA to package-archives (remember to run package-refresh-contents afterwards):

      (add-to-list 'package-archives '("ublt" . "https://elpa.ubolonton.org/packages/"))

- Install tree-sitter and tree-sitter-langs like other normal packages.
The package is not yet on MELPA, because MELPA currently doesn't have a convenient way to distribute packages with pre-compiled binaries.
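The installation steps above can be collected into an init-file sketch. The function name my/tree-sitter-install is hypothetical and not part of the package; the archive URL and package names are the ones given above. The sketch assumes package.el is available, and it only contacts the network when the function is called.

```elisp
(require 'package)
(require 'seq)

;; Register the archive named above; `add-to-list' skips duplicates.
(add-to-list 'package-archives
             '("ublt" . "https://elpa.ubolonton.org/packages/"))

(defun my/tree-sitter-install ()
  "Install tree-sitter and tree-sitter-langs if they are missing.
Hypothetical helper, not part of the package."
  (interactive)
  ;; Dynamic modules are required; stop early on builds without them.
  (unless (fboundp 'module-load)
    (user-error "This Emacs was built without dynamic module support"))
  (let ((missing (seq-remove #'package-installed-p
                             '(tree-sitter tree-sitter-langs))))
    (when missing
      (package-refresh-contents)
      (mapc #'package-install missing))))
```

Run it once with M-x my/tree-sitter-install; on later calls it does nothing if both packages are already installed.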
If you want to hack on emacs-tree-sitter itself, see the section Setup for Development instead.
Basic Usage
- Enable the tree-sitter minor mode in a supported major mode (defined in tree-sitter-major-mode-language-alist):

      (require 'tree-sitter)
      (require 'tree-sitter-langs)
      (add-hook 'rust-mode-hook #'tree-sitter-mode)

- Show the debug view of a buffer's parse tree:

      (require 'tree-sitter-debug)
      (tree-sitter-debug-enable)

- Get names of all functions in a Rust file:

      (with-current-buffer "types.rs"
        (seq-map (lambda (capture)
                   (pcase-let ((`(_ . ,node) capture))
                     (ts-node-text node)))
                 (tree-sitter-query [(function_item (identifier) @name)])))

- Write a simple extension to expand-region:

      (defun tree-sitter-mark-next-bigger-node ()
        (interactive)
        (let* ((p (point))
               (m (if mark-active (mark) p))
               (beg (min p m))
               (end (max p m))
               (root (ts-root-node tree-sitter-tree))
               (node (ts-get-named-descendant-for-position-range root beg end))
               (node-beg (ts-node-start-position node))
               (node-end (ts-node-end-position node)))
          ;; Already marking current node. Try its parent node instead.
          (when (and (= beg node-beg) (= end node-end))
            (when-let ((node (ts-get-parent node)))
              (setq node-beg (ts-node-start-position node)
                    node-end (ts-node-end-position node))))
          (set-mark node-end)
          (goto-char node-beg)))

- Parse a string:

      (let ((parser (ts-make-parser)))
        (ts-set-language parser (tree-sitter-require 'rust))
        (ts-parse-string parser "fn foo() {}"))

- The tree-sitter doc is a good read to understand its concepts, and how to use the parsers in general.
- Functions in this package are named differently, to be more Lisp-idiomatic. The overall parsing flow stays the same.
- Documentation for individual functions can be viewed with C-h f (describe-function), as usual.
- A symbol in the C API is actually the ID of a type, so it's called type-id in this package.
Types
- language, parser, tree, node, cursor, query: corresponding tree-sitter types, embedded in user-ptr objects.
- point: a pair of (LINE-NUMBER . BYTE-COLUMN).
  - LINE-NUMBER is the absolute line number returned by line-number-at-pos, counting from 1.
  - BYTE-COLUMN counts from 0, like current-column. However, unlike that function, it counts bytes, instead of displayed glyphs.
- range: a vector in the form of [START-BYTEPOS END-BYTEPOS START-POINT END-POINT].
These types are understood only by this package. They are not recognized by type-of, but have corresponding type-checking predicates, which are useful for debugging: ts-language-p, ts-tree-p, ts-node-p...
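As an illustration of the point type, the sketch below computes a (LINE-NUMBER . BYTE-COLUMN) pair for a buffer position using only built-in Emacs functions. The name my/ts-point-at is hypothetical; the package may well provide its own conversion, which this does not replace.

```elisp
(defun my/ts-point-at (pos)
  "Return (LINE-NUMBER . BYTE-COLUMN) for buffer position POS.
Hypothetical helper following the conventions described above."
  (save-excursion
    (save-restriction
      ;; Widen so the line number is absolute, even in a narrowed buffer.
      (widen)
      (goto-char pos)
      (cons (line-number-at-pos)
            ;; Bytes between the start of the line and POS, counted from 0.
            (- (position-bytes pos)
               (position-bytes (line-beginning-position)))))))
```

In a buffer containing the two lines "héllo" and "wörld", calling (my/ts-point-at 9) on the "r" of the second line returns (2 . 3): the character column is 2, but "ö" occupies two bytes in Emacs's internal encoding, so the byte column is 3.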
For consistency with Emacs's conventions, this binding differs from tree-sitter's C/Rust APIs in two ways:
- It uses 1-based byte positions, not 0-based byte offsets.
- It uses 1-based line numbers, not 0-based row coordinates.
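To make the two conventions concrete, here is a sketch that converts between 0-based values, as used by the C/Rust APIs, and the 1-based values this binding uses. All four function names are hypothetical; only position-bytes and byte-to-position are built-in Emacs functions.

```elisp
(defun my/ts-offset-to-bytepos (offset)
  "Convert a 0-based byte OFFSET to a 1-based byte position."
  (1+ offset))

(defun my/ts-row-to-line-number (row)
  "Convert a 0-based ROW to a 1-based line number."
  (1+ row))

(defun my/ts-offset-to-position (offset)
  "Return the buffer position (in characters) for a 0-based byte OFFSET."
  (byte-to-position (my/ts-offset-to-bytepos offset)))

(defun my/ts-position-to-offset (pos)
  "Return the 0-based byte offset of buffer position POS."
  (1- (position-bytes pos)))
```

Byte positions and buffer positions coincide only in pure-ASCII text, which is why the last two helpers go through byte-to-position and position-bytes instead of adding or subtracting 1 from a character position.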
Functions
- Language:
  - tree-sitter-require: like require, for tree-sitter languages.
- Parser:
  - ts-make-parser: create a new parser.
  - ts-set-language: set a parser's active language.
  - ts-parse-string: parse a string.
  - ts-parse: parse with a text-generating callback.
  - ts-set-included-ranges: set sub-ranges when parsing multi-language text.
- Tree:
  - ts-root-node: get the tree's root node.
  - ts-edit-tree: prepare a tree for incremental parsing.
  - ts-changed-ranges: compare 2 trees for changes.
  - ts-tree-to-sexp: debug utility.
- Cursor:
  - ts-make-cursor: obtain a new cursor from either a tree or a node.
  - ts-goto- functions: move to a different node.
  - ts-current- functions: get the current field/node.
- Node:
  - ts-node- functions: node's properties and predicates.
  - ts-get- functions: get related nodes (parent, siblings, children, descendants).
  - ts-count- functions: count child nodes.
  - ts-mapc-children: loops through child nodes.
  - ts-node-to-sexp: debug utility.
- Query:
  - ts-make-query: create a new query.
  - ts-make-query-cursor: create a new query cursor.
  - ts-query-matches, ts-query-captures: execute a query, returning matches/captures.
  - ts-set-byte-range, ts-set-point-range: limit query execution to a range.
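The parser and tree functions listed above compose into a short pipeline. The sketch below, with the hypothetical name my/ts-rust-sexp, parses a Rust string and returns the debug S-expression of the resulting tree. It assumes tree-sitter and tree-sitter-langs are installed, and that ts-tree-to-sexp takes the tree as its only argument; use C-h f to confirm the signature in your installed version.

```elisp
(defun my/ts-rust-sexp (source)
  "Parse SOURCE as Rust and return the tree's debug S-expression.
Hypothetical helper; loads the tree-sitter packages when first called."
  (require 'tree-sitter)
  (require 'tree-sitter-langs)
  (let ((parser (ts-make-parser)))
    ;; Same flow as the \"Parse a string\" example: language, then parse.
    (ts-set-language parser (tree-sitter-require 'rust))
    (ts-tree-to-sexp (ts-parse-string parser source))))
```

For example, (my/ts-rust-sexp "fn foo() {}") can be evaluated in the scratch buffer to inspect how the grammar names its nodes before writing a query.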
Setup for Development
Clone this repo and add its lisp and langs directories to load-path.
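A minimal sketch of that load-path setup, assuming the repository was cloned to ~/src/emacs-tree-sitter (a hypothetical location; point my/tree-sitter-repo at wherever your clone lives):

```elisp
(defvar my/tree-sitter-repo (expand-file-name "~/src/emacs-tree-sitter")
  "Hypothetical location of the emacs-tree-sitter clone.")

;; Put the repo's lisp and langs directories on `load-path'.
(dolist (dir '("lisp" "langs"))
  (add-to-list 'load-path (expand-file-name dir my/tree-sitter-repo)))
```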
If you want to hack on the high-level features (in Lisp) only:
- Evaluate this (once) to download the necessary binaries:

      (require 'tree-sitter-langs-build)
      ;; Download pre-compiled `tree-sitter-dyn'.
      (tree-sitter-download-dyn-module)
      ;; Download pre-compiled language grammars.
      (tree-sitter-langs-install)

- Make changes to the .el files.
- Add tests to tree-sitter-tests.el and run them with ./bin/test (.\bin\test on Windows).
If you want to build additional (or all) grammars from source, or work on the core dynamic module, see the next 2 sections.
Building grammars from source
- Install tree-sitter CLI tool (if you don't use NodeJS, you can download the binary directly from GitHub):

      # For yarn user
      yarn global add tree-sitter-cli
      # For npm user
      npm install -g tree-sitter-cli

- Run:

      # macOS/Linux
      make ensure/rust

      # Windows
      .\bin\ensure-lang rust

- You can modify tree-sitter-langs-repos if the language you need is not declared there.
Working on the dynamic module
- Install the Rust toolchain.
- Install clang, to generate the raw Rust binding for emacs-module.h.
- Build:

      # macOS/Linux
      make build

      # Windows
      .\bin\build

- Test:

      # macOS/Linux
      make test

      # Windows
      .\bin\test

- Continuously rebuild and test on change (requires cargo-watch):

      # macOS/Linux
      make watch

      # Windows
      .\bin\test watch

To test against a different version of Emacs, set the environment variable EMACS (e.g. EMACS=/snap/bin/emacs make test).
Overall Plan
Targeting lib authors:
- Write a guide on using the tree-sitter APIs.
Targeting end users:
- Pick a language, make a "killer" minor mode that extends its major mode in multiple ways.
- Make minor modes for most common languages.
- Extract common patterns from the language minor modes into helper language-diagnostic minor modes.
- Get a language major mode to use tree-sitter for optional features.
Alternative
Binding through C instead of Rust: https://github.com/karlotness/tree-sitter.el
Contribution
Contributions are welcomed. Please take a look at the issue list for ideas, or create a new issue to describe any idea you have for improvement.
Show respect and empathy towards others. Both technical empathy and general empathy are highly valued.