xj — HTML to JSON
source link: https://idiomdrottning.org/xj
This, xj, is a Unix filter that reads XML (or permissively parses HTML) and outputs JSON. Perfect for piping directly into jq, gron or json2tsv.
Usage
wget -qO- https://stedolan.github.io/jq/|xj|jq '..|select(.title?)[][]'
Installation
apt install chicken
chicken-install xj
Formal Semantics
Elements are objects with one key, the element name, and the value is an array with the children of the element, or an empty array if there aren’t any. (This is to disambiguate elements from text data.)
Iff there are any attributes, an attribute object is listed first among the children, disambiguated from the other children by having a “@” key. The attributes are not in a list; they can be accessed directly.
In XML, an element can have several children with the same name, and those children can in turn have children of their own. The same isn’t true for attributes, which is why they can have simpler semantics.
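To make the mapping concrete, here is a minimal Python sketch of the rules described above. It is not xj's implementation (xj is installed through chicken-install), and several details are assumptions rather than documented behavior: that text children appear as plain JSON strings, that whitespace-only text is dropped, and that the attribute object has the shape {"@": {name: value}}. The function name element_to_json is hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def element_to_json(elem):
    """Map one element to {name: [children...]}, following the rules above."""
    children = []
    if elem.attrib:
        # Assumed shape: all attributes sit in one object under the "@" key,
        # listed first among the children.
        children.append({"@": dict(elem.attrib)})
    if elem.text and elem.text.strip():
        # Assumption: text data is emitted as a plain string child.
        children.append(elem.text)
    for child in elem:
        children.append(element_to_json(child))
        if child.tail and child.tail.strip():
            # Text that follows a child element belongs to the parent.
            children.append(child.tail)
    # An element without children or attributes becomes {name: []}.
    return {elem.tag: children}


doc = '<p class="intro">Hello <b>world</b>!<br/></p>'
result = element_to_json(ET.fromstring(doc))
print(json.dumps(result))
# {"p": [{"@": {"class": "intro"}}, "Hello ", {"b": ["world"]}, "!", {"br": []}]}
```

Under these assumptions, an element is always an object and text is always a string, so a consumer such as jq can tell them apart by type, and the attributes of the p element are reachable directly as the "@" entry of its first child.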
Source code
git clone https://idiomdrottning.org/xj