A C++ API for Vega-Lite

In this post, we present the first public release of XVega, a C++ library for producing Vega-Lite charts.

Data science workflows differ from traditional software development in that engineers make use of available tools to explore and reason about a problem. In such exploratory work, engineers load data, crunch numbers, produce simple visualizations and iterate… Progress happens in quick incremental iterations, which is possible when tooling does not get in the way.

This kind of interactive computing is generally associated with the Python or R programming languages. However, with the advent of the Cling C++ interpreter from CERN, and the subsequent development of the xeus-cling Jupyter kernel, new possibilities have opened up in this space.

The Jupyter stack, which started in the scientific Python community, has evolved into a language-agnostic framework that C++ developers can now leverage. It bridges the gap between the Jupyter ecosystem and the countless scientific computing libraries and tools available in C++.

The scientific C++ stack now includes numerous projects, such as xtensor and xframe. However, there is little support for visualization, especially for interactive plots. Libraries such as matplotlib-cpp and matplotplusplus do exist, with plotting APIs that resemble the original matplotlib library, but they share its drawbacks: an imperative API and the confusion caused by its dual object-oriented and state-based interfaces.

Given these shortcomings, and given that JupyterLab already supports Vega and Vega-Lite charts (through the application/vnd.vegalite.v3+json MIME type), one can leverage this support to bridge the gap rather than reinvent the wheel. Beyond standalone use, such a system could also be integrated into other projects such as xeus-SQLite.

The main idea is to programmatically fill in a JSON document that conforms to the Vega-Lite specification and respects the notion of a grammar of graphics. This is analogous to what Altair did for Python. XVega exposes separate APIs, each responsible for filling in a specific part of the JSON.
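To make the idea of "filling in a JSON" concrete, here is a minimal sketch that uses only the standard library. The helpers quoted, channel and make_spec are hypothetical names invented for this illustration and are not part of XVega, and the data URL in the usage note is a placeholder. The sketch only shows the shape of the Vega-Lite document that ends up being emitted.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not XVega API): wraps text in JSON double quotes.
// Assumption: the text contains no characters that would need escaping.
std::string quoted(const std::string& text)
{
    assert(text.find('"') == std::string::npos);
    return "\"" + text + "\"";
}

// Hypothetical helper: one encoding channel such as "x" or "y".
std::string channel(const std::string& name,
                    const std::string& field,
                    const std::string& type)
{
    return quoted(name) + ": {\"field\": " + quoted(field)
           + ", \"type\": " + quoted(type) + "}";
}

// Assembles a Vega-Lite spec from its three parts: data, mark and encoding.
std::string make_spec(const std::string& data_url,
                      const std::string& mark,
                      const std::string& encodings)
{
    return "{\"data\": {\"url\": " + quoted(data_url) + "}, "
           "\"mark\": " + quoted(mark) + ", "
           "\"encoding\": {" + encodings + "}}";
}
```

Calling make_spec("data/cars.json", "point", channel("x", "Horsepower", "quantitative")) yields a single JSON object with data, mark and encoding keys. A real library would build this with a JSON type rather than string concatenation; the point here is only which part of the document each piece fills in.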

The fundamentals of XVega remain the same as in Vega-Lite: the three essential elements of a chart are Data, Marks and Encodings. Importing the library takes just two statements:

#include "xvega/xvega.hpp"
using namespace xv;

The experience is similar to what Altair offers; hence, the central piece of the library is the Chart() object, which knows how to emit the JSON dictionary representing the data and visualization encodings.

For those unfamiliar with the Vega ecosystem, here is a quick recap of these terms:

Marks — What graphic should represent the data?
Encodings — Mapping between Data and Visual Elements of the Chart (such as x-axis, etc.).
Encoding Types: Quantitative (real-valued), Nominal (unordered categorical), Ordinal (ordered categorical), Temporal (time-series).
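The four encoding types listed above map onto the lowercase strings that Vega-Lite expects in a channel's "type" property. The sketch below models them as a small enumeration; EncodingType and type_name are hypothetical names chosen for this example and are not taken from XVega.

```cpp
#include <cassert>
#include <string>

// Hypothetical enumeration (not XVega API) of the four encoding types.
enum class EncodingType { Quantitative, Nominal, Ordinal, Temporal };

// Returns the string Vega-Lite uses for the "type" property of a channel.
std::string type_name(EncodingType type)
{
    switch (type)
    {
        case EncodingType::Quantitative: return "quantitative"; // real-valued
        case EncodingType::Nominal:      return "nominal";      // unordered categories
        case EncodingType::Ordinal:      return "ordinal";      // ordered categories
        case EncodingType::Temporal:     return "temporal";     // dates and times
    }
    assert(false && "unknown encoding type");
    return "";
}
```

Using an enumeration rather than free-form strings lets the compiler catch a misspelled type before a malformed spec ever reaches the renderer.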

Basic usage of XVega showcasing the essential elements — Data, Marks and Encodings.

The core strength of such a system is the separation of specification and execution. The declarative API makes it easy to specify “what” should be done rather than focus on the incidental details of the “how”. For example, rather than calling a special “hist()” function to plot a histogram, passing “bin=True” does the job.

Simply stating bin=True bins the x-axis giving us the Histogram directly — without using a dedicated function.
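At the level of the emitted JSON, a histogram is just an x channel with "bin": true paired with a y channel that aggregates by count. The sketch below illustrates that; Channel, to_json and histogram_encoding are hypothetical names for this example (not XVega's API), and the field name in the usage note is a placeholder.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of one encoding channel (not XVega API).
struct Channel
{
    std::string field;      // column name; empty for a pure aggregate
    std::string type;       // e.g. "quantitative"
    bool bin;               // true asks Vega-Lite to bin the field
    std::string aggregate;  // e.g. "count"; empty means no aggregation
};

// Serializes a channel, emitting only the properties that are set.
std::string to_json(const Channel& c)
{
    assert(!c.type.empty());
    std::string out = "{";
    if (!c.field.empty())     out += "\"field\": \"" + c.field + "\", ";
    if (c.bin)                out += "\"bin\": true, ";
    if (!c.aggregate.empty()) out += "\"aggregate\": \"" + c.aggregate + "\", ";
    out += "\"type\": \"" + c.type + "\"}";
    return out;
}

// A histogram needs no dedicated function: a binned x plus a count on y.
std::string histogram_encoding(const std::string& field)
{
    Channel x{field, "quantitative", true, ""};
    Channel y{"", "quantitative", false, "count"};
    return "{\"x\": " + to_json(x) + ", \"y\": " + to_json(y) + "}";
}
```

For instance, histogram_encoding("IMDB_Rating") produces an encoding block whose x channel carries "bin": true and whose y channel carries "aggregate": "count". Switching the bin flag off turns the same description back into an ordinary channel, which is the declarative point made above.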

We can, of course, customize the binning parameters with a “Bin()” object instead. And while we are doing that, let’s add a colour encoding as well to get a sense of a third dimension.

More control can be achieved using a custom Bin() object — used to set the binning parameters.
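In Vega-Lite terms, replacing the boolean with an object means that "bin" becomes a dictionary of parameters such as maxbins, and the colour encoding is simply one more channel. The following sketch shows the resulting JSON fragments; BinParams, binned_x, color_by and encoding_block are hypothetical names for this illustration and do not reproduce XVega's Bin() class.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a binning-parameters object (not XVega API).
struct BinParams
{
    int maxbins;  // upper bound on the number of bins
};

// x channel whose "bin" property is an object instead of a boolean.
std::string binned_x(const std::string& field, const BinParams& bin)
{
    assert(bin.maxbins > 0);
    return "\"x\": {\"field\": \"" + field + "\", \"bin\": {\"maxbins\": "
           + std::to_string(bin.maxbins) + "}, \"type\": \"quantitative\"}";
}

// Colour channel that splits the bars by a categorical field.
std::string color_by(const std::string& field)
{
    return "\"color\": {\"field\": \"" + field + "\", \"type\": \"nominal\"}";
}

// Full encoding block: binned x, count on y, colour as the extra dimension.
std::string encoding_block(const std::string& x_field,
                           const BinParams& bin,
                           const std::string& color_field)
{
    return "{" + binned_x(x_field, bin)
           + ", \"y\": {\"aggregate\": \"count\", \"type\": \"quantitative\"}, "
           + color_by(color_field) + "}";
}
```

A call such as encoding_block("Horsepower", BinParams{20}, "Origin") uses placeholder field names; any quantitative column and any categorical column would do.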

Another advantage of Vega-Lite is that transformations can be expressed within the specification itself rather than applied to the data beforehand (e.g., one can do linear regression as part of this declarative API).

Usage of layering and transformations in XVega.
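As a rough illustration of what layering plus a transformation looks like in the emitted specification, the sketch below puts a scatter layer and a regression-line layer under one "layer" array that shares the top-level data. It assumes a Vega-Lite version that provides the regression transform. The function names and the data URL are hypothetical and are not XVega's API.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not XVega API): x/y encoding for two quantitative fields.
std::string xy_encoding(const std::string& x, const std::string& y)
{
    return "{\"x\": {\"field\": \"" + x + "\", \"type\": \"quantitative\"}, "
           "\"y\": {\"field\": \"" + y + "\", \"type\": \"quantitative\"}}";
}

// Layer that fits y on x inside the spec and draws the result as a line.
std::string regression_layer(const std::string& x, const std::string& y)
{
    return "{\"transform\": [{\"regression\": \"" + y + "\", \"on\": \"" + x + "\"}], "
           "\"mark\": \"line\", \"encoding\": " + xy_encoding(x, y) + "}";
}

// Two layers sharing the same data: raw points plus the fitted line.
std::string layered_spec(const std::string& data_url,
                         const std::string& x,
                         const std::string& y)
{
    assert(!x.empty() && !y.empty());
    const std::string points =
        "{\"mark\": \"point\", \"encoding\": " + xy_encoding(x, y) + "}";
    return "{\"data\": {\"url\": \"" + data_url + "\"}, "
           "\"layer\": [" + points + ", " + regression_layer(x, y) + "]}";
}
```

Because the regression is declared in the spec, the C++ side never computes the fit; the renderer does it when the chart is displayed.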

Lastly, adding interactions and selections is straightforward. It’s as simple as defining which selection to use and adding it to the Chart() object.

Zooming and Panning along with Tooltips using Interval Selection in XVega
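In the v3/v4 spec format, zooming and panning come from an interval selection bound to the scales, and tooltips from a flag on the mark definition (newer Vega-Lite versions express selections differently, via parameters). The sketch below shows that JSON shape; zoom_pan_selection and interactive_spec are hypothetical names for this illustration, not XVega's API, and the selection name "grid" is arbitrary.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not XVega API): a named interval selection that is
// bound to the scales, which is what enables zooming and panning.
std::string zoom_pan_selection(const std::string& name)
{
    assert(!name.empty());
    return "{\"" + name + "\": {\"type\": \"interval\", \"bind\": \"scales\"}}";
}

// Spec with a tooltip-enabled point mark and the selection attached.
// The encoding argument is expected to be an already serialized JSON object.
std::string interactive_spec(const std::string& data_url,
                             const std::string& encoding)
{
    return "{\"data\": {\"url\": \"" + data_url + "\"}, "
           "\"mark\": {\"type\": \"point\", \"tooltip\": true}, "
           "\"selection\": " + zoom_pan_selection("grid") + ", "
           "\"encoding\": " + encoding + "}";
}
```

The selection is declared once and attached to the chart, which mirrors the "define it, then add it" workflow described above.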
