JSON Changelog with SQLite
source link: https://www.tuicool.com/articles/hit/zm26Rzz
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Sometimes you want a log of all changes to the data in your database (every INSERT, UPDATE and DELETE). In Buckets’ case, such a log will be used to help merge budget files between computers/phones.
In this post, I’ll show you one method for adding change tracking to your SQLite database. Someone has probably already done it this way, but I couldn’t find it.
Do you need it?
Before you start tracking every single change in your database, decide if you really need it. You might be able to get away with tracking specific columns, for instance.
The Data
I have a people
table created with the following SQL:
CREATE TABLE people ( id INTEGER PRIMARY KEY, created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, name TEXT, age INTEGER );
SQLite + JSON
SQLite includes some nice JSON functions . Before showing you how to use them to make a change log, here’s how some of the functions work on their own:
json_array
sqlite> select json_array(1, "hello", "world") as data; data ------------------- [1,"hello","world"]
json_extract
sqlite> select json_extract('{"foo": "bar"}', '$.foo') as data; data ---------- bar
json_group_object
This is an aggregate function.
sqlite> create temporary table foo (a, b); sqlite> insert into foo (a, b) values ("Alice", 30), ("Bob", 42); sqlite> select json_group_object(a, b) as data from foo; data ----------------------- {"Alice":30,"Bob":42}
json_each
This is a table-value function (it produces a table).
sqlite> select * from json_each(json_array("apple","banana","cow")); key value type atom id parent fullkey path ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- 0 apple text apple 1 $[0] $ 1 banana text banana 2 $[1] $ 2 cow text cow 3 $[2] $
JSON Change Logs
Using the above functions, you can make change logs!
Version 1 - Track Everything
The following change log table tracks all changes to all columns:
-- Data table CREATE TABLE people ( id INTEGER PRIMARY KEY, created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, name TEXT, age INTEGER ); -- Change log table CREATE TABLE change_log ( id INTEGER PRIMARY KEY, created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, action TEXT, table_name TEXT, obj_id INTEGER, changes TEXT ); -- Insert Trigger CREATE TRIGGER people_track_insert AFTER INSERT ON people BEGIN INSERT INTO change_log (action, table_name, obj_id, changes) SELECT 'INSERT', 'people', NEW.id, changes FROM (SELECT json_group_object(col, json_array(oldval, newval)) AS changes FROM (SELECT json_extract(value, '$[0]') as col, json_extract(value, '$[1]') as oldval, json_extract(value, '$[2]') as newval FROM json_each( json_array( json_array('id', null, NEW.id), json_array('created', null, NEW.created), json_array('name', null, NEW.name), json_array('age', null, NEW.age) ) ) ) ); END; -- Update Trigger CREATE TRIGGER people_track_update AFTER UPDATE ON people BEGIN INSERT INTO change_log (action, table_name, obj_id, changes) SELECT 'UPDATE', 'people', OLD.id, changes FROM (SELECT json_group_object(col, json_array(oldval, newval)) AS changes FROM (SELECT json_extract(value, '$[0]') as col, json_extract(value, '$[1]') as oldval, json_extract(value, '$[2]') as newval FROM json_each( json_array( json_array('id', OLD.id, NEW.id), json_array('created', OLD.created, NEW.created), json_array('name', OLD.name, NEW.name), json_array('age', OLD.age, NEW.age) ) ) ) ); END; -- Delete Trigger CREATE TRIGGER people_track_delete AFTER DELETE ON people BEGIN INSERT INTO change_log (action, table_name, obj_id, changes) SELECT 'DELETE', 'people', OLD.id, changes FROM (SELECT json_group_object(col, json_array(oldval, newval)) AS changes FROM (SELECT json_extract(value, '$[0]') as col, json_extract(value, '$[1]') as oldval, json_extract(value, '$[2]') as newval FROM json_each( json_array( json_array('id', OLD.id, null), json_array('created', OLD.created, null), json_array('name', OLD.name, null), json_array('age', OLD.age, null) ) ) ) ); END;
sqlite> INSERT INTO people (name, age) VALUES ('Alice', 30), ('Bob', 42); sqlite> UPDATE people SET age = age + 2; sqlite> UPDATE people SET name = 'Eva' WHERE name='Alice'; sqlite> DELETE FROM people WHERE name = 'Bob'; sqlite> SELECT * FROM change_log; id created action table_name obj_id changes --- ---------- ---------- ---------- ------ -------------------------------------------------------------------------------- 1 2018-08-27 INSERT people 1 {"id":[null,1],"created":[null,"2018-08-27 21:53:02"],"name":[null,"Alice"],"age 2 2018-08-27 INSERT people 2 {"id":[null,2],"created":[null,"2018-08-27 21:53:02"],"name":[null,"Bob"],"age": 3 2018-08-27 UPDATE people 1 {"id":[1,1],"created":["2018-08-27 21:53:02","2018-08-27 21:53:02"],"name":["Ali 4 2018-08-27 UPDATE people 2 {"id":[2,2],"created":["2018-08-27 21:53:02","2018-08-27 21:53:02"],"name":["Bob 5 2018-08-27 UPDATE people 1 {"id":[1,1],"created":["2018-08-27 21:53:02","2018-08-27 21:53:02"],"name":["Ali 6 2018-08-27 DELETE people 2 {"id":[2,null],"created":["2018-08-27 21:53:02",null],"name":["Bob",null],"age":
Version 2 - Only Track Changes
There’s a lot of duplicate information in the above version (for instance, the created timestamp never changes after INSERT but is recorded twice for every UPDATE). This version improves on the other by only recording values that have changed.
-- Data table CREATE TABLE people ( id INTEGER PRIMARY KEY, created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, name TEXT, age INTEGER ); -- Change log table CREATE TABLE change_log ( id INTEGER PRIMARY KEY, created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, action TEXT, table_name TEXT, obj_id INTEGER, changes TEXT ); -- Insert Trigger CREATE TRIGGER people_track_insert AFTER INSERT ON people BEGIN INSERT INTO change_log (action, table_name, obj_id, changes) SELECT 'INSERT', 'people', NEW.id, changes FROM (SELECT json_group_object(col, json_array(oldval, newval)) AS changes FROM (SELECT json_extract(value, '$[0]') as col, json_extract(value, '$[1]') as oldval, json_extract(value, '$[2]') as newval FROM json_each( json_array( json_array('id', null, NEW.id), json_array('created', null, NEW.created), json_array('name', null, NEW.name), json_array('age', null, NEW.age) ) ) WHERE oldval IS NOT newval ) ); END; -- Update Trigger CREATE TRIGGER people_track_update AFTER UPDATE ON people BEGIN INSERT INTO change_log (action, table_name, obj_id, changes) SELECT 'UPDATE', 'people', OLD.id, changes FROM (SELECT json_group_object(col, json_array(oldval, newval)) AS changes FROM (SELECT json_extract(value, '$[0]') as col, json_extract(value, '$[1]') as oldval, json_extract(value, '$[2]') as newval FROM json_each( json_array( json_array('id', OLD.id, NEW.id), json_array('created', OLD.created, NEW.created), json_array('name', OLD.name, NEW.name), json_array('age', OLD.age, NEW.age) ) ) WHERE oldval IS NOT newval ) ); END; -- Delete Trigger CREATE TRIGGER people_track_delete AFTER DELETE ON people BEGIN INSERT INTO change_log (action, table_name, obj_id, changes) SELECT 'DELETE', 'people', OLD.id, changes FROM (SELECT json_group_object(col, json_array(oldval, newval)) AS changes FROM (SELECT json_extract(value, '$[0]') as col, json_extract(value, '$[1]') as oldval, json_extract(value, '$[2]') as newval FROM json_each( json_array( json_array('id', OLD.id, null), json_array('created', OLD.created, null), json_array('name', OLD.name, null), json_array('age', OLD.age, null) ) ) WHERE oldval IS NOT newval ) ); END;
sqlite> INSERT INTO people (name, age) VALUES ('Alice', 30), ('Bob', 42); sqlite> UPDATE people SET age = age + 2; sqlite> UPDATE people SET name = 'Eva' WHERE name='Alice'; sqlite> DELETE FROM people WHERE name = 'Bob'; sqlite> SELECT * FROM change_log; id created action table_name obj_id changes --- ---------- ---------- ---------- ------ -------------------------------------------------------------------------------- 1 2018-08-27 INSERT people 1 {"id":[null,1],"created":[null,"2018-08-27 21:53:25"],"name":[null,"Alice"],"age 2 2018-08-27 INSERT people 2 {"id":[null,2],"created":[null,"2018-08-27 21:53:25"],"name":[null,"Bob"],"age": 3 2018-08-27 UPDATE people 1 {"age":[30,32]} 4 2018-08-27 UPDATE people 2 {"age":[42,44]} 5 2018-08-27 UPDATE people 1 {"name":["Alice","Eva"]} 6 2018-08-27 DELETE people 2 {"id":[2,null],"created":["2018-08-27 21:53:25",null],"name":["Bob",null],"age":
Version 3 - Only Old Values
In Buckets, most records are inserted and never updated, and very few are deleted. So to save on space, there’s another optimization that favors this use case. Instead of storing both old and new values, only storing old values means that:
- no information is duplicated for INSERTS
- each change entry is smaller, too.
More application work is required to piece together a full change log but all the information is still available.
-- Data table CREATE TABLE people ( id INTEGER PRIMARY KEY, created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, name TEXT, age INTEGER ); -- Change log table CREATE TABLE change_log ( id INTEGER PRIMARY KEY, created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, action TEXT, table_name TEXT, obj_id INTEGER, oldvals TEXT ); -- Insert Trigger CREATE TRIGGER people_track_insert AFTER INSERT ON people BEGIN INSERT INTO change_log (action, table_name, obj_id) VALUES ('INSERT', 'people', NEW.id); END; -- Update Trigger CREATE TRIGGER people_track_update AFTER UPDATE ON people BEGIN INSERT INTO change_log (action, table_name, obj_id, oldvals) SELECT 'UPDATE', 'people', OLD.id, changes FROM (SELECT json_group_object(col, oldval) AS changes FROM (SELECT json_extract(value, '$[0]') as col, json_extract(value, '$[1]') as oldval, json_extract(value, '$[2]') as newval FROM json_each( json_array( json_array('id', OLD.id, NEW.id), json_array('created', OLD.created, NEW.created), json_array('name', OLD.name, NEW.name), json_array('age', OLD.age, NEW.age) ) ) WHERE oldval IS NOT newval ) ); END; -- Delete Trigger CREATE TRIGGER people_track_delete AFTER DELETE ON people BEGIN INSERT INTO change_log (action, table_name, obj_id, oldvals) SELECT 'DELETE', 'people', OLD.id, changes FROM (SELECT json_group_object(col, oldval) AS changes FROM (SELECT json_extract(value, '$[0]') as col, json_extract(value, '$[1]') as oldval, json_extract(value, '$[2]') as newval FROM json_each( json_array( json_array('id', OLD.id, null), json_array('created', OLD.created, null), json_array('name', OLD.name, null), json_array('age', OLD.age, null) ) ) WHERE oldval IS NOT newval ) ); END;
sqlite> INSERT INTO people (name, age) VALUES ('Alice', 30), ('Bob', 42); sqlite> UPDATE people SET age = age + 2; sqlite> UPDATE people SET name = 'Eva' WHERE name='Alice'; sqlite> DELETE FROM people WHERE name = 'Bob'; sqlite> SELECT * FROM change_log; id created action table_name obj_id oldvals --- ---------- ---------- ---------- ------ -------------------------------------------------------------------------------- 1 2018-08-27 INSERT people 1 2 2018-08-27 INSERT people 2 3 2018-08-27 UPDATE people 1 {"age":30} 4 2018-08-27 UPDATE people 2 {"age":42} 5 2018-08-27 UPDATE people 1 {"name":"Alice"} 6 2018-08-27 DELETE people 2 {"id":2,"created":"2018-08-27 21:53:53","name":"Bob","age":44}
You can also query the changelog using JSON functions. Here’s every change involving the age
field:
sqlite> SELECT * FROM change_log WHERE json_type(oldvals, '$.age') IS NOT NULL; id created action table_name obj_id oldvals --- ---------- ---------- ---------- ------ -------------------------------------------------------------------------------- 3 2018-08-27 UPDATE people 1 {"age":30} 4 2018-08-27 UPDATE people 2 {"age":42} 6 2018-08-27 DELETE people 2 {"id":2,"created":"2018-08-27 22:02:23","name":"Bob","age":44}
Using sqlite_master and table_info
To make sure I never miss a column, I use sqlite_master
and PRAGMA table_info(TABLENAME)
to generate the trigger SQL as in this pseudo code:
rows = Query("SELECT name FROM sqlite_master WHERE type='table'") for row in rows: columns = Query("PRAGMA table_info(" + row.name + ")") # generate SQL using this table and column list
— Matt
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK