2

Temporal Tables In SQL Server

 2 years ago
source link: https://dotnetcoretutorials.com/2021/12/11/temporal-tables-in-sql-server/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client
Temporal Tables In SQL Server

An under the radar feature introduced in SQL Server 2016 (And available in Azure SQL also) is Temporal Tables. Temporal tables allow you to keep a “history” of all data within a SQL table, in a separate (but connected) history table. In very basic terms, every time data in a SQL table is updated, a copy of the original state of the data is cloned into the history table.

The use cases of this are pretty obvious, but include :

  • Ability to query what the state of data was at a specific time
  • Ability to select multiple sets of data between two time periods
  • Viewing how data changes over time (For example, feeding into an analytics or machine learning model)
  • An off the shelf, easy to use, auditing solution for tracking what changed when
  • And finally, a somewhat basic, but still practical disaster recovery scenario for applications going haywire

A big reason for me doing this post is that EF Core 6 has just been released, and includes built in support for temporal tables. While this post will just be a quick intro in how temporal tables work, in the future I’ll be giving a brief intro on getting set up with Entity Framework too!

Getting Started

When creating a new table, it’s almost trivial to add in temporal tables. If I was to create a Person table with two columns, a first and last name, it would look something like so :

CREATE TABLE Person
(  
    [Id] int NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED, 
    FirstName NVARCHAR(250) NOT NULL,
    LastName NVARCHAR(250) NOT NULL , 
    -- The below is how we turn on Temporal. 
    [ValidFrom] datetime2 (0) GENERATED ALWAYS AS ROW START,
    [ValidTo] datetime2 (0) GENERATED ALWAYS AS ROW END,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
 )  
 WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.PersonHistory));

There is a couple of things to note here, the first is the last three lines of the CreateTable expression. We need to add the ValidFrom and ValidTo columns and the PERIOD line for everything to work nicely.

Second, it’s very important to note the HISTORY_TABLE statement. When I first started with temporal tables I assumed that there would be a naming convention along the lines of {{TableName}}History. But infact, if you don’t specify what the history table should be called, instead you just end up with a semi random generated name that doesn’t look great.

With this statement run, we end up with a table within a table when looking via SQL Management Studio. Something like so :

I will note that you can turn on temporal tables on an existing table too with an ALTER TABLE statement which is great for projects already on the go.

But here’s the most amazing part about all of this. Nothing about how you use a SQL table changes. For example, inserting a Person record is the same old insert statement as always :

INSERT INTO Person (FirstName, LastName)
VALUES ('Wade', 'Smith')

Our SQL statements for the most part don’t even have to know this is a temporal table at all. And that’s important because if we have an existing project, we aren’t going to run into consistency issues when trying to turn temporal tables on.

With the above insert statement, we end up with a record that looks like so :

The ValidFrom is the datetime we inserted, and obviously the ValidTo is set to maximum because for this particular record, it is valid for all of time (That will become important shortly).

Our PersonHistory table at this point is still empty. But let’s change that! Let’s do an Update statement like so :

UPDATE Person
SET LastName = 'G'
WHERE FirstName = 'Wade'

If we check our Person table, it looks largely the same as before, our ValidFrom date has shifted forward and Wade’s last name is G. But if we check our PersonHistory table :

We now have a record in here that tells us that between our two datetimes, the record with ID 1 had a last name of Smith.

Again, our calling code that updates our Person record doesn’t even have to know that temporal tables is turned on, and everything just works like clockwork encapsulated with SQL Server itself.

Modifying Table Structure

I wanted to point out another real convenience with temporal tables that you might not get if you decided to roll your own history table. After a table creation, what happens if you wanted to add a column to the table?

For example, let’s take our Person table and add a DateOfBirth column.

ALTER TABLE Person
ADD DateOfBirth DATE NULL

You’ll notice that I am only altering the Person table, and not touching the PersonHistory table. That’s because temporal tables automatically handle the alter table statements to also modify the underlying history table. So if I run the above, my history table also receives the update :

This is a huge feature because it means your two tables never get out of sync, and yet, it’s all abstracted away for you and you’ll never have to think about it!

Querying Temporal Tables

Of course, what happens if we actually want to query the history of our Person record? If we were rolling our own, we might have to do a union of our current Person table, and our PersonHistory table. But with temporal tables, it’s a single select statement and SQL Server will work out under the hood which table the data should come from.

Confused? Maybe these examples will help :

SELECT *
FROM Person 
FOR SYSTEM_TIME AS OF '2021-12-10 23:19:25'
WHERE Id = 1

I run the above statement to ask for the state of the Person record, with Id 1, at exactly a particular time. The code executes, and in my case, it pulls the record from the History table.

But let’s say I run it again with a different time :

SELECT *
FROM Person 
FOR SYSTEM_TIME AS OF '2022-01-01'
WHERE Id = 1

Here I’ve made it in the future, just to illustrate a point, but in this case I know it will pull the record from the Person table because it will be the very latest.

What I’m trying to demonstrate is that there is no switching between tables to try and work out which version was correct at the right time. SQL Server does it all for you!

Better yet, you’ll probably end up showing an audit history page somewhere on your web app if using temporal tables. For that we can use the BETWEEN statement like so :

SELECT *
FROM Person 
FOR SYSTEM_TIME BETWEEN '2021-01-01' AND '2022-01-01'
WHERE Id = 1

This then fetches all audit history *and* the current record if applicable between those time periods. Again, all hidden away under the hood for you and exposed as a very simple SYSTEM_TIME query statement.

Size Considerations

While all of this sounds amazing, there is one little caveat to a lot of this. And that’s data size footprint.

In general, you’ll have to think about how much data you are storing if your system generates many updates across a table. Due to the nature of temporal tables storing a copy of the data, many small updates could explode the size of your database. However, in a somewhat ironic twist, tables that receive many updates may be a good candidate for temporal table anyway for auditing history.

Another thing to think about is use of blob data types (text, nvarchar(max)), and even things such as nvarchar vs varchar. Considerations around these data types upfront could save a lot of data space in the long run when it’s duplicated across many historic rows.

There is no one size all approach that fits perfectly, but it is something to keep in mind!

Temporal Tables vs Event Sourcing

Let’s just get this out of the way, temporal tables and event sourcing are not drop in replacements for each other, nor are they really competing technologies.

A temporal table is somewhat rudimentary. It takes a copy of your data and stores it elsewhere on every update/delete operations. If we ask for a row at a specific point in time, we will receive what the data looked like at that point. And if we give a timeframe, we will be returned several copies of that data.

Event sourcing is more of a series of deltas that describe how the data was changed. The hint is the name (event), and it functions much the same as receiving events on a queue. Given a point in time, event sourcing can recreate the data by applying deltas up to that point, and given a timeframe, instead of receiving several copies of the data, we instead receive the deltas that were applied.

I think temporal tables work best when a simple copy of the data will do. For pure viewing purposes, maybe as a data administrator looking at how data looked at a certain point of time for application debugging and the like. Whereas event sourcing really is about structuring your application in an event driven way. It’s not a simple “switch” that you flick on to suddenly make your application work via event sourcing.

Temporal Tables vs Roll Your Own

Of course, history tables are a pretty common practice already. So why use Temporal Tables if you’ve already got your own framework set up?

I think it really comes down to ease of use and a real “switch and forget” mentality with temporal tables. Your application logic does not have to change at all, nor do you have to deal with messy triggers. It almost is an audit in a box type solution with very little overhead to set up and maintain. If you are thinking of adding an audit trail/historic log to an application, temporal tables will likely be the solution 99% of the time.

Entity Framework Support

As mentioned earlier, EF Core 6.0 shipped with temporal table support. That includes code first migrations for turning on temporal tables, and LINQ query extensions to make querying temporal tables a breeze. In the next post, we’ll give head first into how that works!

Learned Something New?

By buying me a coffee (As a one off or monthly), you ensure I have the optimal caffeine levels to continue reading patch notes and wading through shoddy documentation to give you nothing but the important bits of .NET.

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK