COW and ROW Snapshot Working Principles


Concept

Storage Networking Industry Association (SNIA) defines a snapshot as a point-in-time copy of a defined collection of data. The data backup is an image of the source data at the point in time of data copy. A snapshot can be either a copy or a replication of the specified data. According to SNIA, snapshots are classified into full snapshots and incremental snapshots, which use different snapshot technologies:

Full snapshot: split mirror
Incremental snapshot:
- Copy-on-write (COW)
- Redirect-on-write (ROW)

A full snapshot follows the same working principle as RAID 1. A mirrored volume of the source volume is generated during snapshot creation. Reading data from a volume works the same as without a mirror, but each write must be made to both volumes. This document focuses on incremental snapshots, which are described in detail below.

Application of Snapshot Technologies

Online data restoration: Snapshot technologies can be used to restore data online and restore a storage device to the state at the snapshot point in time in the case of storage device failure or damage.
Available copies: Snapshot technologies provide another data access channel for storage users. When the source data is being processed online, users can access and use the snapshot data to perform tasks, such as testing.

Incremental Snapshot

HyperSnap is a snapshot feature developed by Huawei and mainly uses incremental snapshots (including COW and ROW). The following describes their working principles.

Concepts

Data organization
Storage systems use virtual storage technology. Each LUN created in a storage pool is composed of both a meta volume and a data volume, where:
- A meta volume records the data organization form and attributes of a LUN. Each meta volume is arranged in a tree structure.
- A data volume stores the actual user data. Data is read and written in units of extent.
Source volume
A volume that stores the source data for a snapshot and is represented as a source LUN to users. A source LUN consists of a meta volume and a data volume, where:
- The meta volume records the location of the source data in the source LUN.
- The data volume records the service data saved in the source LUN.
Snapshot volume
A logical data copy generated after a snapshot is created for a source LUN. A snapshot LUN consists of a snapshot meta volume and a snapshot data volume, where:
- The snapshot meta volume contains the snapshot metadata. One snapshot has one snapshot meta volume.
- The snapshot data volume corresponds to the snapshot meta volume and records new data written into the snapshot LUN.
COW data space
An amount of storage space dynamically allocated by the storage system from the storage pool where the source LUN resides to store COW data after a snapshot is generated and activated. All the snapshot LUNs of the same source LUN share one COW data space. A COW data space consists of a COW meta area and a COW data area, where:
- The COW meta area is shared by all snapshots in the source LUN. It is used to store the COW mapping information of the snapshot LUNs if new data is written into their source LUN, that is, the mapping relationships between COW data and data locations in the COW data area.
- The COW data area corresponds to the COW meta area. It is used to record COW data if new data is written into the source LUN.
Mapping table
A mapping table records the data changes on the source and snapshot LUNs at a point in time as well as the new locations of such data. Mapping tables are classified into shared mapping tables and private mapping tables, where:
- A shared mapping table, which is stored in the COW meta area, records the mapping relationships between COW data and COW data locations in the COW data area.
- A private mapping table, which is stored in the snapshot meta volume of each snapshot LUN, records the mapping relationships between data written into the snapshot LUN and data locations in the snapshot LUN.

With COW, when new data is written to a storage location for the first time after a snapshot is taken, the original content is first read and written to the COW data space, and then the new data is written to that location. Subsequent writes to the same location require no COW operation. The first write to a location therefore costs one read operation (reading the original data from the location) and two write operations (writing the original data to the snapshot space and writing the new data to the location), so frequent writes to new locations generate many I/Os. For this reason, COW is best suited to LUNs where read operations far outnumber write operations. COW is also recommended if an application is prone to write hotspots, that is, it writes only to a limited range of data, because multiple write operations to the same data lead to only one COW operation.

After a snapshot is created and activated, a data copy that is identical to the source volume is generated. The storage system allocates COW data space from the storage pool where the source LUN resides and generates a snapshot volume. Because no data has yet been written to the snapshot or source volume, nothing is recorded in the COW meta and data areas or in the snapshot meta and data volumes. The following figure shows the initial state of a snapshot.

Figure 1-1 Initial state of a snapshot


Writing Data to a Source Block

If an application server attempts to write data to a source LUN of an activated snapshot, the storage system does not process the write request immediately. Instead, the storage system uses the COW mechanism to copy COW data to the COW data space, changes the mapping relationships in the mapping table, and then writes new data to the source volume. The application server sends a request to write the source volume at Time1:

Data1 is changed to DataX.
The COW mechanism is triggered to copy Data1 to the COW data space.
The mapping relationships are updated in the mapping table. The location of Data1 is changed to g0 in the COW data space.
DataX is written into the source volume.
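The write path above can be sketched as a toy in-memory model. This is only an illustration under simplifying assumptions: the class name `CowVolume`, the list-based volumes, and the dictionary used as the shared mapping table are hypothetical and do not reflect how HyperSnap is implemented.

```python
class CowVolume:
    """Toy model of a source volume with one activated COW snapshot."""

    def __init__(self, blocks):
        self.source = list(blocks)  # data volume of the source LUN
        self.cow_data = []          # COW data area (preserved original blocks)
        self.shared_map = {}        # shared mapping table: source index -> COW slot

    def write_source(self, index, new_data):
        # First write to this location since the snapshot: preserve the original.
        if index not in self.shared_map:
            original = self.source[index]                    # read the original data
            self.cow_data.append(original)                   # copy it to the COW data area
            self.shared_map[index] = len(self.cow_data) - 1  # update the mapping table
        # Only then write the new data in place on the source volume.
        self.source[index] = new_data


vol = CowVolume(["Data0", "Data1", "Data2", "Data3"])
vol.write_source(1, "DataX")  # triggers COW: Data1 is copied to COW slot 0
vol.write_source(1, "DataY")  # no COW: the original is already preserved
```

After both writes, the COW data area still holds a single block (Data1), which is why repeated writes to the same location cost only one copy.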

Figure 1-2 I/O process of writing data to a source volume


When a COW snapshot is created, no physical data is copied; only the physical location metadata (pointer information) of the source data blocks where the original data resides is copied. Therefore, COW snapshot creation is very fast and can be completed instantly. After the snapshot is created, snapshot software monitors and traces changes to the original data (write operations to the source data blocks), and copies the data in a source data block to a new data block before the original data in the source data block is overwritten. The new data is then written to the source data block, overwriting the original data. All source data blocks constitute the source data volume, and the new data blocks constitute the snapshot volume. The snapshot volume retains only changed data blocks and is much smaller than the source data volume. The obvious shortcoming of COW is that the write performance of the source data volume deteriorates, because the first overwrite of each block after the snapshot actually requires two write operations.

Writing Data to a Snapshot Volume

After a snapshot is activated, application servers can send read/write requests to its snapshot LUN. Write requests are directly processed by the snapshot LUN, and then the private mapping table records the data location in the snapshot LUN.

An application server sends a request for writing Data a to the snapshot volume at Time2. Data a is written directly to the snapshot volume.
The private mapping table records that Data a is stored in location g'0 of the snapshot LUN.
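A minimal sketch of this write path follows, assuming a simple append-only snapshot data volume. The class `SnapshotLun` and its fields are hypothetical names chosen for illustration only.

```python
class SnapshotLun:
    """Toy model of a writable snapshot LUN with its private mapping table."""

    def __init__(self):
        self.data = []         # snapshot data volume
        self.private_map = {}  # private mapping table: logical index -> slot

    def write(self, index, new_data):
        if index in self.private_map:
            # The block was already written to the snapshot LUN: update it in place.
            self.data[self.private_map[index]] = new_data
        else:
            # First write to this block: store it and record its location.
            self.data.append(new_data)
            self.private_map[index] = len(self.data) - 1


snap = SnapshotLun()
snap.write(0, "Data a")  # stored in slot 0, playing the role of g'0 in the text
```

The source volume and the COW data space are not touched by this write; only the snapshot's own volumes change.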

Figure 1-3 I/O process of writing data to a snapshot volume


Reading Snapshot Data (Data Written to the Snapshot Volume)

To access snapshot data of a certain point in time, the system reads unchanged data blocks directly from the data volume, and reads changed (copied) data blocks from the snapshot space. Once created, a snapshot traces and records the metadata that describes block changes.

If the application server writes Data a into the snapshot volume, the application server reads snapshot data as follows:

The application server sends a request to read snapshot data.
The private mapping table locates the desired snapshot data.
The application server reads the snapshot data. In the following figure, this is Data a.

Figure 1-4 Reading a snapshot volume (data written to the snapshot volume)


Reading Snapshot Data (No Data Written to the Snapshot Volume)

If the application server has written data to the source volume but not the snapshot volume, the application server reads snapshot data from the source volume or COW data space, as shown in the following figure.

The application server sends a request to read snapshot data.
The shared mapping table locates the desired snapshot data.
The application server reads the snapshot data. In the following figure, this is Data 0, Data 1, Data 2, and Data 3.
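The two read cases can be combined into one lookup routine. The sketch below assumes a lookup order of private mapping table first, then shared mapping table, then the source volume; that order is inferred from the two cases described here, and the function name and dictionary-based tables are hypothetical.

```python
def read_snapshot(index, source, cow_data, shared_map, snap_data, private_map):
    """Resolve one block of snapshot data."""
    if index in private_map:   # data written directly to the snapshot LUN
        return snap_data[private_map[index]]
    if index in shared_map:    # original data preserved by a COW operation
        return cow_data[shared_map[index]]
    return source[index]       # block unchanged since the snapshot


# State after Data 1 on the source was overwritten with DataX
# (COW slot 0 holds the preserved Data 1).
source = ["Data 0", "DataX", "Data 2", "Data 3"]
cow_data = ["Data 1"]
shared_map = {1: 0}

# No data written to the snapshot volume: the snapshot still shows the original image.
image = [read_snapshot(i, source, cow_data, shared_map, [], {}) for i in range(4)]

# After "Data a" is written to snapshot block 0, that block resolves via the private table.
patched = read_snapshot(0, source, cow_data, shared_map, ["Data a"], {0: 0})
```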

Figure 1-5 Reading a snapshot volume (no data written to the snapshot volume)


Advantages and Disadvantages of COW

Advantages: COW neither occupies any storage resources nor affects system performance before creating snapshots.

Disadvantages:

The write performance of the source data volume deteriorates. Three I/O operations (one read and two writes) are performed the first time a block of source data is changed:
1. Read source data.
2. Write the source data to the snapshot volume.
3. Write new data to the source data volume.
It takes a lot of I/Os for frequent write requests from hosts.
No complete physical copy is obtained. The snapshot volume stores only part of the original data of the source data volume.
If the amount of data copied to the snapshot volume exceeds the reserved space, the snapshot becomes invalid.
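To make the I/O cost concrete, the following sketch counts the operations a COW snapshot issues for a sequence of host writes. It is a simplified model that assumes one I/O per block and ignores metadata updates; the function name `cow_io_count` is hypothetical.

```python
def cow_io_count(write_indices):
    """Return (reads, writes) issued for a sequence of source-volume writes."""
    copied = set()  # blocks whose original data is already in the COW data space
    reads = writes = 0
    for index in write_indices:
        if index not in copied:
            reads += 1   # 1. read the source data
            writes += 1  # 2. write the source data to the snapshot volume
            copied.add(index)
        writes += 1      # 3. write the new data to the source data volume
    return reads, writes


hotspot = cow_io_count([5] * 100)     # 100 writes to the same block
scattered = cow_io_count(range(100))  # 100 writes to 100 different blocks
```

In this model the hotspot workload costs 1 read and 101 writes, while the scattered workload costs 100 reads and 200 writes, which illustrates why COW favors workloads with write hotspots.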

The implementation principle of ROW is very similar to that of COW. The difference is that, with ROW, a write to the original data volume is redirected so that the new data goes to the reserved snapshot volume. The original data of a ROW snapshot therefore remains in the source data volume and, to ensure the integrity of the snapshot data, the state of the source data volume changes from read-write to read-only when the snapshot is created.

When creating a snapshot, ROW also copies the source data pointer table as a snapshot data pointer table. At this time, the pointer records of the two tables are the same, as shown in Figure 1-6.

Figure 1-6 Initial state of a snapshot


After a user creates a snapshot at Time0, the internal processing flow of the storage system is as follows:

Copy the snapshot mapping table according to the source volume mapping table.
Create a snapshot volume to store newly written data.

Writing Data to the Source Volume

Figure 1-7 Writing data to the source volume


The application server sends a request to write to the source volume at Time1 to change Data1 to DataX.
The original data in the source data volume is not overwritten by the update. Instead, the new data is written into the new snapshot volume.
The source volume mapping table is updated.

The preceding three steps are repeated for subsequent writes until the next snapshot is generated. Throughout these steps, the old data from the previous snapshot point in time (Time0) is still stored in the source volume, while the new data is stored in the reserved snapshot volume. Data in the snapshot volume mapping table remains unchanged.
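The snapshot creation and write flow for ROW can be sketched as follows. This is a toy model under stated assumptions: the class `RowVolume`, the tuple-based pointers, and the in-place reuse of an already redirected slot are illustrative choices, not a description of any product's internals.

```python
class RowVolume:
    """Toy model of a source volume protected by one ROW snapshot."""

    def __init__(self, blocks):
        self.source = list(blocks)  # source data volume (read-only after the snapshot)
        self.snap_data = []         # snapshot volume that receives newly written data
        # Mapping tables: logical index -> (volume name, slot)
        self.source_map = {i: ("source", i) for i in range(len(blocks))}
        self.snapshot_map = None

    def create_snapshot(self):
        # Only pointers are copied; no user data moves.
        self.snapshot_map = dict(self.source_map)
        self.snap_data = []

    def write(self, index, new_data):
        volume, slot = self.source_map[index]
        if volume == "snapshot":
            self.snap_data[slot] = new_data  # already redirected since the snapshot
            return
        self.snap_data.append(new_data)      # redirect the write to the snapshot volume
        self.source_map[index] = ("snapshot", len(self.snap_data) - 1)


vol = RowVolume(["Data0", "Data1", "Data2", "Data3"])
vol.create_snapshot()     # Time0
vol.write(1, "DataX")     # Time1: one write, no copy of the old data
```

Note that only the source volume mapping table changes on a write; the snapshot mapping table keeps pointing at the untouched original block.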

If you take multiple snapshots of a VM, a snapshot chain is generated. The disk volume of the VM is always mounted at the very end of the snapshot chain, that is, all write operations of the VM are placed in the last snapshot volume. As a result, if 10 snapshots have been taken in total, restoring to the latest snapshot point requires merging 10 snapshot volumes to obtain the complete data for that point in time, and restoring to the 8th snapshot point requires merging the first 8 snapshot volumes. The main disadvantage of ROW is therefore that there is no complete snapshot volume and the snapshots are chained together. The more snapshot levels there are, the greater the system overhead of snapshot recovery.
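The cost of restoring from a snapshot chain can be illustrated with a small sketch. It assumes that the first volume in the chain is the original volume frozen at the first snapshot and that each later volume holds only the blocks written before the next snapshot; the function name `restore` and the dictionary-based volumes are hypothetical.

```python
def restore(chain, n):
    """Rebuild the image at snapshot point n by merging the first n volumes."""
    image = {}
    for volume in chain[:n]:
        image.update(volume)  # blocks in later volumes override earlier ones
    return image


chain = [
    {0: "A0", 1: "B0", 2: "C0"},  # original volume, frozen at snapshot 1
    {1: "B1"},                    # writes made between snapshots 1 and 2
    {0: "A2", 2: "C2"},           # writes made between snapshots 2 and 3
]

at_snapshot_2 = restore(chain, 2)  # merges 2 volumes
at_snapshot_3 = restore(chain, 3)  # merges 3 volumes
```

The number of volumes that must be merged grows with the position of the restore point in the chain, which is the overhead described above.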

Reading the Source Volume

A reading operation is redirected if a writing redirection has been performed on the location since the last snapshot. Otherwise, the reading operation does not need to be redirected.

Figure 1-8 Reading the source volume


The application server sends a request at Time2 to read data from G1 of the source volume.
If redirect-on-write is performed on G1 at Time1, data reading is redirected to L0 of the snapshot volume.
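A minimal sketch of this read path is shown below, using the location labels G1 and L0 from the text. The function `read_source` and the dictionary that records redirections are hypothetical names used for illustration.

```python
def read_source(location, source_volume, snapshot_volume, redirects):
    """Read the current data at a source-volume location."""
    if location in redirects:
        # A write was redirected here since the last snapshot: follow the pointer.
        return snapshot_volume[redirects[location]]
    # No redirection: read the source volume directly.
    return source_volume[location]


source_volume = {"G0": "Data0", "G1": "Data1", "G2": "Data2"}
snapshot_volume = {"L0": "DataX"}
redirects = {"G1": "L0"}  # redirect-on-write performed on G1 at Time1

redirected = read_source("G1", source_volume, snapshot_volume, redirects)
direct = read_source("G2", source_volume, snapshot_volume, redirects)
```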

A new snapshot does not contain any user data but only a group of pointers for locating user data of the source file system. As a result, users accessing snapshot data are actually accessing the data of the source file system. When data in the source file system is modified, the snapshot retains the space occupied by the original data for protection. The protected space will not be reclaimed unless the snapshot is deleted.

Updates to the source file system cause the snapshot to retain more of the original data blocks and increase the snapshot space. However, the snapshot only retains the data at the point in time when it was created. New data written after that time point is not protected by the snapshot and does not occupy snapshot space. Snapshot rollback restores the data to its original state at the time of creation to prevent data loss from misoperations or viruses.

Exercise caution when performing snapshot rollback because its changes are irreversible. A rollback restores data to a specified point in time, but data written after that time is lost, along with the snapshot, after the rollback. For small-scale restoration, manually copy individual files from the snapshot to the source file system to avoid rolling back the entire file system. Because a rollback causes the loss of file system data or the snapshot, services may be interrupted if the data is being accessed during the rollback.

Advantages and Disadvantages of ROW

Advantage: The write performance of the source data volume is not affected. After a snapshot is taken for the source data volume, write operations on the source data volume are redirected to the new volume. All old data (of the snapshot volume) is saved on the read-only source data volume. Therefore, only one write operation is required to update the source data, avoiding the performance problem caused by writing twice in COW. A distributed system provides concurrent reads because data is distributed. Therefore, in the case of distributed storage, the continuous read and write performance of ROW will be higher than COW.

Disadvantages:

There is no complete snapshot volume. The data mapping table of the ROW snapshot volume stores the original copy of the source data volume, and the data pointer table of the source data volume stores the updated copy. Therefore, a snapshot chain is generated after multiple snapshots are created, making it extremely complex to access a snapshot volume for original data, trace data in the source data volume, and delete a snapshot. To restore a snapshot, snapshot files are continuously merged, causing a large system overhead.
The read performance of a single host deteriorates. After ROW operations, logically contiguous data is scattered across the disks, so sequential reads of the source volume become random reads, which degrades read performance.

Therefore, COW is more suitable for read-intensive applications or applications that are prone to write hotspots, that is, applications that write only to a limited range of data. Because data changes are confined to that range, multiple write operations to the same data lead to only one copy operation. ROW is more suitable for write-intensive applications. In the case of distributed storage, the read performance of ROW will be higher than that of COW.
