12

How does the 2PL (Two-Phase Locking) algorithm work

 4 years ago
source link: https://vladmihalcea.com/2pl-two-phase-locking/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client
Last modified:

Imagine having a tool that can automatically detect if you are using JPA and Hibernate properly. Hypersistence Optimizer is that tool!

Introduction

The 2PL (Two-Phase Locking) algorithm is one of the oldest concurrency control mechanisms used by relational database systems to guarantee data integrity.

In this article, I’m going to explain how the 2PL algorithm works and how you can implement it in any programming language.

Lock types

Before we start discussing the 2PL algorithm implementation, it’s very important to explain how the read and write locks work.

A read or share lock prevents a resource from being written while allowing other concurrent reads.

A write or exclusive lock disallows both read and write operations on a given resource.

Compatibility matrix Read lock Write Lock Read lock Allow Prevent Write lock Prevent Prevent

Some database systems, like PostgreSQL, MySQL, or SQL Server, offer the possibility of acquiring read and write locks on a given tuple or range of tuples. Other database systems, like Oracle, only allow write/exclusive locks to be acquired via the FOR UPDATE clause.

Database name Read lock clause Write lock clause Oracle FOR UPDATE FOR UPDATE SQL Server WITH (HOLDLOCK,
ROWLOCK) WITH (HOLDLOCK,
UPDLOCK, ROWLOCK) PostgreSQL FOR SHARE FOR UPDATE MySQL LOCK IN SHARE MODE FOR UPDATE

For more details about how you can acquire read or write locks with JPA and Hibernate, check outthis article.

However, read and write locks are not limited to database systems only. While traditionally, entering a Java synchronized block allows the acquisition of an exclusive lock, since version 1.5, Java allows both read and write locks via the ReentrantReadWriteLock object.

Two-Phase Locking

Locks alone are not sufficient for preventing conflicts. A concurrency control strategy must define how locks are being acquired and released because this also has an impact on transaction interleaving.

For this purpose, the 2PL protocol defines a lock management strategy for ensuring strict serializability.

The 2PL protocol splits a transaction into two sections:

  • expanding phase (locks are acquired, and no lock is allowed to be released)
  • shrinking phase (all locks are released, and no other lock can be further acquired).

For a database transaction, the expanding phase means that locks are allowed to ve acquired from the beginning of the transaction until its end, while the shrinking phase is represented by the commit or rollback phase, as at the end of a transaction, all the acquired locks are being released.

The following diagram shows how transaction interleaving is coordinated by 2PL:

f2QnIvM.png!web

  • Both Alice and Bob acquire a read lock on a given a post record via a SELECT FOR SHARE PostgreSQL clause.
  • When Bob attempts to execute an UPDATE statement on the post entry, his statement is blocked by the Lock Manager because the UPDATE statement needs to acquire a write lock on the post row while Alice is still holding a read lock on this database record.
  • Only after Alice’s transaction ends and all her locks are being released, Bob can resume his UPDATE operation.
  • Bob’s UPDATE statement will generate a lock upgrade, so his previously acquired read lock is replaced by an exclusive lock, which will prevent other transactions from acquiring a read or write lock on the same post record.
  • Alice starts a new transaction and issues a SELECT FOR SHARE query with a read lock acquisition request for the same post entry, but the statement is blocked by the Lock Manager since Bob owns an exclusive lock on this record.
  • After Bob’s transaction is committed, all his locks are released, and Alice’s SELECT query can be resumed.

Strict Serializability

The 2PL algorithm offers Strict Serializability, which is the golden standard when it comes to data integrity. Strict Serializability means that the outcome is both Serializable and Linearizable.

Two or more transactions are Serializable if their associated read and write operations are interleaved in such a way that the outcome is equivalent to some serial execution. For example, if we have two transactions A and B, as long as the outcome is either A, B or B, A, the two transactions are Serializable. For N transactions, the outcome must be equivalent to one of the N! transaction permutations.

However, Serializability does not take the flow of time into consideration. On the other hand, Linearizability implies time-based ordering. For example, a system is linearizable if any subsequent read will reflect the changes done by a previous write operation. For more details about Lienearizbaility, check outthis article.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

MFn6ZvJ.png!webvEnUBfE.png!web

Conclusion

The 2PL (Two-Phase Locking) algorithm was introduced in 1976 in [The Notions of Consistency and Predicate Locks in a Database System] (http://research.microsoft.com/en-us/um/people/gray/papers/On%20the%20Notions%20of%20Consistency%20and%20Predicate%20Locks%20in%20a%20Database%20System%20CACM.pdf) paper by Kapali Eswaran and Jim Gray (et al.), which demonstrated that Serializability could be obtained if all transactions used the 2PL algorithm.

Initially, all database systems employed 2PL for implementing Serializable transactions, but, with time, many vendors have moved towards MVCC (Multi-Version Concurrency Control) concurrency control mechanisms.

Nowadays, only SQL Server uses the 2PL algorithm by default, unless the READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION MVCC modes are not set at the database level. Even if the InnoDB MySQL storage engine is MVCC-based when switching to the Serializable isolation level, the database will use the 2PL algorithm since locks will be acquired on both read and write operations.

For this reason, it’s very important to understand how the 2PL algorithm works and that it can guarantee Strict Serializability.


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK