Role of two phase commit protocol in Database Management

Two-phase commit is a distributed commit protocol. A database system can be either local (centralized) or distributed. In a local database system, committing a transaction is simple: the transaction manager only has to convey the commit decision to the recovery manager.

In a distributed system, however, the transaction manager must convey the commit decision to the servers at every site involved in the transaction. When a server finishes processing at its site, its part of the transaction reaches a partially committed state and waits for the other sites to reach the same state. Once every site has reached the partially committed state, the transaction manager can decide to commit. The transaction must then commit at all the sites or at none of them.


Distributed Commit Protocols
One-phase commit protocol
Two Phase commit protocol
Three Phase commit protocol
Advantages of Two Phase commit protocol
Disadvantages of Two Phase commit protocol

Distributed Commit Protocols

Distributed systems use three main commit protocols. They are as follows.

One-phase commit protocol

The one-phase commit protocol is the simplest distributed commit protocol. The distributed system has two kinds of sites: a single controller site and one or more slave sites. The slave sites execute the transaction. The steps of the one-phase commit protocol are as follows.

Each slave works on the transaction. When a slave node finishes executing it, the slave sends a “DONE” message to the controller node.
The slave node then waits for a “Commit” or “Abort” message from the controller site. This waiting time is known as the window of vulnerability.
The controller can send “Commit” or “Abort” only after it has received a “DONE” message from every slave site in the distributed system. It then decides whether to commit or abort the transaction. The moment of this decision is known as the commit point. Once the decision is made, the controller sends it to all the slave nodes.
When the slave nodes receive the message from the controller, they commit or abort the transaction accordingly and send the matching acknowledgment message to the controller.
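
The four steps above can be sketched in a few lines of Python. This is only an illustrative, single-process simulation: the Slave class, the one_phase_commit function, and the abort flag are hypothetical names introduced here, and real sites would exchange these messages over a network.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    """Hypothetical slave site that executes its part of a transaction."""
    name: str
    state: str = "working"

    def execute(self) -> str:
        # Step 1: finish the local work and report DONE to the controller.
        # Step 2: from here on the slave waits (window of vulnerability).
        self.state = "waiting"
        return "DONE"

    def receive(self, decision: str) -> str:
        # Step 4: apply the controller's decision and acknowledge it.
        self.state = "committed" if decision == "Commit" else "aborted"
        return f"ACK {decision}"


def one_phase_commit(slaves, abort=False):
    """Run the protocol and return the acknowledgments the controller gets."""
    replies = [s.execute() for s in slaves]
    if not all(r == "DONE" for r in replies):
        raise RuntimeError("controller cannot decide before every DONE arrives")
    # Step 3: the commit point. The controller decides alone; the `abort`
    # flag is an assumed stand-in for whatever makes it choose to abort.
    decision = "Abort" if abort else "Commit"
    return [s.receive(decision) for s in slaves]


if __name__ == "__main__":
    sites = [Slave("s1"), Slave("s2")]
    print(one_phase_commit(sites))
    print([s.state for s in sites])
```

Note that the slaves have no say in the decision: between execute() and receive() they can only wait, which is the weakness the two-phase protocol addresses.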

Two Phase commit protocol

The two-phase commit protocol reduces the vulnerability of the one-phase protocol. Its steps are performed in two different phases.

Phase 1- Prepare Phase

The coordinator (Ci) at the controller site writes a <Prepare T> record to its log.
Once the controller has received the “DONE” message from all the slave nodes, the coordinator sends the <Prepare T> message to all of them.
On receiving this message, each slave node decides whether it can commit the transaction. If the slave can commit, it writes <ready T> to its log and replies with a “Ready” message.
If the slave cannot commit, for example because of incomplete activity or a timeout, it writes <no T> to its log and replies with a “Not Ready” message.
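
As a rough illustration of the prepare phase, the sketch below records the log entries and collects the votes. The prepare_phase function, the plain Python lists used as logs, and the can_commit flag are assumptions made for this example, not part of any real database API.

```python
def prepare_phase(coordinator_log, participants, txn="T"):
    """Run phase 1 and return each slave's vote.

    `participants` maps a slave name to a (can_commit, log) pair. Both the
    mapping and the list-based logs are hypothetical stand-ins for real sites.
    """
    # The coordinator logs <Prepare T> before contacting the slaves.
    coordinator_log.append(f"<Prepare {txn}>")

    votes = {}
    for name, (can_commit, log) in participants.items():
        # Each slave writes its decision to its own log, then replies.
        log.append(f"<ready {txn}>" if can_commit else f"<no {txn}>")
        votes[name] = "Ready" if can_commit else "Not Ready"
    return votes


if __name__ == "__main__":
    coordinator_log, log_a, log_b = [], [], []
    votes = prepare_phase(
        coordinator_log,
        {"a": (True, log_a), "b": (False, log_b)},
    )
    print(coordinator_log, log_a, log_b, votes)
```

Writing the log record before replying matters in practice: after a crash, a slave can read its own log to find out what it promised.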

Phase 2- Commit/Abort Phase

In this phase, there are two possibilities: either every slave node has sent a “Ready” message to the controller, or at least one slave node has sent a “Not Ready” message.

If all the slave nodes have sent the “Ready” message to the controller, the controller performs the following steps.
- The controller node sends a “Global Commit” message to all the slave nodes.
- Each slave node commits the transaction and then sends a commit acknowledgment to the controller node.
- The transaction is considered committed once the controller has received the commit acknowledgment from all the slave nodes.
If the controller receives even one “Not Ready” message from a slave node, it performs the following steps.
- The controller node sends a “Global Abort” message to all the slave nodes.
- After receiving the “Global Abort” message from the controller, each slave node aborts the transaction and sends an abort acknowledgment to the controller.
- The transaction is considered aborted once the controller has received the abort acknowledgment from all the slave nodes.
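
The decision rule of this phase is small enough to show directly. The following is a minimal sketch, assuming the votes from phase 1 are already collected in a dictionary; the commit_phase function and the slave names are hypothetical, and the acknowledgments are simulated rather than sent over a network.

```python
def commit_phase(votes):
    """Decide globally from the phase-1 votes and collect acknowledgments.

    `votes` maps a hypothetical slave name to "Ready" or "Not Ready".
    Returns (decision, acks, outcome).
    """
    # A single "Not Ready" vote is enough to abort the whole transaction.
    if all(v == "Ready" for v in votes.values()):
        decision, ack, outcome = "Global Commit", "ACK Commit", "committed"
    else:
        decision, ack, outcome = "Global Abort", "ACK Abort", "aborted"

    # Every slave applies the decision and then acknowledges it. The outcome
    # is final only once the controller holds one acknowledgment per slave.
    acks = {name: ack for name in votes}
    return decision, acks, outcome


if __name__ == "__main__":
    print(commit_phase({"a": "Ready", "b": "Ready"}))
    print(commit_phase({"a": "Ready", "b": "Not Ready"}))
```

The abort message goes to every slave, including those that voted “Ready”, so that no site commits on its own.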

Three Phase commit protocol

The three-phase commit protocol is similar to the two-phase protocol, but its steps are divided into three phases.

Phase 1- Prepare Phase

The steps are the same as in phase 1 of the two-phase protocol. The controller sends a <Prepare T> message to all the slave nodes and waits for their responses.
A slave node that wants to commit the transaction sends a <ready T> message; otherwise, it sends a <no T> message to the controller.

Phase 2- Prepare to Commit Phase

Rather than messaging each node individually, the controller node broadcasts an “Enter Prepared State” message.
Each slave node that receives the broadcast votes “OK” in response.

Phase 3- Commit/Abort Phase

The steps are the same as in Phase 2 of the two-phase protocol.
The one difference is that the slave nodes do not need to send a commit or abort acknowledgment.
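
The message flow of the three phases can be traced with a short simulation. This is a sketch under stated assumptions: the three_phase_commit function and the trace format are made up for illustration, and the rule that the controller aborts straight after phase 1 when any slave votes <no T> is the usual textbook behavior rather than something spelled out above.

```python
def three_phase_commit(can_commit):
    """Return (decision, trace) for a hypothetical set of slaves.

    `can_commit` maps a slave name to True or False (its phase-1 vote).
    """
    trace = []

    # Phase 1: prepare. Every slave answers <ready T> or <no T>.
    trace.append("controller -> all: <Prepare T>")
    for name, ok in can_commit.items():
        reply = "<ready T>" if ok else "<no T>"
        trace.append(f"{name} -> controller: {reply}")
    if not all(can_commit.values()):
        # Assumption: one <no T> vote aborts without entering phase 2.
        trace.append("controller -> all: Global Abort")
        return "abort", trace

    # Phase 2: prepare to commit. One broadcast, one OK vote per slave.
    trace.append("controller -> all: Enter Prepared State")
    for name in can_commit:
        trace.append(f"{name} -> controller: OK")

    # Phase 3: commit. No acknowledgment is collected afterwards.
    trace.append("controller -> all: Global Commit")
    return "commit", trace


if __name__ == "__main__":
    decision, trace = three_phase_commit({"a": True, "b": True})
    print(decision)
    print("\n".join(trace))
```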

Advantages of Two Phase commit protocol

The data stays consistent across all the sites.
The databases at all the sites stay synchronized.
Either all the databases apply the update or none of them does.

Disadvantages of Two Phase commit protocol

Higher latency, because the controller must wait for a message from every slave node.
Longer execution time, because each transaction needs extra rounds of messages.
It is a poor fit for time-critical application processes, because every commit waits on the slowest slave node.
The transaction coordinator plays an important role in the two-phase commit protocol. If the coordinator fails before sending its decision, all the slave nodes waiting for it go into a blocked state.
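
The blocking problem in the last point can be illustrated with a tiny state machine for one slave node. The Participant class and its method names are hypothetical. The sketch assumes the usual rule that a slave may abort on its own before it votes, but must wait for the coordinator once it has voted “Ready”.

```python
class Participant:
    """Hypothetical slave node in a two-phase commit."""

    def __init__(self):
        self.state = "working"

    def vote_ready(self):
        # After this vote the slave has promised to follow the coordinator.
        self.state = "ready"

    def on_timeout(self):
        if self.state == "working":
            # No promise made yet, so aborting alone is safe.
            self.state = "aborted"
        elif self.state == "ready":
            # The coordinator may have decided either way, so the slave
            # can neither commit nor abort: it is blocked.
            self.state = "blocked"

    def on_decision(self, decision):
        # Only a message from the coordinator releases a waiting slave.
        if self.state in ("ready", "blocked"):
            self.state = "committed" if decision == "Global Commit" else "aborted"


if __name__ == "__main__":
    p = Participant()
    p.vote_ready()
    p.on_timeout()      # coordinator is silent
    print(p.state)
    p.on_decision("Global Commit")  # coordinator recovers
    print(p.state)
```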

Conclusion

Many distributed databases, such as MongoDB and CockroachDB, use two-phase commit or a variant of it for transactions that span multiple nodes. The two-phase protocol keeps the databases consistent and synchronized. It was introduced to reduce the vulnerability of the one-phase commit protocol.