Termination Detection Algorithm in Distributed System

In this tutorial you are going to learn about Termination Detection Algorithm in Distributed System.

Introduction

The termination detection algorithm was introduced by ‘Francez, Dijkstra and Scholten’.

Termination detection algorithm is a Fundamental problem technique to determine if a distributed computation has terminated.
Since no process has a complete knowledge of the global state and the global time does not exist, it is a non trivial task.
Locally terminated is a state in which a process has finished its computation state and will not restart any action unless it has received a message.
In the termination detection problem, a particular process must infer when the underlying computation has terminated.

The termination detection algorithm ensures that the execution of the algorithm can not be indefinitely delay underlying computation.

Termination detection using distributed Snapshots by “ST Huang” (1989)

Assumption that the termination detection algorithm has a logical bidirectional communications channels.
The communication channels lies between every pair of the processes.
Communication channels are reliable but non-FIFO.
Message delay is arbitrary but finite.

Main Idea

When a process goes from active to idle, it requests all other processes to take a local snapshot.
A request is successful if all processes have taken snapshot for it.
The requester is successful, a global snapshot of the request can thus be obtained and the recorded state will indicate termination of the computation.

Formal Description

Each process that is, i maintains a logical clock which is denoted by x, will be initialized to zero at the start of the computation.
A process increments its x once, whenever it becomes idle.
A message sent by a process at its logical time x is of the each process synchronizes its logical form b(x).
A control message that requests the process to take local snapshot issues by process I at its logical time x is in the form R(x, i)
Each process synchronizes its logical clock x loosely with the logical clocks x’s on other processes in such a way that it is the maximum of the clock values ever received or sent in messages.
A process maintains a variable k such that when the process is idle, (x, k) is the maximum of the values (x, k) on all the messages R(x, k) ever received or sent by processes.
Logical time is compared as follows (x, k)>(x`,k`) iff(x>x`)or(x=x`) and (k.k`) i.e a tie between x and x` is broken by the process identification numbers k and k`

Algorithm

The algorithm is defined by the following four rules.

When the process I is active, it may send a basic message to process j at any time by doing send a B(x) to j.
Upon receiving a b(x`), process I does
- Let x:x`+1:
- If(I is idle)-> go active.
When process I goes idle, it does
- Let x:x+1;
- Let k:=1; Send message R(x, k) to all other processes; Take a local snapshot for the request by R(x, k).
Upon receiving message R(x`, k`), processes does,
- [((x`, k`)>(x, k)) ^ (I is idle) ->let(x, k)];
- Take a local snapshot for the request by R(x`, k`)
- ((x`,k`)<(x, k))^(I is idle)-> do nothing;
- (I is active)-> let x:max(x`, x)

The last process is to terminate will have the largest clock value. Hence, every process in the termination detection algorithm will take a snapshot for it. but, it will not take a snapshot for any other process.

This article on Termination Detection Algorithm in Distributed System is contributed by Hemalatha P. If you like TheCode11 and would like to contribute, you can also write your article and mail to thecode11info@gmail.com