Labeling connected components in an image is a common operation. But the original algorithm proposed is slow. It works fine if the image is small. But as the image becomes larger, the algorithm slows down really fast. Recently, a few researchers find a simple solution to this. Image that previously took hundreds of seconds would not take only a couple of seconds to get labeled.
A binary image has only ones and zeroes. The goal of labeling connected components is to identify which ones are "connected".
For example, if the above image, the goal is to identify that the two circles on the left are connected. Sure, they are separate circles, but they are "connected". They share the same boundary. Similarly for the two circles on the right.
The original algorithm is a two pass algorithm. The first pass marks labels for each pixel and the second pixel marks corrects invalid labels. You might want to read more about the connected components labeling algorithm. You might also want to check an example on how the connected components labeling algorithm works.
The main bottleneck of the algorithm is the large equivalence array (the union-find structure). Calculating and resolving equivalent labels for one big image takes a lot of time. However, if the image is divided into several regions (say the image is split into 3x3 parts), then the labels can be computed and resolved much faster.
After splitting, you execute the standard algorithm on each part. Now you need to resolve any anomalies that might exist at the borders of each part.
This resolution is done in three parts:
For each case, you fix any wrong labeling that might exist.
The next step is to determine the number of pieces the image is split in. Experimentally, it was found that each piece should be 3030 pixels - 6060 pixels in size. That way, the equivalence array is small enough to be computed rapidly.
The researchers did experiments with a 1760*1168 image. This image is HUGE. The standard algorithm would take a lot of time. It simply isn't feasible to calculate the time it takes.
On dividing, the following times were obtained:
They also compared it against other methods of computing labels. This algorithm clearly surpassed the results any other algorithm could produce.
You learned about a new connected component labeling algorithm that computes labels for large images really fast. Have a look at the actual paper for pseudo code on how to implement it.