Otsu's method

From Bioimaging Wiki

Otsu's method is an image processing algorithm commonly used to compute a threshold level to generate a mask of objects within an image. The original algorithm[1] calculates a global threshold - a single threshold level is calculated that divides the pixels in an image into two classes (sometimes called the background and foreground). This threshold is typically determined by using an exhaustive search to identify the grayscale value which maximizes the variance between the two classes. Thus, this algorithm works best for images which have objects which have significantly different intensities compared to the background, e.g. for fluorescence images.

How it works

Otsu's original algorithm was developed for grayscale images, which can have pixel values between and , where is the number of bits in the image. The output of the algorithm is a threshold level , which splits the pixels in the image into two classes. Thus, the two classes will consist of all pixels between and . These classes are labeled 1 and 2 in the equations below.

To find the threshold level, the algorithm computes either the within-class variance or the between-class variance . These variances are statistical metrics defined by the following equations:

Note that only one of the metrics - within-class variance or between-class variance - needs to be computed as they are related by the equation . Thus maximizing is the same as minimizing .

The probabilities of a pixel being in class 1 (or 2) are given by

where is the probability of a pixel of value (i.e., the height of each bar in the image histogram), and and are the number of pixels in class 1 or class 2. The mean values and the variances of the two classes are given by the following equations:

The algorithm performs an exhaustive search, calculating these values for every possible value for the threshold (i.e., between and ). The optimal threshold (at least according to the original paper) is the one that best separates the two classes, i.e. provides the largest between-class variance .

Implementation

Note that these quantities can be calculated off the intensity histogram. Most implementations use this to speed up calculations.

Additionally, as noted in the original paper, the statistic relies on second-order statistics (variances) while relies only on first-order statistics (means). Hence, is often used as it is computationally simplest (i.e., requires the least number of computational operations).

Example

How the algorithm works is illustrated by the following example. As shown in Fig. 1, the example image used is 6 x 6 pixel and shows a bright object in the center (e.g., a bright fluorescent bead). The image intensity histogram for this image is shown on the right.

This figure shows two images. The image on the left is an example of an input image, which is 6x6 pixels and has a bright object at the center. The image on the right shows the image intensity histogram of the input image.
Figure 1: Input image (left) and intensity histogram (right).

The algorithm starts by choosing a threshold value and calculates the within-class and between-class variances as shown in Template:EquationNote and Template:EquationNote above. For example, assuming a threshold value of 1, the pixels in the image will be split into two classes as shown in Fig. 2.

This figure shows an image intensity histogram that has divided into two classes, assuming a threshold value of 1. The bars of the histogram with a grayscale value of less than 1 is labeled as Class 1 and the bars with a value of 2 and above is labeled as Class 2.
Figure 2: The image intensity histogram showing the pixels in the image divided into two classes based on the current threshold value.

The probability, mean, and variance for the pixels in Class 1 are then given by:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \mu_1 &= \frac{\sum_i p_i x_i}{N_1}\\ &= \frac{(9 \cdot 0) + (6 \cdot 1)}{9 + 6}\\ & = 0.4 \end{align} }

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \sigma^2_1 &= \frac{(9 \cdot (0 - 0.4)^2 + 6 \cdot (1 - 0.4)^2)}{9 + 6}\\ &= 0.24 \end{align} }


The probability, mean, and variance for the pixels in Class 2 are given by:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \omega_2 &= \frac{N_2}{N_1 + N_2}\\ &= \frac{ 4 + 5 + 8 + 4}{9 + 6 + 4 + 5 + 8 + 4}\\ &= 0.58 \end{align} }

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \mu_2 &= \frac{\sum_i p_i x_i}{N_2}\\ &= \frac{(4 \cdot 2) + (5 \cdot 3) + (8 \cdot 4) + (4 \cdot 4)}{4 + 5 + 8 + 4}\\ & = 3.94 \end{align} }

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \sigma^2_2 &= \frac{(4 \cdot (2 - 3.57)^2 + 5 \cdot (3 - 3.57)^2) + 8 \cdot (4 - 3.57)^2 + 4 \cdot (5 - 3.57)^2}{4 + 5 + 8 + 4}\\ &= 1.01 \end{align} }


Using these values, the within-class and between-class variances are:

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \sigma^2_W &= \omega_1 \sigma^2_1 + \omega_2 \sigma^2_2\\ &= 0.42 \cdot 0.24 + 0.58 \cdot 1.01\\ &= 0.69 \end{align} }

Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \sigma^2_B &= \omega_1 \omega_2 (\mu_1 - \mu_2)^2\\ &= 0.53 \cdot 0.47 (0.4 - 3.57)^2\\ &= 2.44 \end{align} }


To find the optimal threshold value, the algorithm performs these calculations for each possible value. In this example, the algorithm would compute the values between 0 and 5 since these is the range of intensities present in the image. The table below summarizes the calculations:

Calculated values for different thresholds
Threshold 0 1 2 3 4 5
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \omega_1} 0.25 0.42 0.53 0.67 0.89 1
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mu_1} 0 0.4 0.74 1.21 1.91 2.25
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_1} 0 0.24 0.62 1.33 2.46 3.13
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \omega_2} 0.75 0.58 0.47 0.33 0.11 0
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mu_2} 3 3.57 3.94 4.33 5 0
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_2} 1.93 1.01 0.53 0.22 0 0
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_W} 1.44 0.69 0.57 0.96 2.19 0
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_B} 1.69 2.44 2.56 2.17 0 0


The algorithm then outputs the threshold value that gives the lowest value for Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_W} or similarly the highest value for Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_B} , which is 2. Figure 3 below shows the result of a mask generated using the thresholding operation Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I > 2} .

The figure shows the results of applying the threshold calculated using Otsu's method on the original image. On the left, the image shows a binary mask, with only some of the central regions in white (representing true) and the remainder in black (representing false). On the right, the image shows the resulting image intensity histogram with bars less than a value of 3 colored blue to represent Class 1 and bars with a value of 3 or higher colored yellow to represent Class 2.
Figure 3: Results of applying the threshold calculated using Otsu's method on the original image. (Left) Binary mask. (Right) The final image intensity histogram showing the calculated optimal threshold value.

Advantages and disadvantages

Otsu's method provides a (computationally) fast way to determine a suitable threshold level during masking. However, the original algorithm assumes that there are two distinct classes of intensity (i.e., that the intensity histogram shows two peaks separated by a valley). The algorithm therefore does poorly if the image has objects with multiple distinct intensity distributions or if the objects are very dim, and therefore close in intensity to the background. It also performs poorly if the image suffers from uneven illumination. Enhancements to the algorithm have since been made to improve its performance in these situations.

References

  1. N. Otsu, "A Threshold Selection Method from Gray-Level Histograms," in IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, Jan. 1979, doi: 10.1109/TSMC.1979.4310076.