Otsu's method
Otsu's method is an image processing algorithm commonly used to compute a threshold level to generate a mask of objects within an image. The original algorithm[1] calculates a global threshold - a single threshold level is calculated that divides the pixels in an image into two classes (sometimes called the background and foreground). This threshold is typically determined by using an exhaustive search to identify the grayscale value which maximizes the variance between the two classes. Thus, this algorithm works best for images which have objects which have significantly different intensities compared to the background, e.g. for fluorescence images.
How it works
Otsu's original algorithm was developed for grayscale images, which can have pixel values between and , where is the number of bits in the image. The output of the algorithm is a threshold level , which splits the pixels in the image into two classes. Thus, the two classes will consist of all pixels between and . These classes are labeled 1 and 2 in the equations below.
To find the threshold level, the algorithm computes either the within-class variance or the between-class variance . These variances are statistical metrics defined by the following equations:
Note that only one of the metrics - within-class variance or between-class variance - needs to be computed as they are related by the equation . Thus maximizing is the same as minimizing .
The probabilities of a pixel being in class 1 (or 2) are given by
where is the probability of a pixel of value (i.e., the height of each bar in the image histogram), and and are the number of pixels in class 1 or class 2. The mean values and the variances of the two classes are given by the following equations:
The algorithm performs an exhaustive search, calculating these values for every possible value for the threshold (i.e., between and ). The optimal threshold (at least according to the original paper) is the one that best separates the two classes, i.e. provides the largest between-class variance .
Implementation
Note that these quantities can be calculated off the intensity histogram. Most implementations use this to speed up calculations.
Additionally, as noted in the original paper, the statistic relies on second-order statistics (variances) while relies only on first-order statistics (means). Hence, is often used as it is computationally simplest (i.e., requires the least number of computational operations).
Example
How the algorithm works is illustrated by the following example. As shown in Fig. 1, the example image used is 6 x 6 pixel and shows a bright object in the center (e.g., a bright fluorescent bead). The image intensity histogram for this image is shown on the right.
The algorithm starts by choosing a threshold value and calculates the within-class and between-class variances as shown in Template:EquationNote and Template:EquationNote above. For example, assuming a threshold value of 1, the pixels in the image will be split into two classes as shown in Fig. 2.
The probability, mean, and variance for the pixels in Class 1 are then given by:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \mu_1 &= \frac{\sum_i p_i x_i}{N_1}\\ &= \frac{(9 \cdot 0) + (6 \cdot 1)}{9 + 6}\\ & = 0.4 \end{align} }
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \sigma^2_1 &= \frac{(9 \cdot (0 - 0.4)^2 + 6 \cdot (1 - 0.4)^2)}{9 + 6}\\ &= 0.24 \end{align} }
The probability, mean, and variance for the pixels in Class 2 are given by:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \omega_2 &= \frac{N_2}{N_1 + N_2}\\ &= \frac{ 4 + 5 + 8 + 4}{9 + 6 + 4 + 5 + 8 + 4}\\ &= 0.58 \end{align} }
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \mu_2 &= \frac{\sum_i p_i x_i}{N_2}\\ &= \frac{(4 \cdot 2) + (5 \cdot 3) + (8 \cdot 4) + (4 \cdot 4)}{4 + 5 + 8 + 4}\\ & = 3.94 \end{align} }
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \sigma^2_2 &= \frac{(4 \cdot (2 - 3.57)^2 + 5 \cdot (3 - 3.57)^2) + 8 \cdot (4 - 3.57)^2 + 4 \cdot (5 - 3.57)^2}{4 + 5 + 8 + 4}\\ &= 1.01 \end{align} }
Using these values, the within-class and between-class variances are:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \sigma^2_W &= \omega_1 \sigma^2_1 + \omega_2 \sigma^2_2\\ &= 0.42 \cdot 0.24 + 0.58 \cdot 1.01\\ &= 0.69 \end{align} }
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \begin{align} \sigma^2_B &= \omega_1 \omega_2 (\mu_1 - \mu_2)^2\\ &= 0.53 \cdot 0.47 (0.4 - 3.57)^2\\ &= 2.44 \end{align} }
To find the optimal threshold value, the algorithm performs these calculations for each possible value. In this example, the algorithm would compute the values between 0 and 5 since these is the range of intensities present in the image. The table below summarizes the calculations:
Threshold | 0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \omega_1} | 0.25 | 0.42 | 0.53 | 0.67 | 0.89 | 1 |
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mu_1} | 0 | 0.4 | 0.74 | 1.21 | 1.91 | 2.25 |
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_1} | 0 | 0.24 | 0.62 | 1.33 | 2.46 | 3.13 |
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \omega_2} | 0.75 | 0.58 | 0.47 | 0.33 | 0.11 | 0 |
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \mu_2} | 3 | 3.57 | 3.94 | 4.33 | 5 | 0 |
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_2} | 1.93 | 1.01 | 0.53 | 0.22 | 0 | 0 |
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_W} | 1.44 | 0.69 | 0.57 | 0.96 | 2.19 | 0 |
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_B} | 1.69 | 2.44 | 2.56 | 2.17 | 0 | 0 |
The algorithm then outputs the threshold value that gives the lowest value for Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_W}
or similarly the highest value for Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sigma^2_B}
, which is 2. Figure 3 below shows the result of a mask generated using the thresholding operation Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle I > 2}
.
Advantages and disadvantages
Otsu's method provides a (computationally) fast way to determine a suitable threshold level during masking. However, the original algorithm assumes that there are two distinct classes of intensity (i.e., that the intensity histogram shows two peaks separated by a valley). The algorithm therefore does poorly if the image has objects with multiple distinct intensity distributions or if the objects are very dim, and therefore close in intensity to the background. It also performs poorly if the image suffers from uneven illumination. Enhancements to the algorithm have since been made to improve its performance in these situations.
References
- ↑ N. Otsu, "A Threshold Selection Method from Gray-Level Histograms," in IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62-66, Jan. 1979, doi: 10.1109/TSMC.1979.4310076.