Image binarization – im2bw and graythresh
As I promised last time, I'm writing a series about functional designs for image binarization in the Image Processing Toolbox. Today I'll start by talking about im2bw and graythresh, two functions that have been in the product for a long time.
The function im2bw appeared in Image Processing Toolbox version 1.0, which shipped in early fall 1993. That was about the time I interviewed for my job at MathWorks. (I was a beta tester of version 1.0.)
Here is the help text from that early function:
%IM2BW Convert image to black and white by thresholding.
%   BW = IM2BW(X,MAP,LEVEL) converts the indexed image X with
%   colormap MAP to a black and white intensity image BW.
%   BW is 0 (black) for all pixels with luminance less
%   than LEVEL and 1 (white) for all other values.
%
%   BW = IM2BW(I,LEVEL) converts the gray level intensity image
%   I to black and white. BW is 0 (black) for all pixels with
%   value less than LEVEL and 1 (white) for all other values.
%
%   BW = IM2BW(R,G,B,LEVEL) converts the RGB image to black
%   and white. BW is 0 (black) for all pixels with luminance
%   less than LEVEL and 1 (white) for all other values.
%
%   See also IND2GRAY, RGB2GRAY.
At that time, the prefix "im" in the function name meant that the function could take more than one image type (indexed, intensity, RGB).
At this point in the early history of MATLAB, the language really only had one type. Everything in MATLAB was a double-precision matrix. This affected the early functional design in two ways. First, the toolbox established [0,1] as the conventional dynamic range for gray-scale images. This choice was influenced by the mathematical orientation of MATLAB as well as the fact that there was no one-byte-per-element data type. The second impact on functional design can be seen in the syntax IM2BW(R,G,B,LEVEL). RGB (or truecolor) images had to be represented with three different matrices, one for each color component. I really don't miss those days!
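To make the three-matrix syntax concrete, here is a minimal sketch in plain MATLAB of what a call like IM2BW(R,G,B,LEVEL) has to do: combine the color planes into a luminance image and compare it with LEVEL. The luminance weights below are the standard NTSC coefficients, which is an assumption; the original 1993 implementation is not shown here. The tiny 2-by-2 planes are made up for illustration.

```matlab
% Hypothetical 2-by-2 RGB image stored as three separate double matrices,
% each in the conventional [0,1] range.
R = [1 0; 0 0.2];
G = [1 0; 1 0.2];
B = [1 0; 0 0.2];
level = 0.5;

% Assumed luminance weights (NTSC); not taken from the original im2bw code.
Y = 0.299*R + 0.587*G + 0.114*B;

% Per the help text: 0 where luminance is less than LEVEL, 1 otherwise.
bw = Y >= level;
disp(bw)
```

White and pure green land above the threshold; black and dark gray land below it.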
Here are two examples, an indexed image and a gray-scale image.
[X,map] = imread('trees.tif');
imshow(X,map);
title('Original indexed image')

bw = im2bw(X,map,0.5);
imshow(bw)
title('Output of im2bw')

I = imread('cameraman.tif');
imshow(I)
title('Original gray-scale image')
xlabel('Cameraman image courtesy of MIT')

bw = im2bw(I,0.5);
imshow(bw)
title('Output of im2bw')
It turns out that im2bw had other syntaxes that did not appear in the documentation. Specifically, the LEVEL argument could be omitted. Here is the relevant code fragment:
if isempty(level), % Get level from user
    level = 0.5; % Use default for now
end
Experienced software developers will be amused by the code comment above, "Use default for now". This indicates that the developer intended to go back and do something else here before shipping but never did. Anyway, you can see that a LEVEL of 0.5 is used if you don't specify it yourself.
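A quick way to see the default in action, assuming a release in which im2bw is still available and still accepts a call without LEVEL, is to compare the two calls directly. This is only a sanity check on the behavior described above, not a look at the original source.

```matlab
% Requires Image Processing Toolbox (im2bw and the cameraman.tif sample image).
I = imread('cameraman.tif');

bwDefault  = im2bw(I);        % LEVEL omitted
bwExplicit = im2bw(I, 0.5);   % LEVEL given explicitly

% If the default is 0.5, the two outputs should be identical.
sameResult = isequal(bwDefault, bwExplicit)
```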
MATLAB 5 and Image Processing Toolbox version 2.0 shipped in early 1998. These were very big releases for both products. MATLAB 5 featured multidimensional arrays, cell arrays, structs, and many other features. MATLAB 5 also had something else that was big for image processing: numeric arrays that weren't double precision. At the time, you could make uint8, int8, uint16, int16, uint32, int32, and single arrays. However, there was almost no function or operator support for these arrays. The capability was so limited that we didn't even mention it in the MATLAB 5 documentation.
Image Processing Toolbox 2.0 provided support for (and documented) uint8 arrays. The other types went undocumented and largely unsupported in both MATLAB and the toolbox for a while longer.
Multidimensional array and uint8 support affected almost every function in the toolbox, so version 2.0 was a complex release, especially with respect to compatibility. We wanted to be able to handle uint8 and multidimensional arrays smoothly, to the degree possible, with existing user code.
One of the design questions that arose during this transition concerned the LEVEL argument for im2bw. Should the interpretation of LEVEL be different, depending on the data type of the input image? To increase the chance that existing user code would work as expected without change, even if the image data type changed from double to uint8, we adopted the convention that LEVEL would continue to be specified in the range [0,1], independent of the input image data type. That is, a LEVEL of 0.5 has the same visual effect for a double input image as it does for a uint8 input image.
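The convention can be sketched in a few lines of plain MATLAB: keep LEVEL in [0,1] and scale it by the data type's maximum value before comparing. This is an illustration of the idea only; the scaling by 255 and the use of >= (following the original help text) are assumptions about how one might implement it, not the toolbox's actual code.

```matlab
level = 0.5;                      % normalized threshold, same for both types

% The same hypothetical image in two representations.
Id = [0 0.25; 0.5 1];             % double, range [0,1]
Iu = uint8(round(Id * 255));      % uint8, range [0,255] -> [0 64; 128 255]

% Double image: compare directly against LEVEL.
bwDouble = Id >= level;

% uint8 image: scale LEVEL to the uint8 range before comparing.
bwUint8 = double(Iu) >= level * 255;

% Both representations binarize the same way.
sameResult = isequal(bwDouble, bwUint8)
```

User code that passes 0.5 keeps working whether the image arrives as double or uint8, which was the compatibility goal.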
Now, image processing as a discipline is infamous for its "magic numbers," such as threshold values like LEVEL, that need to be tweaked for every data set. Sometime around 1999 or 2000, we reviewed the literature about algorithms to compute thresholds automatically. There were only a handful that seemed to work reasonably well for a broad class of images, and one in particular seemed to be both popular and computationally efficient: N. Otsu, "A Threshold Selection Method from Gray-Level Histograms," IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, 1979, pp. 62-66. This is the one we chose to implement for the toolbox. It is the algorithm under the hood of the function graythresh, which was introduced in version 3.0 of the toolbox in 2001.
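The core idea of Otsu's method is to choose the histogram split that maximizes the between-class variance of the two resulting pixel classes. The sketch below works through that on a made-up 8-bin histogram in plain MATLAB. It illustrates the published method and is not the graythresh source; the tie-averaging step and the normalization to [0,1] are assumptions made for this example.

```matlab
% Hypothetical bimodal histogram: pixel counts for 8 gray levels.
counts  = [10 20 10 0 0 10 20 10];
numBins = numel(counts);

p       = counts / sum(counts);        % probability of each gray level
omega   = cumsum(p);                   % probability of class 0 for a split after bin k
mu      = cumsum(p .* (1:numBins));    % cumulative first moment up to bin k
muTotal = mu(end);                     % mean gray level of the whole histogram

% Between-class variance for each candidate split (Otsu, 1979).
sigmaB2 = (muTotal * omega - mu).^2 ./ (omega .* (1 - omega));
sigmaB2(~isfinite(sigmaB2)) = 0;       % splits that leave one class empty

% Pick the maximizing split; average the indices if several splits tie.
k = mean(find(sigmaB2 == max(sigmaB2)));

% Normalize the bin index to [0,1], the range im2bw expects for LEVEL.
level = (k - 1) / (numBins - 1)
```

For this histogram the variance is maximized anywhere in the empty gap between the two modes, so the averaged split falls in the middle of the gap.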
The function graythresh was designed to work well with the function im2bw. It takes a gray-scale image and returns the same normalized LEVEL value that im2bw uses. For example:
level = graythresh(I)
level = 0.3451
bw = im2bw(I,level);
imshow(bw)
title('Level computed by graythresh')
Aside from multilevel thresholding introduced in R2012b, this has been the state of image binarization in the Image Processing Toolbox for about the last 15 years.
There are a few weaknesses in this set of functional designs, though, and these weaknesses eventually led the development team to consider an overhaul.
- Most people felt that the value returned by graythresh would have been a better default LEVEL than 0.5.
- If you don't need to save the value of LEVEL, then you end up calling the functions in a slightly awkward way, passing the input image to each of the two functions: bw = im2bw(I,graythresh(I))
- Although Otsu's method really only needs to know the image histogram, you have to pass in the image itself to the graythresh function. This is awkward for some use cases, such as using the collective histogram of multiple images in a dataset to compute a single threshold.
- There was no locally adaptive thresholding method in the toolbox.
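For the multiple-image case, one possible workaround under this design is to concatenate the pixels of several images and pass them to graythresh as a single array, which has the same effect as computing the threshold from their pooled histogram. This is a sketch rather than a recommended practice: it assumes the images share a data type, it copies all the pixel data, and coins.png is simply a second sample image chosen for illustration.

```matlab
% Requires Image Processing Toolbox (graythresh, im2bw, and the sample images).
I1 = imread('cameraman.tif');
I2 = imread('coins.png');

% Pool the pixels of both uint8 images into one column vector.
pooled = [I1(:); I2(:)];

% One threshold computed from the combined pixel population.
level = graythresh(pooled);

% Apply the shared threshold to each image.
bw1 = im2bw(I1, level);
bw2 = im2bw(I2, level);
```

The extra copy of the pixel data is exactly the kind of awkwardness that a histogram-based interface would avoid.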
Next time I plan to discuss the new image binarization functional designs in R2016a.
Also, thanks very much to ez, PierreC, Matt, and Mark for their comments on the previous post.