Steve on Image Processing

November 26th, 2011

Exploring shortest paths – part 2

Earlier this month I started exploring the question of computing the shortest path from one object in a binary image to another. I described the following basic algorithm:

  1. Compute the distance transform for just the upper left block of foreground pixels.
  2. Compute the distance transform for just the lower right block of foreground pixels.
  3. Add the two distance transforms together.
  4. The pixels in the regional minimum of the result lie along one or more of the shortest paths from one block to the other.

This algorithm only works for path-based approximations to the Euclidean distance transform; it doesn't work for the Euclidean distance transform itself.

The Image Processing Toolbox function supports three path-based approximations to the Euclidean distance transform: 'cityblock', 'chessboard', and 'quasi-euclidean'.

Today I want to compare these three and discuss ambiguities associated with the idea of "shortest path" on an image.

Let's work with another small test image:

bw = logical([ ...
    0 0 0 0 0 0 0
    0 1 0 0 0 0 0
    0 0 0 0 0 0 0
    0 0 0 0 0 0 0
    0 0 0 0 0 0 0
    0 0 0 0 0 0 0
    0 0 0 0 0 0 0
    0 0 0 0 0 0 0
    0 0 0 0 0 0 0
    0 0 0 0 0 1 0
    0 0 0 0 0 0 0  ]);

One way to get two binary images, each containing one object from the original, is to use a label matrix:

L = bwlabel(bw);
bw1 = (L == 1)
bw1 =

     0     0     0     0     0     0     0
     0     1     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0

bw2 = (L == 2)
bw2 =

     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     1     0
     0     0     0     0     0     0     0

Step 1 of the algorithm is to compute the distance transform for the first object. I'll use the 'cityblock' distance transform approximation. In this approximation, only horizontal and vertical path segments are allowed. Diagonal segments are not allowed. The distance between a pixel and any of its N, E, S, or W neighbors is 1.

D1 = bwdist(bw1, 'cityblock')
D1 =

     2     1     2     3     4     5     6
     1     0     1     2     3     4     5
     2     1     2     3     4     5     6
     3     2     3     4     5     6     7
     4     3     4     5     6     7     8
     5     4     5     6     7     8     9
     6     5     6     7     8     9    10
     7     6     7     8     9    10    11
     8     7     8     9    10    11    12
     9     8     9    10    11    12    13
    10     9    10    11    12    13    14

Step 2 is to compute the distance transform for the second object.

D2 = bwdist(bw2, 'cityblock')
D2 =

    14    13    12    11    10     9    10
    13    12    11    10     9     8     9
    12    11    10     9     8     7     8
    11    10     9     8     7     6     7
    10     9     8     7     6     5     6
     9     8     7     6     5     4     5
     8     7     6     5     4     3     4
     7     6     5     4     3     2     3
     6     5     4     3     2     1     2
     5     4     3     2     1     0     1
     6     5     4     3     2     1     2

Step 3 is to add the distance transforms together.

D = D1 + D2
D =

    16    14    14    14    14    14    16
    14    12    12    12    12    12    14
    14    12    12    12    12    12    14
    14    12    12    12    12    12    14
    14    12    12    12    12    12    14
    14    12    12    12    12    12    14
    14    12    12    12    12    12    14
    14    12    12    12    12    12    14
    14    12    12    12    12    12    14
    14    12    12    12    12    12    14
    16    14    14    14    14    14    16

The smallest value of D, 12, is the minimum path length (for paths consisting only of horizontal and vertical segments) from one object to the other.

Step 4 is to find the set of pixels in the regional minimum of D. This set represents the shortest path.

paths = imregionalmin(D);

To help visualize the path I'll use imoverlay from the MATLAB Central File Exchange. The code below will overlay the pixels on the shortest path in gray.

P = false(size(bw));
P = imoverlay(P, paths, [.5 .5 .5]);
P = imoverlay(P, bw, [1 1 1]);
imshow(P, 'InitialMagnification', 'fit')

Uh oh, why do we have a big rectangular block of gray instead of a single path? The answer is that there isn't a unique shortest path. There are many paths you could travel to get from one object to the other that all have the same path length, 12. Here are just a view of them:

subplot(2,2,1)
imshow(P, 'InitialMagnification', 'fit')
x = [2 6 6];
y = [2 2 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off

subplot(2,2,2)
imshow(P, 'InitialMagnification', 'fit')
x = [2 2 6];
y = [2 10 10];
hold on
plot(x, y, 'g', 'LineWidth', 2);
hold off

subplot(2,2,3)
imshow(P, 'InitialMagnification', 'fit')
x = [2 3 3 4 4 5 5 6 6 6 6 6 6];
y = [2 2 3 3 4 4 5 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off

subplot(2,2,4)
imshow(P, 'InitialMagnification', 'fit')
x = [2 2 2 2 2 2 3 3 4 4 5 5 6];
y = [2 3 4 5 6 7 7 8 8 9 9 10 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off

Next, let's look at the 'chessboard' distance. In this distance transform approximation, a path can consist of horizontal, vertical, and diagonal segments, and the path length between a pixel and any of its neighbors (N, NE, E, SE, S, SW, W, and NW) is 1.0.

D1 = bwdist(bw1, 'chessboard')
D2 = bwdist(bw2, 'chessboard')
D = D1 + D2
D1 =

     1     1     1     2     3     4     5
     1     0     1     2     3     4     5
     1     1     1     2     3     4     5
     2     2     2     2     3     4     5
     3     3     3     3     3     4     5
     4     4     4     4     4     4     5
     5     5     5     5     5     5     5
     6     6     6     6     6     6     6
     7     7     7     7     7     7     7
     8     8     8     8     8     8     8
     9     9     9     9     9     9     9


D2 =

     9     9     9     9     9     9     9
     8     8     8     8     8     8     8
     7     7     7     7     7     7     7
     6     6     6     6     6     6     6
     5     5     5     5     5     5     5
     5     4     4     4     4     4     4
     5     4     3     3     3     3     3
     5     4     3     2     2     2     2
     5     4     3     2     1     1     1
     5     4     3     2     1     0     1
     5     4     3     2     1     1     1


D =

    10    10    10    11    12    13    14
     9     8     9    10    11    12    13
     8     8     8     9    10    11    12
     8     8     8     8     9    10    11
     8     8     8     8     8     9    10
     9     8     8     8     8     8     9
    10     9     8     8     8     8     8
    11    10     9     8     8     8     8
    12    11    10     9     8     8     8
    13    12    11    10     9     8     9
    14    13    12    11    10    10    10

You can see that the shortest path length for the 'chessboard' distance is 8 instead of 12. Let's look at the pixels on the various shortest paths.

paths = imregionalmin(D)

P = false(size(bw));
P = imoverlay(P, paths, [.5 .5 .5]);
P = imoverlay(P, bw, [1 1 1]);
clf
imshow(P, 'InitialMagnification', 'fit')
paths =

     0     0     0     0     0     0     0
     0     1     0     0     0     0     0
     1     1     1     0     0     0     0
     1     1     1     1     0     0     0
     1     1     1     1     1     0     0
     0     1     1     1     1     1     0
     0     0     1     1     1     1     1
     0     0     0     1     1     1     1
     0     0     0     0     1     1     1
     0     0     0     0     0     1     0
     0     0     0     0     0     0     0

Now the set of gray pixels is not rectangular, but it still encompasses multiple possible shortest paths. And, surprisingly, it seems to include some pixels on which the path actually moves temporarily away from its destination.

Here are several possible paths, all of length 8 (according to the 'chessboard' distance transform).

subplot(2,2,1)
imshow(P, 'InitialMagnification', 'fit')
x = [2 3 4 5 6 6 6 6 6];
y = [2 3 4 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off

subplot(2,2,2)
imshow(P, 'InitialMagnification', 'fit')
x = [2 2 2 2 2 3 4 5 6];
y = [2 3 4 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2);
hold off

subplot(2,2,3)
imshow(P, 'InitialMagnification', 'fit')
x = [2 1 2 1 2 3 4 5 6];
y = [2 3 4 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off

subplot(2,2,4)
imshow(P, 'InitialMagnification', 'fit')
x = [2 3 4 3 2 3 4 5 6];
y = [2 3 4 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off

Finally, let's look at the 'quasi-euclidean' distance transform. In this variation, paths from one pixel to another can consist of horizontal, vertical, and diagonal segments. The distance from a pixel to one of its horizontal or vertical neighbors is 1, while the distance to one of its diagonal neighbors is sqrt(2).

Here's the computation again, this time specifying 'quasi-euclidean':

D1 = bwdist(bw1, 'quasi-euclidean');
D2 = bwdist(bw2, 'quasi-euclidean');
D = D1 + D2
D =

   12.4853   11.6569   11.6569   12.2426   12.8284   13.4142   14.8284
   11.0711    9.6569   10.2426   10.8284   11.4142   12.0000   13.4142
   10.4853    9.6569    9.6569   10.2426   10.8284   11.4142   12.8284
   10.4853    9.6569    9.6569    9.6569   10.2426   10.8284   12.2426
   10.4853    9.6569    9.6569    9.6569    9.6569   10.2426   11.6569
   11.0711    9.6569    9.6569    9.6569    9.6569    9.6569   11.0711
   11.6569   10.2426    9.6569    9.6569    9.6569    9.6569   10.4853
   12.2426   10.8284   10.2426    9.6569    9.6569    9.6569   10.4853
   12.8284   11.4142   10.8284   10.2426    9.6569    9.6569   10.4853
   13.4142   12.0000   11.4142   10.8284   10.2426    9.6569   11.0711
   14.8284   13.4142   12.8284   12.2426   11.6569   11.6569   12.4853

You can see that the shortest path length is approximately 9.6569.

min(D(:))
ans =

    9.6569

As before, let's find the set of pixels that belong to a shortest path.

paths = imregionalmin(D)
paths =

     0     0     0     0     0     0     0
     0     1     0     0     0     0     0
     0     1     1     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     0     0     0
     0     0     0     0     1     1     0
     0     0     0     0     0     1     0
     0     0     0     0     0     0     0

That looks like trouble! The shortest-path pixels don't actually connect the two objects, as you can see below:

P = false(size(bw));
P = imoverlay(P, paths, [.5 .5 .5]);
P = imoverlay(P, bw, [1 1 1]);
clf
imshow(P, 'InitialMagnification', 'fit')

Where has our algorithm gone astray? That's the question I'll tackle next time.

All the posts in this series

  • the basic idea of finding shortest paths by adding two distance transforms together (part 1)
  • the nonuniqueness of the shortest paths (part 2)
  • handling floating-point round-off effects (part 3)
  • using thinning to pick out a single path (part 4)
  • using bwdistgeodesic to find shortest paths subject to constraint (part 5)


Get the MATLAB code

Published with MATLAB® 7.13

November 1st, 2011

Exploring shortest paths – part 1

Blog reader Bogdan asked this question last week:

I am trying to find the length and location of the shortest path between objects. It seems like some application of bwdist should be able to do this. Any advice? Example:

1 0 1
0 1 0
0 0 0
0 1 0
1 0 1

Should yield

0 0 0
0 0 0
0 p 0
0 0 0
0 0 0

With p = path length = 1

I agree, some application of bwdist should do it. Actually, there are several other useful functions that can be applied to this task, including imregionalmin and bwdistgeodesic, a new function that just shipped in R2011b.

I've been tinkering with this problem off and on for a few days, and I think it might take a few posts to explore some of the interesting ideas and issues posed by this problem. Today I'll start with the basics of using bwdist and imregionalmin to identify a set of shortest paths from one object to another.

Let's use this simple test image to start our exploration.

bw = [ ...
    0 0 0 0 0 0 0 0 0 0 0 0
    0 1 1 0 0 0 0 0 0 0 0 0
    0 1 1 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 1 1 0
    0 0 0 0 0 0 0 0 0 1 1 0
    0 0 0 0 0 0 0 0 0 0 0 0   ]
bw =

     0     0     0     0     0     0     0     0     0     0     0     0
     0     1     1     0     0     0     0     0     0     0     0     0
     0     1     1     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     1     1     0
     0     0     0     0     0     0     0     0     0     1     1     0
     0     0     0     0     0     0     0     0     0     0     0     0

We want to find the shortest path (or paths!) from the upper left block of foreground pixels to the lower right block. The key algorithmic idea is this:

  1. Compute the distance transform for just the upper left block of foreground pixels.
  2. Compute the distance transform for just the lower right block of foreground pixels.
  3. Add the two distance transforms together.
  4. The pixels in the regional minimum of the result lie along one or more of the shortest paths from one block to the other.

(For reference, see P. Soille, Morphological Image Analysis: Principles and Applications, 2nd edition, Springer, 2003, section 7.3.4, "Application to minimal path detection," pp. 233-234.)

But we can't use the normal Euclidean distance transform because it doesn't correspond to a path traversal from one pixel center to another. Rather, we have to use one of the distance transform approximations supported by bwdist: 'chessboard', 'cityblock', and 'quasi-euclidean'.

Here's how to do it using 'quasi-euclidean'.

bw1 = [ ...
    0 0 0 0 0 0 0 0 0 0 0 0
    0 1 1 0 0 0 0 0 0 0 0 0
    0 1 1 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0   ];

bw2 = [ ...
    0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0 0 1 1 0
    0 0 0 0 0 0 0 0 0 1 1 0
    0 0 0 0 0 0 0 0 0 0 0 0   ];

D1 = bwdist(bw1, 'quasi-euclidean');
D2 = bwdist(bw2, 'quasi-euclidean');

D_sum = D1 + D2;
D_sum(3:5, 3:10)
ans =

    7.8284    7.8284    7.8284    7.8284    7.8284    7.8284    8.4142    9.0000
    8.4142    7.8284    7.8284    7.8284    7.8284    7.8284    7.8284    8.4142
    9.0000    8.4142    7.8284    7.8284    7.8284    7.8284    7.8284    7.8284

You can see that the smallest value of D_sum is about 7.8284, and there's a set of pixels in-between the two objects that all have that value. The function imregionalmin identifies them all:

path_pixels = imregionalmin(D_sum)
path_pixels =

     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     1     1     1     1     1     1     0     0     0     0
     0     0     0     1     1     1     1     1     1     0     0     0
     0     0     0     0     1     1     1     1     1     1     0     0
     0     0     0     0     0     0     0     0     0     0     0     0
     0     0     0     0     0     0     0     0     0     0     0     0

In the quasi-euclidean approximation to the distance transform, the distance from a pixel to one of its horizontal or vertical neighbors is 1, and the distance from a pixel to one of its diagonal neighbors is sqrt(2). The result above says that the shortest such path between the two objects is 7.8284. Furthermore, it says that there is more than one shortest path.

I plan to explore this topic more fully this month. When using the quasi-euclidean distance transform, for example, there are floating-point round-off issues that have to be dealt with when you try to apply the technique above to larger images. Another issue is the effect of using different path-based distance transform approximations, such as 'cityblock' or 'chessboard'. Yet another issue is how to identify specific paths from the family of paths identified by imregionalmin. Along the way I'll experiment with different ways to visualize the results.

Finally, I'll explore how to use a new function, bwdistgeodesic, to find shortest paths when there are constraints on where the paths can go.

All the posts in this series

  • the basic idea of finding shortest paths by adding two distance transforms together (part 1)
  • the nonuniqueness of the shortest paths (part 2)
  • handling floating-point round-off effects (part 3)
  • using thinning to pick out a single path (part 4)
  • using bwdistgeodesic to find shortest paths subject to constraint (part 5)


Get the MATLAB code

Published with MATLAB® 7.13

October 21st, 2011

Five years ago: May, June, and July 2006

Much of the information I posted in this blog years ago is still useful today. Image processing theory hasn't been completely overturned since then, and I'm still talking about MATLAB after all. For the benefit of readers who have joined the party more recently, I will occasionally recap the useful bits that I posted about five years ago.

In May, June, and July of 2006 I was still writing posts in my spatial transforms series (inverse mapping; where is the output image; findbounds; translation confusion; and handling noninvertible cases).

I wrote about the L*a*b* color space a couple of times (a Lab-based uniform color scale; hue shifts near the L*=0 axis).

I described an algorithm trick for computing fast local sums.

I showed how to use regionprops to compute the centroids of the pegs in this MRI phantom:

And I answer a popular question (one that Loren has also addressed in her blog): how to do batch processing on a bunch of files, like this:

 files = dir('*.JPG');
 for k = 1:numel(files)
    rgb = imread(files(k).name);
    rgb = rgb(1:1800, 520:2000, :);
    rgb = imresize(rgb, 0.2, 'bicubic');
    imwrite(rgb, ['cropped\' files(k).name]);
 end


Get the MATLAB code

Published with MATLAB® 7.13

October 4th, 2011

Binary image convex hull – algorithm notes

Today I want to tell a little image processing algorithm story related to my post last week about the new bwconvhull function in the Image Processing Toolbox.

The developer (Brendan) who worked on this function came to see me sometime last year to find out how the 'ConvexImage' measurement offered by regionprops was computed so that he could use the same procedure for bwconvhull. I believed that I knew the answer off the top of my head, so without looking at the code I rattled off the following steps:

  1. Compute the x- and y-coordinates for the four corners of all the foreground pixels in the binary image.
  2. Use convhull to compute the convex hull of the (x,y) pairs from step 1.
  3. Use poly2mask to convert the convex hull polygon to a binary image mask.

A few days later Brendan came back to tell me that, although my description was clear, the code that I wrote ten years ago for regionprops actually does something else.

Oops.

I looked at the code and realized that Brendan was right, and I started to remember that I had actually made this same mistake many years before.

In fact, I did originally implement 'ConvexImage' in regionprops using the procedure outlined above. Before it shipped, though, I discovered (and fixed) a big problem with it.

Let me show you the problem using a small example. Here's a tiny binary image.

bw = [...
    0 0 0 0 0 0 0 0 0
    0 0 0 0 1 0 0 0 0
    0 0 0 1 0 1 0 0 0
    0 0 1 0 0 0 1 0 0
    0 1 0 0 0 0 0 1 0
    0 0 0 0 0 0 0 0 0 ];
imshow(bw, 'InitialMagnification', 'fit')

Now here's the convex image as computed by bwconvhull. The "filled-in" pixels are shown in light gray.

bwch = bwconvhull(bw)
hold on
h = imshow(bwch);
set(h, 'AlphaData', 0.8);
hold off
bwch =

     0     0     0     0     0     0     0     0     0
     0     0     0     0     1     0     0     0     0
     0     0     0     1     1     1     0     0     0
     0     0     1     1     1     1     1     0     0
     0     1     1     1     1     1     1     1     0
     0     0     0     0     0     0     0     0     0

Now I'll graphically illustrate the computational procedure above. First, compute the set of corner locations for all the foreground pixels.

[y,x] = find(bw);
dx = [-0.5 -0.5  0.5  0.5];
dy = [-0.5  0.5 -0.5  0.5];
x_corners = bsxfun(@plus, x, dx);
y_corners = bsxfun(@plus, y, dy);
x_corners = x_corners(:);
y_corners = y_corners(:);

Visualize step 1 by superimposing those corner locations on the input image.

imshow(bw, 'InitialMagnification', 'fit')
hold on
plot(x_corners, y_corners, 'og')
hold off

Step 2: Compute the convex hull of all the corner points. Visualize by superimposing the convex hull in red.

k = convhull(x_corners, y_corners);
x_hull = x_corners(k);
y_hull = y_corners(k);
hold on
plot(x_hull, y_hull, 'r', 'LineWidth', 4)
hold off

Step 3: Use poly2mask to convert the convex hull to a binary image mask.

mask = poly2mask(x_hull, y_hull, 6, 9);

imshow(bw, 'InitialMagnification', 'fit')
hold on
h = imshow(mask);
set(h, 'AlphaData', 0.8)
plot(x_hull, y_hull, 'r', 'LineWidth', 4)
hold off

You can see that there are extra pixels along the diagonal edges that got put into the mask. That's not good. If you repeat the operation, those diagonals will keep filling out until you're left with a rectangle. That's not supposed to happen with repeated application of the convex hull.

The reason this is happening is that the convex hull goes exactly through the center of pixels that are along the diagonal but "outside" the original set of foreground pixels. Because of the tie-breaking rules applied by poly2mask, all (in this case) of those outside pixels got included in the output mask.

The solution that I settled on ten years ago, and which is now also used in bwconvhull, was to modify the first step of the procedure. Instead of collecting the set of corner points of each foreground pixel, the correct procedure collects points along the center of each edge of each foreground pixel.

Here's what that looks like.

dx = [0.0  0.0  0.5 -0.5];
dy = [0.5 -0.5  0.0  0.0];
x_edges = bsxfun(@plus, x, dx);
y_edges = bsxfun(@plus, y, dy);
x_edges = x_edges(:);
y_edges = y_edges(:);

k = convhull(x_edges, y_edges);
x_hull = x_edges(k);
y_hull = y_edges(k);

imshow(bw, 'InitialMagnification', 'fit')
hold on
plot(x_edges, y_edges, 'og')
plot(x_hull, y_hull, 'r', 'LineWidth', 4)
hold off
mask = poly2mask(x_hull, y_hull, 6, 9);
imshow(mask, 'InitialMagnification', 'fit')
hold on
plot(x_hull, y_hull, 'r', 'LineWidth', 4)
plot(x_edges, y_edges, 'og')
hold off

Much better!

I have been fooled more often than I would like to admit by sometimes nonintuitive digital image geometry. For you image processing algorithm people out there, I hope seeing these pictures and techniques will help you avoid similar pitfalls someday.


Get the MATLAB code

Published with MATLAB® 7.13

September 30th, 2011

Binary image convex hull

I've been intending to mention a new function bwconvhull that was introduced in the Image Processing Toolbox last spring in the R2011a release. Now that R2011b is out, I figure I better go ahead and do it!

bwconvhull computes the "convex hull of a binary image." Now I have to admit that this terminology is a little loose, so I'd better clarify. The convex hull of a set of 2-D points is the smallest convex polygon that contains the entire set. Here's an example from the MATLAB documentation for convhull:

xx = -1:.05:1; yy = abs(sqrt(xx));
[x,y] = pol2cart(xx,yy);
k = convhull(x,y);
plot(x(k),y(k),'r-',x,y,'b+')

The polygon in red is the convex hull of the set of points shown in blue. So convhull takes a set of points and returns a polygon, whereas bwconvhull takes a binary image and returns another binary image. What's up with that? It means simply that bwconvhull computes the convex hull of all the foreground pixels in the input image, and then it produces an output binary image with all the pixels inside the convex hull set to white. It's a little easier to show than to say, so here's what it looks like:

bw = imread('text.png');
imshow(bw)
bw2 = bwconvhull(bw);
imshow(bw2);

Let's overlay the foreground pixel locations from the input image (using blue dots) and the convex hull computed by convhull (using a thick red line).

[y, x] = find(bw);
k = convhull(x, y);
hold on
plot(x, y, 'b.')
plot(x(k), y(k), 'r', 'LineWidth', 4)
hold off

An important variation supported by bwconvhull is to compute the convex hulls of the individual objects in the input image. Here's how you do that:

bw3 = bwconvhull(bw, 'objects');
imshow(bw3)

Something called the convex deficiency is sometimes used in shape recognition applications. Loosely speaking, the convex deficiency of a shape is the convex hull of the shape minus the shape. Here's an example using the letter "T" from the text image above.

bwt = bw(7:24, 4:18);
imshow(bwt, 'InitialMagnification', 'fit')
bwtc = bwconvhull(bw_t);
imshow(bwtc, 'InitialMagnification', 'fit')
title('Convex hull image')
bwtcd = bwtc & ~bwt;
imshow(bwtcd, 'InitialMagnification', 'fit')
title('Convex deficiency image')

The convex deficiency of the letter "T" has two connected components. This kind of measurement can be useful for recognizing shapes.


Get the MATLAB code

Published with MATLAB® 7.13

September 27th, 2011

Digital image processing using MATLAB: reading image files

Today's post is part of an ongoing tutorial series on digital image processing using MATLAB. I'm covering topics in roughly the order used in the book Digital Image Processing Using MATLAB.

Contents

Reading images

The MATLAB function imread reads image data from a variety of formats, including:

  • Windows Bitmap (BMP)
  • Windows Cursor resources (CUR)
  • Flexible Image Transport System (FITS)
  • Graphics Interchange Format (GIF)
  • Hierarchical Data Format (HDF)
  • Windows Icon resources (ICO)
  • JPEG 2000
  • Joint Photographic Experts Group (JPEG)
  • Portable Bitmap (PBM)
  • Windows Paintbrush (PCX)
  • Portable Graymap (PGM)
  • Portable Network Graphics (PNG)
  • Portable Any Map (PNM)
  • Portable Pixmap (PPM)
  • Sun Raster (RAS)
  • Tagged Image File Format (TIFF)
  • X Window Dump (XWD)

For example, the following line reads the pixels from a PNG file into the MATLAB variable I:

I = imread('rice.png');

After you run the code above, the Workspace Browser shows you that your variable I is a 256x256 matrix of uint8 (unsigned eight-bit integer) values in the range [40,204].

Although imread uses the file extension (such as .png) to help determine the file format, it can also determine the format automatically. Suppose, for example, you have a set of files in which the file extension is used to indicate a particular data series, which I'll simulate here by copying rice.png to another filename:

s = which('rice.png')
s =

C:\MATLABs\R2011b\toolbox\images\imdemos\rice.png

copyfile(s, 'rice.series-x03a')
dir('rice.*')
rice.series-x03a  

I = imread('rice.series-x03a');
imshow(I)

Reading from multi-image TIFF files

TIFF files can store multiple independent images. The function imread has a syntax for specifying which image you want. The file mri.tif (download link) contains 27 images. Here is how you would read the 1st, 6th, and 11th images in the file.

I1 = imread('mri.tif', 'Index', 1);
I6 = imread('mri.tif', 'Index', 6);
I11 = imread('mri.tif', 'Index', 11);

Reading subsets of an image

For very large image files it can be useful to read in only a subset of the image pixels. The imread function can do this for both TIFF and JPEG 2000 files. Here's how to read rows 50-100 and columns 150-200 of shadow.tif:

I = imread('shadow.tif', 'PixelRegion', {[50 100], [150 200]});
imshow(I)

You can use the MATLAB function imfinfo to read metadata about an image file without reading in all the pixel data. Here's an example:

imfinfo('peppers.png')
ans = 

                  Filename: 'C:\MATLABs\R2011b\toolbox\images\imdemos\peppers.png'
               FileModDate: '16-Dec-2002 06:10:58'
                  FileSize: 287677
                    Format: 'png'
             FormatVersion: []
                     Width: 512
                    Height: 384
                  BitDepth: 24
                 ColorType: 'truecolor'
           FormatSignature: [137 80 78 71 13 10 26 10]
                  Colormap: []
                 Histogram: []
             InterlaceType: 'none'
              Transparency: 'none'
    SimpleTransparencyData: []
           BackgroundColor: []
           RenderingIntent: []
            Chromaticities: []
                     Gamma: []
               XResolution: []
               YResolution: []
            ResolutionUnit: []
                   XOffset: []
                   YOffset: []
                OffsetUnit: []
           SignificantBits: []
              ImageModTime: '16 Jul 2002 16:46:41 +0000'
                     Title: []
                    Author: []
               Description: 'Zesty peppers'
                 Copyright: 'Copyright The MathWorks, Inc.'
              CreationTime: []
                  Software: []
                Disclaimer: []
                   Warning: []
                    Source: []
                   Comment: []
                 OtherText: []

The display shows that peppers.png is a truecolor image with 24 bits per pixel. The size of the file is 287677 bytes, and it was last modified in the morning of December 16, 2002, during a period of heavy snowfall. By capturing the output of imfinfo in a variable, you can write code based on this information, such as:

info = imfinfo('peppers.png');
num_rows = info.Height;
num_cols = info.Width;

Sample image files

The files rice.png, shadow.tif, and peppers.png read by the code above are sample images that ships with Image Processing Toolbox. I use sample images from the toolbox a lot in this blog because I want readers to be able to run the code examples in my posts. You can see a list of the sample image files in the toolbox with this command:

help imdemos
  Image Processing Toolbox --- demos and sample images
 
    iptdemos           - Index of Image Processing Toolbox demos.
 
    ipexaerial         - Registering an Aerial Photo to an Orthophoto.
    ipexangle          - Measuring Angle of Intersection.
    ipexbatch          - Batch Processing ImageFiles in Parallel.
    ipexblind          - Deblurring Images Using a Blind Deconvolution Filter.
    ipexblockprocedge  - Block Processing Large Images.
    ipexblockprocstats - Computing Statistics for Large Images.
    ipexcell           - Detecting a Cell Using Image Segmentation.
    ipexcheckerboard   - Creating a Gallery of Transformed Images.
    ipexconformal      - Exploring a Conformal Mapping.
    ipexcontrast       - Contrast Enhancement Techniques.
    ipexfabric         - Color-based Segmentation Using the L*a*b* Color Space.
    ipexhistology      - Color-based Segmentation Using K-Means Clustering.
    ipexlanstretch     - Enhancing Multispectral Color Composite Images.
    ipexlucy           - Deblurring Images Using a Lucy-Richardson Filter.
    ipexndvi           - Finding Vegetation in a Multispectral Image.
    ipexnormxcorr2     - Registering an Image Using Normalized Cross-Correlation
    ipexpendulum       - Finding the Length of a Pendulum in Motion.
    ipexprops          - Measuring Regions in Grayscale Images.
    ipexradius         - Measuring the Radius of a Roll of Tape.
    ipexreconstruct    - Reconstructing an Image from Projection Data.
    ipexregularized    - Deblurring Images Using a Regularized Filter.
    ipexrice           - Correcting Nonuniform Illumination.
    ipexrotate         - Finding the Rotation and Scale of a Distorted Image.
    ipexroundness      - Identifying Round Objects.
    ipexshear          - Padding and Shearing an Image Simultaneously.
    ipexsnow           - Granulometry of Snowflakes. 
    ipextexturefilter  - Texture Segmentation Using Texture Filters.
    ipextraffic        - Detecting Cars in a Video of Traffic.
    ipexwatershed      - Marker-controlled watershed segmentation.
    ipexwiener         - Deblurring Images Using the Wiener Filter.
 
  Extended-example helper files.
    HistogramAccumulator           - Used by blockproc stats example.
    batchDetectCells               - Used by batch processing example.
    batchProcessFiles              - Used by batch processing example.
    conformalForward1              - Used by conformal mapping example.
    conformalForward2              - Used by conformal mapping example.
    conformalInverse               - Used by conformal mapping example.
    conformalInverseClip           - Used by conformal mapping example.
    conformalSetupInputAxes        - Used by conformal mapping example.
    conformalSetupOutputAxes       - Used by conformal mapping example.
    conformalShowLines             - Used by conformal mapping example.
    conformalShowCircles           - Used by conformal mapping example.
    conformalShowInput             - Used by conformal mapping example.
    conformalShowOutput            - Used by conformal mapping example.
    propsSynthesizeImage           - Used by measuring regions example.
    LanAdapter                     - Used by blockproc stats example.
 
  Sample MAT-files.
    imdemos.mat           - Images used in demos.
    pendulum.mat          - Used by ipexpendulum.
    regioncoordinates.mat - Used by ipexfabric.
    trees.mat             - Scanned painting.
    westconcordpoints.mat - Used by aerial photo registration example.
    mristack.mat          - Used by help example in IMPLAY.
    cellsequence.mat      - Used by help example in IMPLAY.
 
  Sample FITS images.
    solarspectra.fts
 
  Sample HDR images.
    office.hdr
 
  Sample JPEG images.
    football.jpg
    greens.jpg
 
  Sample PNG images.
    bag.png
    blobs.png
    circles.png
    coins.png
    concordorthophoto.png
    concordaerial.png
    fabric.png
    gantrycrane.png
    glass.png
    hestain.png
    liftingbody.png
    onion.png
    pears.png
    peppers.png
    pillsetc.png
    rice.png
    saturn.png
    snowflakes.png
    tape.png
    testpat1.png
    text.png
    tissue.png
    westconcordorthophoto.png
    westconcordaerial.png
 
  Sample TIFF images.
    AT3_1m4_01.tif
    AT3_1m4_02.tif
    AT3_1m4_03.tif
    AT3_1m4_04.tif
    AT3_1m4_05.tif
    AT3_1m4_06.tif
    AT3_1m4_07.tif
    AT3_1m4_08.tif
    AT3_1m4_09.tif
    AT3_1m4_10.tif
    autumn.tif  
    board.tif
    cameraman.tif
    canoe.tif   
    cell.tif
    circbw.tif
    circuit.tif
    eight.tif   
    forest.tif
    kids.tif
    logo.tif
    mandi.tif
    m83.tif
    moon.tif
    mri.tif
    paper1.tif
    pout.tif
    shadow.tif
    spine.tif
    tire.tif
    trees.tif
 
  Sample Landsat images.
    littlecoriver.lan
    mississippi.lan
    montana.lan
    paris.lan
    rio.lan
    tokyo.lan
 
  Sample AVI files.
    rhinos.avi
    traffic.avi
 
  Sample Analyze 7.5 images.
    brainMRI.img
 
  Photo credits
    board:
 
      Computer circuit board, courtesy of Alexander V. Panasyuk,
      Ph.D., Harvard-Smithsonian Center for Astrophysics.
 
    cameraman:
 
      Copyright Massachusetts Institute of Technology.  Used with
      permission.
 
    cell:
    AT3_1m4_01:
    AT3_1m4_02:
    AT3_1m4_03:
    AT3_1m4_04:
    AT3_1m4_05:
    AT3_1m4_06:
    AT3_1m4_07:
    AT3_1m4_08:
    AT3_1m4_09:
    AT3_1m4_10:
 
      Cancer cells from rat prostates, courtesy of Alan W. Partin, M.D,
      Ph.D., Johns Hopkins University School of Medicine.
 
    circuit:
 
      Micrograph of 16-bit A/D converter circuit, courtesy of Steve
      Decker and Shujaat Nadeem, MIT, 1993. 
 
    concordaerial and westconcordaerial:
 
      Visible color aerial photographs courtesy of mPower3/Emerge.
 
    concordorthophoto and westconcordorthophoto:
 
      Orthoregistered photographs courtesy of Massachusetts Executive Office
      of Environmental Affairs, MassGIS.
 
    forest:
 
      Photograph of Carmanah Ancient Forest, British Columbia, Canada,
      courtesy of Susan Cohen. 
 
    gantrycrane:
 
      Gantry crane used to build a bridge, courtesy of Jeff Mather.
    
    hestain:
 
      Image of tissue stained with hemotoxylin and eosin (H&E) at 40X
      magnification, courtesy of Alan W. Partin, M.D., Ph.D., Johns Hopkins
      University School of Medicine.
 
    liftingbody:
 
      Public domain image of M2-F1 lifting body in tow, courtesy of NASA,
      1964-01-01, Dryden Flight Research Center #E-10962, GRIN database
      #GPN-2000-000097.
 
    mandi:
 
      Bayer pattern-encoded image taken by a camera with a sensor
      alignment of 'bggr', courtesy of Jeremy Barry.
 
    m83:
 
      M83 spiral galaxy astronomical image courtesy of Anglo-Australian
      Observatory, photography by David Malin. 
 
    moon:
 
      Copyright Michael Myers.  Used with permission.
 
    pears:
 
      Copyright Corel.  Used with permission.
 
    tissue:
 
      Cytokeratin CAM 5.2 stain of human prostate tissue, courtesy of 
      Alan W. Partin, M.D, Ph.D., Johns Hopkins University School
      of Medicine.
 
    trees:
 
      Trees with a View, watercolor and ink on paper, copyright Susan
      Cohen.  Used with permission. 
 
    LAN files:
 
      Permission to use Landsat TM data sets provided by Space Imaging,
      LLC, Denver, Colorado.
 
    saturn:
 
      Public domain image courtesy of NASA, Voyager 2 image, 1981-08-24, 
      NASA catalog #PIA01364
 
    solarspectra:
 
      Solar spectra image courtesy of Ann Walker, Boston University.
 
  See also COLORSPACES, IMAGES, IMAGESLIB, IMUITOOLS, IPTFORMATS, IPTUTILS.

The output also shows you the image credit information for the images that are not copyrighted by MathWorks.

For more information

For more information, see Section 2.2 of Digital Image Processing Using MATLAB.

See also the reference pages for imread and imfinfo, as well as the section Reading and Writing Image Data in the Image Processing Toolbox User's Guide.


Get the MATLAB code

Published with MATLAB® 7.13

September 23rd, 2011

Dealing with “Really Big” Images: Image Adapters

I'd like to welcome back guest blogger Brendan Hannigan, for the third in a series of three posts on working with very large images in MATLAB.

In the previous two blog posts, I've been discussing how to avoid Out of Memory errors while working with large images using MATLAB and the Image Processing Toolbox. I first showed how to view and explore arbitrarily large images by creating a reduced resolution data set (R-Set) from an image file. Next, I demonstrated how you can process large images files using a file-to-file workflow, never loading the entire image into memory at once.

Contents

"Right, but my data is not in TIFF, NITF, or JPEG2000, remember?"

Yes, there's the problem. rsetwrite & blockproc support a few file formats "natively", but not everyone works with data in those formats.

There are some issues we face when creating functions like rsetwrite & blockproc which allow incremental processing of files from disc. Here are two:

  1. Not all file formats are amenable to incremental "region-based" I/O.
  2. There are a lot of file formats. Seriously.

That said, we wanted to provide this large data workflow to as many of out customers as possible. So, as of release R2010a, in addition to our "built-in" file formats, both rsetwrite and blockproc also support "image adapter" objects!

The ImageAdapter is an object-oriented MATLAB class. It is actually an "abstract" class, meaning that by itself it is not very useful. What it does do it define an interface for reading and writing image data. All you have to do is to tell us how to read and/or write sub-regions of your particular file format, and then we can do the rest!

"Whoa there buddy!.. Object - "Oriented" ?! That's too complicated for me!"

No, it's not. Don't sweat it, it's really not. I won't go into a full tutorial on how to write MATLAB classes in this blog as there are excellent videos and tutorials available on our website and in our product documentation that cover that. Instead, I will walk through a quick example that recently came up in this very blog.

In 2009 Steve published a blog post titled "MATLAB R2009a - imread and multipage TIFFs". Over the last 2 years many folks have commented on this post in an ongoing discussion about multi-page TIFF files. One customer, we'll call him "Doug", had a problem using blockproc to process arbitrary pages of a multi-page TIFF file.

"Doug" was frustrated because he found that there was no way to tell blockproc that he wanted to process the Nth "page" in his TIFF file. blockproc is hard-wired to process the first page of a TIFF file when passed a multi-page TIFF image. We didn't provide a syntactic option to select which page to process, because we wanted to avoid format-specific syntaxes/parameters in blockproc. Otherwise you can imagine the function interface could get pretty complex pretty fast.

"Ok so, how did Doug solve his problem?"

This was a perfect use case for the ImageAdapter class. Image adapter objects are useful when you want to have more control over the I/O in blockproc and rsetwrite. You may want to just control some specific aspect of how your file is read/written (like Doug) or you might want to read/write a completely new file format.

I wrote Doug a quick image adapter class to solve his problem which I will share with you, but first let's look at how it is used in a quick example. The image adapter class is called PagedTiffAdapter. We'll be using it to work with a multi-paged TIFF image, mri.tif (download link).

% Get some image information, we'll need this later.
filename = 'mri.tif';
page = 5;
info = imfinfo(filename);
cmap = info(page).Colormap;

% Create our PagedTiffAdapter object!
my_adapter = PagedTiffAdapter(filename,page);

% Let's not "do" anything to the data, let's just read it and return it
no_op_fun = @(bs) bs.data;

% Call blockproc using our image adapter object as the input source
single_page = blockproc(my_adapter,[100 100],no_op_fun);

% Display our single page from this TIFF file
imshow(single_page,cmap)

Voila. That's pretty simple right? We've now used blockproc to read in the 5th page of our multi-page TIFF file, mri.tif. Granted, this is not a particularly compelling use of block processing, but I'm just trying to show how you can use image adapter objects in place of "conventional" input images.

Let's have a look at the class now.

classdef PagedTiffAdapter < ImageAdapter
    properties
        Filename
        Info
        Page
    end
    methods
        function obj = PagedTiffAdapter(filename, page)
            obj.Filename = filename;
            obj.Info = imfinfo(filename);
            obj.Page = page;
            obj.ImageSize = [obj.Info(page).Height obj.Info(page).Width];
        end
        function result = readRegion(obj, start, count)
            result = imread(obj.Filename,'Index',obj.Page,...
                'Info',obj.Info,'PixelRegion', ...
                {[start(1), start(1) + count(1) - 1], ...
                [start(2), start(2) + count(2) - 1]});
        end
        function result = close(obj) %#ok
        end
    end
end

"Ok, that's some pretty dense code you have there."

The class is quite straightforward. It begins with a classdef line, which defines the name of the class and also indicates that the class inherits from our base-class, ImageAdapter, using the < symbol.

Next we see 2 sub-sections of our class definition, a properties block which holds important data that we will need over the lifespan of each object...

    properties
        Filename
        Info
        Page
    end

...and a methods block which defines the behavior of the objects.

    methods
        function obj = PagedTiffAdapter(filename, page)
            obj.Filename = filename;
            obj.Info = imfinfo(filename);
            obj.Page = page;
            obj.ImageSize = [obj.Info(page).Height obj.Info(page).Width];
        end
        function result = readRegion(obj, start, count)
            result = imread(obj.Filename,'Index',obj.Page,...
                'Info',obj.Info,'PixelRegion', ...
                {[start(1), start(1) + count(1) - 1], ...
                [start(2), start(2) + count(2) - 1]});
        end
        function result = close(obj) %#ok
        end
    end

"How do you know what properties and methods you need?"

Ahh, good question. Classes which inherit from the ImageAdapter base class are REQUIRED to have:

  1. A class constructor for initialization (all MATLAB classes require this)
  2. a readRegion method (required by the ImageAdapter base class)
  3. a close method (required by the ImageAdapter base class)
  4. a ImageSize property (required by the ImageAdapter base class)

"Hey wait, you don't define ImageSize in your properties block!!"

That is true. The ImageSize property is defined in the base class, so you don't have to redefine it here, you just have to set it. That's why at the end of my class constructor, I make sure to set the ImageSize property to be the size of the page that I am interested in.

"Well that's not super intuitive, but ok. What goes inside the methods?"

This class is going to be used to read data from a TIFF file, so in the class constructor all we do is gather information about the file that we'll need later and store that information in the appropriate properties.

Let's look the other methods individually. First the close method.

        function result = close(obj) %#ok
        end

The close method, in this case does nothing. The reason it does nothing is that we are doing our actual file I/O using the imread function, which does not require us to open the file handle directly. If we were writing an image adapter to read say, some arbitrarily formatted binary image, then we would likely be opening our file handle in the constructor, storing it in a property, and then in the close method we would close the file handle and do any other necessary clean up tasks. This example class is just very simple, so we have no "cleaning up" to do when we are done, but if we did, we would put that code in our close method. Regardless of what clean up code you need, you must have a close method. The close method is called by blockproc and rsetwrite only once, after all file I/O has completed.

Now let's look closely at the readRegion method.

        function result = readRegion(obj, start, count)
            result = imread(obj.Filename,'Index',obj.Page,...
                'Info',obj.Info,'PixelRegion', ...
                {[start(1), start(1) + count(1) - 1], ...
                [start(2), start(2) + count(2) - 1]});
        end

This is the real work horse of the class. You will probably never need to call this method (or any method other than the constructor) yourself. Instead the image adapter clients will call these functions. blockproc is going to call this readRegion method when it wants to read a block of the input image. It's your job to figure out which block it needs, read that data, and send it back to it.

"How am I supposed to know which block it needs?"

It'll tell you! It's all contained in the 2 input arguments to the readRegion method, start and count. The start argument is a 2-element vector specifying the [row col] of the first pixel we need. The count argument is a 2-element vector specifying the size of the requested region in [rows cols].

For example, let's say start was [5 9] and count was [2 3]. Your method should return the data in rows 5-6 and columns 9-11, just as if you had indexed a variable like this:

return_to_blockproc = myVariable(5:6,9:11);

So your implementation of readRegion needs to take these 2 input arguments and then return the appropriate image data that they specify.

What will that mean for you? Well it depends on what your image adapter is designed for. In this case we are simply reading a specific page of a TIFF file, so I'm using start and count to construct a 'PixelRegion' argument to the imread function that will fetch only the pixels I am interested in. In your case, your readRegion might be pulling data from some binary formatted file, using a 3rd party mex-file to read a piece of data, or just whatever! That's the beauty of the image adapters, you can get your data from whereever you want.

"Ok I think I got it, but what about writing to new files?"

You can use your class to write to files as well, by defining a writeRegion method. The writeRegion method is (not surprisingly) almost the exact opposite of the readRegion method. Instead of you returning data from a specified block, you will receive data as an input argument and then write that data to our image at a specified block location.

After you write a writeRegion method, you can then use objects of your class as 'Destination' parameters to blockproc, allowing you to both read and write to arbitrarily large files of arbitrary format!

Specifying a writeRegion method is completely optional, but you must specify it if you want to use your image adapter object as a 'Destination' in blockproc. Otherwise, your objects will be "read-only".

In the spirit of brevity, I'm not going to do that for our PagedTiffAdapter. I've found that writeRegion methods are often more complex than their readRegion counterparts. I'll leave that as an exercise for the reader.

"Cool, everything turned out better than expected!"

Thanks! Well that about wraps it up. If you want to learn more about writing and using image adapter objects there is a chapter in the Image Processing Toolbox Users' Guide called "Working with Data in Unsupported Formats". There we walk thought writing an adapter for a more complex binary formatted file. You can also see the documentation for blockproc, rsetwrite, and ImageAdapter.

Thanks again Steve for letting me talk about this stuff! Have a great day!


Get the MATLAB code

Published with MATLAB® 7.13

September 20th, 2011

Engineering education and MATLAB – a personal perspective

Earlier this week I received a text message from my oldest son, who just started his second year at Northeastern University. He wrote to tell me that he’ll be using MATLAB in one his classes. I don’t know if he was excited, but I certainly was! This put me in a reflective mood, thinking about my own career path, engineering education, and how MATLAB has been woven throughout. So I’d like to present a rough timeline of thirty years of engineering education and MATLAB as seen through my own personal perspective.

1982

I arrived at Georgia Tech as an electrical engineering (EE) student. That year I took a required engineering programming course using Fortran, as well as a data structures and algorithms course using Pascal. Cleve’s Fortran-based MATLAB was being distributed informally to his network of academic colleagues on mag tape reels, but learning about MATLAB was still a few years away for me.

1984

Jack started The MathWorks based on the twin ideas that MATLAB would be very useful in controls engineering and that the new IBM PC would eventually catch on among engineers.

I only remember one undergraduate EE course in which we used a software package. It was some flavor of Spice, the circuit simulator.

By that time I had decided to go straight into graduate school as soon as I finished my Bachelor’s degree. I wanted to become a professor.

1986

In a two-week period over the summer, I went from collecting my diploma to starting classes in the Ph.D. program at Georgia Tech. I connected with my thesis advisor and was soon working on active research projects in image processing. I flirted briefly with writing image processing programs in Pascal. Then I switched to C after reading a book about it (while I was supposed to be paying attention in my Calculus of Variations class).

1987

Jim McClellan joined the Georgia Tech faculty in the Digital Signal Processing research group. He was an early MATLAB convert. He got a few grad students together to make a group purchase. I still have my disks:

MATLAB disks from 1987

MATLAB quickly became popular in the DSP group. It did not yet have image display capabilities, though, and I continued to write my image processing code in C.

1988

I wrote a package of “Digital Signal Processing Utilities” for my thesis advisor to use in his undergraduate DSP class. (The fact that I spent a term working enthusiastically on this project instead of my thesis research was an early but unrecognized clue about my eventual career path.) This software was revised by other students later and then incorporated into the book Introduction to Digital Signal Processing: A Computer Laboratory Textbook by Smith and Mersereau. Computer-based textbooks were becoming more common at this time because of the increasing availability of PC labs on campuses, but textbook authors still shied away from relying on commercial software in their textbooks. Home-brew software was common. That started to change rapidly with ...

1989

... the publication of The Student Edition of MATLAB. This book came with a limited-capability version of MATLAB intended for student use. The availability of this inexpensive MATLAB version dramatically accelerated the popularity of MATLAB for use in engineering education. Today there are more than 1400 MATLAB and Simulink based books listed on the MathWorks web site.

1990

I finished my Ph.D. and joined the faculty of the University of Illinois at Chicago. For office equipment I requested a Sun Sparcstation and a copy of MATLAB. MATLAB 4 was released at around this time, although it took a while for MATLAB 4 to be ported to all of the various platforms. MATLAB 4 sported a new graphics system that, among other things, finally had image display. I began to use MATLAB more often for my own research work, and I used the Student Edition in a course.

Around this time I received a letter from MathWorks announcing the imminent release of a new product called “Simulab.” I really wish I had kept this letter. It would have been a collector’s item, because Simulab had to be renamed to Simulink before it was released.

1993

Three years later, academia wasn’t working for me as well as I’d hoped. I tried desperately to think of something I might be good at for a commercial company. MATLAB came to mind.

I eagerly signed up to be a beta tester for the first version of the Image Processing Toolbox. Just after the beta program ended, I interviewed for a job at MathWorks. When the job offer came, I carefully and thoughtfully considered it (for about a minute) and said yes. I left academia and the Chicago cold for software development and the Boston cold. MathWorks at that time had about 190 employees. (Today the number is more than 2200.)

1993–today

Throughout my MathWorks career I’ve continued to be engaged with engineering education. For several years I was on our internal “advisory group” for education. In the mid-1990s I attended the first IEEE International Conference on Image Processing (ICIP). For a special session on education I coauthored (with Mike Orchard) a paper on using MATLAB and C for teaching image processing. I was a cheerleader for the idea of removing limitations from the Student Version of MATLAB so that it could handle image processing problems well. (Today's Student Version includes the complete Image Processing Toolbox!)

Sometime around 2001 or so, Rafael Gonzalez (lead author of Digital Image Processing) visited MathWorks and I had a chance to meet him. He immediately bent my ear about the idea of writing a MATLAB based textbook on image processing. In 2004, the first edition of Digital Image Processing Using MATLAB was finally published. A few years later, when I was visiting universities with my son who was about to graduate high school, I would sometimes sneak away during the campus tour to see if the university library had the book.

In 2006 the ICIP conference was held in Atlanta, my old stomping ground. I was eager to attend, and I arranged to give a special session on software development tools and techniques that I learned at MathWorks and that I thought would be useful to engineering researchers. I was able to reconnect with many of my academic colleagues and had a really great time. Over the next couple of years I traveled around to 7-8 universities to give that talk, mostly to graduate departments.

At one of those university visits, I met up with my thesis advisor. He told me that he felt engineering students knew less about programming today than they used to. Engineering programming was getting squeezed out of the curriculum. Increasingly, students these days receive only a very brief introduction to programming, often using MATLAB. This is consistent with the trends that we have seen at MathWorks.

During the last couple of years I’ve been much more involved in the design of MATLAB, and as a result I’ve been doing somewhat less image processing work. We are trying to understand our users and their needs better at all levels, including our student users who come to MATLAB with relatively little or no programming experience. We are working hard to make the MATLAB experience as good as we can for all of our users.

It’s been a true joy of my MathWorks career that the tools I help to design and build have had (and continue to have) such a beneficial impact on engineering and science education worldwide. To all the students out there just now settling down to your fall classes, I offer my best wishes for your success. Have fun and keep up with the school work!

September 2nd, 2011

“Area opening” terminology question

An experienced user of our products recently told us we got it wrong when we named bwareaopen. This function removes foreground objects from a binary image that are smaller in area than a given threshold. The user said (paraphrased) that "bwareaopen isn't doing an opening at all, it's just doing connected-component labeling. And I would look for some variation of 'size filter' in the name or description, and that doesn't appear at all."

This is a good example of a terminology question that arises in toolbox design fairly often. When do we use a term that is familiar to specialists in the area but might not be familiar to other users? In this case, the term area opening is familiar to those who know a lot about mathematical morphology as applied to image processing.

In morphology, this operation is indeed called an area opening, and that's why I named the function this way about ten years ago. The morphology folks have a fairly general notion of openings. Area opening is a type of opening they call an "algebraic opening." Such openings are characterized by the following properties:

  • Increasing. Operator \( T\{\cdot\} \) is increasing if, for all images \( f \) and \( g \), \( f <= g \) implies \( T\{f\} <= T\{g\} \).
  • Anti-extensive. Operator \( T\{\cdot\} \) is anti-extensive if, for all images \( f \), \( T\{f\} <= f \).
  • Idempotent. Operator \( T\{\cdot\} \) is idempotent if, for all images \( f \), \( T\{T\{f\}\} = T\{f\} \).

Morphological opening (erosion followed by dilation), which is the most familiar kind of opening, satisfies these properties, and so does the area opening.

That said, I appreciate the merit in considering a different term that is less "jargony." We might kick around the idea of renaming this function.

Readers, do you have any suggestions? I have an idea that I kind of like, but I'd like to hear what you have to say first.


Get the MATLAB code

Published with MATLAB® 7.12

August 26th, 2011

Digital image processing using MATLAB: digital image representation

Today I'm starting an regular, occasional series with tutorial material on digital image processing using MATLAB. I'm going to look at topics in roughly the order used in the book Digital Image Processing Using MATLAB, Gatesmark Publishing, 2009, by Gonzalez, Woods, and Eddins. (Yes, that's me. It was a lot of nights and weekends.)

Contents

Digital image representation

Let's start with digital image representation. (I do not plan to cover image formation. For that topic, see Chapter 2 of Digital Image Processing, Prentice Hall, 2008, by Gonzalez and Woods.) The common mathematical representation of an image is a function of two continuous spatial coordinates: \( f(x,y) \). The value \( f(x_0,y_0) \) is often called the image intensity at \( (x_0,y_0) \), although that term is used loosely because it often does not represent actual light intensity. The term gray level is also commonly used.

You can represent a color image mathematically in a couple of different ways. One way is to use a collection of image functions, one per color component, such as \( r(x,y) \), \( g(x,y) \), and \( b(x,y) \). Another way is to use a vector-valued function, \( {\mathbf f}(x,y) \).

Many functions in the Image Processing Toolbox also support higher-dimensional images. For example, \( f \) might be a function of three spatial variable, as in \( f(x,y,z) \). Image Processing Toolbox documentation calls this a multidimensional image.

Converting the amplitude values of \( f \) to a set of discrete values is called quantization. Capturing those values at discrete spatial coordinates is called sampling. When \( x \), \( y \), and the values of \( f \) are all finite, discrete quantities, we call f a digital image.

Coordinate conventions

It's important to pay attention to different coordinate system conventions that may be used in image processing. Often the center of the upper-left pixel is considered to be the origin, (0,0). There is more variation in the assignment of \( x \) and \( y \) axes. The coordinate system in the diagram below, with the \( x \)-axis pointing down and the \( y \)-axis pointing to the right, is used in Digital Image Processing Using MATLAB in order to be consistent with the Gonzalez and Woods book, Digital Image Processing.

From Figure 2.1, Digital Image Processing Using MATLAB, 2nd ed. Used with permission.

When displaying images in MATLAB, the usual convention is for the center of the upper-left pixel to be at (1,1), the \( x \)-axis to point to the right, and the \( y \)-axis to point down.

Images as matrices and arrays

Digital images are very conveniently represented as matrices, which happens to be great for working with in MATLAB. A monochrome image matrix looks like this:

   f(1,1)  f(1,2)  ...  f(1,N)
   f(2,1)  f(2,2)  ...  f(2,N)
     .       .            .
     .       .            .
     .       .            .
   f(M,1)  f(M,2)  ...  f(M,N)

(This happy marriage of a convenient digital image representation and the MATLAB strength at working with matrices is, more or less, the reason I ended up working at MathWorks.)

We represent color images in MATLAB as multidimensional arrays. For example, an RGB image (with three color components) is represented as an M-by-N-by-3 array. A CMYK image (with four color components) would use an M-by-N-by-4 array.

Multidimensional images are also represented in MATLAB as multidimensional arrays. Therefore we rely on context to distinguish between an RGB image (M-by-N-by-3) and a three-dimensional image with three z-plane slices (also M-b-N-by-3). For example, if you pass an M-by-N-by-3 array to rgb2gray, it is clear from the context that the input should be interpreted as an RGB color image and not as a three-dimensional image.

For more information

For more information, see Section 2.1 of Digital Image Processing Using MATLAB. See also the section Image Coordinate Systems in the Image Processing Toolbox User's Guide.


Get the MATLAB code

Published with MATLAB® 7.12


MathWorks
Steve Eddins is a software development manager in the MATLAB and image processing areas at MathWorks. Steve coauthored Digital Image Processing Using MATLAB. He writes here about image processing concepts, algorithm implementations, and MATLAB.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.