Earlier this month I started exploring the question of computing the shortest path from one object in a binary image to another. I described
the following basic algorithm:
Compute the distance transform for just the upper left block of foreground pixels.
Compute the distance transform for just the lower right block of foreground pixels.
Add the two distance transforms together.
The pixels in the regional minimum of the result lie along one or more of the shortest paths from one block to the other.
This algorithm only works for path-based approximations to the Euclidean distance transform; it doesn't work for the Euclidean
distance transform itself.
The Image Processing Toolbox function supports three path-based approximations to the Euclidean distance transform: 'cityblock',
'chessboard', and 'quasi-euclidean'.
Today I want to compare these three and discuss ambiguities associated with the idea of "shortest path" on an image.
Step 1 of the algorithm is to compute the distance transform for the first object. I'll use the 'cityblock' distance transform
approximation. In this approximation, only horizontal and vertical path segments are allowed. Diagonal segments are not allowed.
The distance between a pixel and any of its N, E, S, or W neighbors is 1.
The smallest value of D, 12, is the minimum path length (for paths consisting only of horizontal and vertical segments) from one object to the other.
Step 4 is to find the set of pixels in the regional minimum of D. This set represents the shortest path.
paths = imregionalmin(D);
To help visualize the path I'll use imoverlay from the MATLAB Central File Exchange. The code below will overlay the pixels on the shortest path in gray.
P = false(size(bw));
P = imoverlay(P, paths, [.5 .5 .5]);
P = imoverlay(P, bw, [1 1 1]);
imshow(P, 'InitialMagnification', 'fit')
Uh oh, why do we have a big rectangular block of gray instead of a single path? The answer is that there isn't a unique shortest
path. There are many paths you could travel to get from one object to the other that all have the same path length, 12. Here
are just a view of them:
subplot(2,2,1)
imshow(P, 'InitialMagnification', 'fit')
x = [2 6 6];
y = [2 2 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off
subplot(2,2,2)
imshow(P, 'InitialMagnification', 'fit')
x = [2 2 6];
y = [2 10 10];
hold on
plot(x, y, 'g', 'LineWidth', 2);
hold off
subplot(2,2,3)
imshow(P, 'InitialMagnification', 'fit')
x = [2 3 3 4 4 5 5 6 6 6 6 6 6];
y = [2 2 3 3 4 4 5 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off
subplot(2,2,4)
imshow(P, 'InitialMagnification', 'fit')
x = [2 2 2 2 2 2 3 3 4 4 5 5 6];
y = [2 3 4 5 6 7 7 8 8 9 9 10 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off
Next, let's look at the 'chessboard' distance. In this distance transform approximation, a path can consist of horizontal,
vertical, and diagonal segments, and the path length between a pixel and any of its neighbors (N, NE, E, SE, S, SW, W, and
NW) is 1.0.
Now the set of gray pixels is not rectangular, but it still encompasses multiple possible shortest paths. And, surprisingly,
it seems to include some pixels on which the path actually moves temporarily away from its destination.
Here are several possible paths, all of length 8 (according to the 'chessboard' distance transform).
subplot(2,2,1)
imshow(P, 'InitialMagnification', 'fit')
x = [2 3 4 5 6 6 6 6 6];
y = [2 3 4 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off
subplot(2,2,2)
imshow(P, 'InitialMagnification', 'fit')
x = [2 2 2 2 2 3 4 5 6];
y = [2 3 4 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2);
hold off
subplot(2,2,3)
imshow(P, 'InitialMagnification', 'fit')
x = [2 1 2 1 2 3 4 5 6];
y = [2 3 4 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off
subplot(2,2,4)
imshow(P, 'InitialMagnification', 'fit')
x = [2 3 4 3 2 3 4 5 6];
y = [2 3 4 5 6 7 8 9 10];
hold on
plot(x, y, 'g', 'LineWidth', 2)
hold off
Finally, let's look at the 'quasi-euclidean' distance transform. In this variation, paths from one pixel to another can consist
of horizontal, vertical, and diagonal segments. The distance from a pixel to one of its horizontal or vertical neighbors is
1, while the distance to one of its diagonal neighbors is sqrt(2).
Here's the computation again, this time specifying 'quasi-euclidean':
I am trying to find the length and location of the shortest path between objects. It seems like some application of bwdist
should be able to do this. Any advice? Example:
1 0 1
0 1 0
0 0 0
0 1 0
1 0 1
Should yield
0 0 0
0 0 0
0 p 0
0 0 0
0 0 0
With p = path length = 1
I agree, some application of bwdist should do it. Actually, there are several other useful functions that can be applied to this task, including imregionalmin and bwdistgeodesic, a new function that just shipped in R2011b.
I've been tinkering with this problem off and on for a few days, and I think it might take a few posts to explore some of
the interesting ideas and issues posed by this problem. Today I'll start with the basics of using bwdist and imregionalmin to identify a set of shortest paths from one object to another.
Let's use this simple test image to start our exploration.
We want to find the shortest path (or paths!) from the upper left block of foreground pixels to the lower right block. The
key algorithmic idea is this:
Compute the distance transform for just the upper left block of foreground pixels.
Compute the distance transform for just the lower right block of foreground pixels.
Add the two distance transforms together.
The pixels in the regional minimum of the result lie along one or more of the shortest paths from one block to the other.
(For reference, see P. Soille, Morphological Image Analysis: Principles and Applications, 2nd edition, Springer, 2003, section
7.3.4, "Application to minimal path detection," pp. 233-234.)
But we can't use the normal Euclidean distance transform because it doesn't correspond to a path traversal from one pixel
center to another. Rather, we have to use one of the distance transform approximations supported by bwdist: 'chessboard', 'cityblock', and 'quasi-euclidean'.
You can see that the smallest value of D_sum is about 7.8284, and there's a set of pixels in-between the two objects that all have that value. The function imregionalmin identifies them all:
In the quasi-euclidean approximation to the distance transform, the distance from a pixel to one of its horizontal or vertical
neighbors is 1, and the distance from a pixel to one of its diagonal neighbors is sqrt(2). The result above says that the
shortest such path between the two objects is 7.8284. Furthermore, it says that there is more than one shortest path.
I plan to explore this topic more fully this month. When using the quasi-euclidean distance transform, for example, there
are floating-point round-off issues that have to be dealt with when you try to apply the technique above to larger images.
Another issue is the effect of using different path-based distance transform approximations, such as 'cityblock' or 'chessboard'. Yet another issue is how to identify specific paths from the family of paths identified by imregionalmin. Along the way I'll experiment with different ways to visualize the results.
Finally, I'll explore how to use a new function, bwdistgeodesic, to find shortest paths when there are constraints on where the paths can go.
All the posts in this series
the basic idea of finding shortest paths by adding two distance transforms together (part 1)
Much of the information I posted in this blog years ago is still useful today. Image processing theory hasn't been completely
overturned since then, and I'm still talking about MATLAB after all. For the benefit of readers who have joined the party
more recently, I will occasionally recap the useful bits that I posted about five years ago.
Today I want to tell a little image processing algorithm story related to my post last week about the new bwconvhull function in the Image Processing Toolbox.
The developer (Brendan) who worked on this function came to see me sometime last year to find out how the 'ConvexImage' measurement offered by regionprops was computed so that he could use the same procedure for bwconvhull. I believed that I knew the answer off the top of my head, so without looking at the code I rattled off the following steps:
Compute the x- and y-coordinates for the four corners of all the foreground pixels in the binary image.
Use convhull to compute the convex hull of the (x,y) pairs from step 1.
Use poly2mask to convert the convex hull polygon to a binary image mask.
A few days later Brendan came back to tell me that, although my description was clear, the code that I wrote ten years ago
for regionprops actually does something else.
Oops.
I looked at the code and realized that Brendan was right, and I started to remember that I had actually made this same mistake
many years before.
In fact, I did originally implement 'ConvexImage' in regionprops using the procedure outlined above. Before it shipped, though, I discovered (and fixed) a big problem with it.
Let me show you the problem using a small example. Here's a tiny binary image.
Visualize step 1 by superimposing those corner locations on the input image.
imshow(bw, 'InitialMagnification', 'fit')
hold on
plot(x_corners, y_corners, 'og')
hold off
Step 2: Compute the convex hull of all the corner points. Visualize by superimposing the convex hull in red.
k = convhull(x_corners, y_corners);
x_hull = x_corners(k);
y_hull = y_corners(k);
hold on
plot(x_hull, y_hull, 'r', 'LineWidth', 4)
hold off
Step 3: Use poly2mask to convert the convex hull to a binary image mask.
mask = poly2mask(x_hull, y_hull, 6, 9);
imshow(bw, 'InitialMagnification', 'fit')
hold on
h = imshow(mask);
set(h, 'AlphaData', 0.8)
plot(x_hull, y_hull, 'r', 'LineWidth', 4)
hold off
You can see that there are extra pixels along the diagonal edges that got put into the mask. That's not good. If you repeat
the operation, those diagonals will keep filling out until you're left with a rectangle. That's not supposed to happen with
repeated application of the convex hull.
The reason this is happening is that the convex hull goes exactly through the center of pixels that are along the diagonal
but "outside" the original set of foreground pixels. Because of the tie-breaking rules applied by poly2mask, all (in this case) of those outside pixels got included in the output mask.
The solution that I settled on ten years ago, and which is now also used in bwconvhull, was to modify the first step of the procedure. Instead of collecting the set of corner points of each foreground pixel, the correct procedure collects points along the center of each edge of each foreground pixel.
Here's what that looks like.
dx = [0.0 0.0 0.5 -0.5];
dy = [0.5 -0.5 0.0 0.0];
x_edges = bsxfun(@plus, x, dx);
y_edges = bsxfun(@plus, y, dy);
x_edges = x_edges(:);
y_edges = y_edges(:);
k = convhull(x_edges, y_edges);
x_hull = x_edges(k);
y_hull = y_edges(k);
imshow(bw, 'InitialMagnification', 'fit')
hold on
plot(x_edges, y_edges, 'og')
plot(x_hull, y_hull, 'r', 'LineWidth', 4)
hold off
mask = poly2mask(x_hull, y_hull, 6, 9);
imshow(mask, 'InitialMagnification', 'fit')
hold on
plot(x_hull, y_hull, 'r', 'LineWidth', 4)
plot(x_edges, y_edges, 'og')
hold off
Much better!
I have been fooled more often than I would like to admit by sometimes nonintuitive digital image geometry. For you image processing
algorithm people out there, I hope seeing these pictures and techniques will help you avoid similar pitfalls someday.
I've been intending to mention a new function bwconvhull that was introduced in the Image Processing Toolbox last spring in the R2011a release. Now that R2011b is out, I figure I
better go ahead and do it!
bwconvhull computes the "convex hull of a binary image." Now I have to admit that this terminology is a little loose, so I'd better
clarify. The convex hull of a set of 2-D points is the smallest convex polygon that contains the entire set. Here's an example
from the MATLAB documentation for convhull:
xx = -1:.05:1; yy = abs(sqrt(xx));
[x,y] = pol2cart(xx,yy);
k = convhull(x,y);
plot(x(k),y(k),'r-',x,y,'b+')
The polygon in red is the convex hull of the set of points shown in blue. So convhull takes a set of points and returns a polygon, whereas bwconvhull takes a binary image and returns another binary image. What's up with that? It means simply that bwconvhull computes the convex hull of all the foreground pixels in the input image, and then it produces an output binary image with
all the pixels inside the convex hull set to white. It's a little easier to show than to say, so here's what it looks like:
bw = imread('text.png');
imshow(bw)
bw2 = bwconvhull(bw);
imshow(bw2);
Let's overlay the foreground pixel locations from the input image (using blue dots) and the convex hull computed by convhull (using a thick red line).
[y, x] = find(bw);
k = convhull(x, y);
hold on
plot(x, y, 'b.')
plot(x(k), y(k), 'r', 'LineWidth', 4)
hold off
An important variation supported by bwconvhull is to compute the convex hulls of the individual objects in the input image. Here's how you do that:
bw3 = bwconvhull(bw, 'objects');
imshow(bw3)
Something called the convex deficiency is sometimes used in shape recognition applications. Loosely speaking, the convex deficiency of a shape is the convex hull
of the shape minus the shape. Here's an example using the letter "T" from the text image above.
The MATLAB function imread reads image data from a variety of formats, including:
Windows Bitmap (BMP)
Windows Cursor resources (CUR)
Flexible Image Transport System (FITS)
Graphics Interchange Format (GIF)
Hierarchical Data Format (HDF)
Windows Icon resources (ICO)
JPEG 2000
Joint Photographic Experts Group (JPEG)
Portable Bitmap (PBM)
Windows Paintbrush (PCX)
Portable Graymap (PGM)
Portable Network Graphics (PNG)
Portable Any Map (PNM)
Portable Pixmap (PPM)
Sun Raster (RAS)
Tagged Image File Format (TIFF)
X Window Dump (XWD)
For example, the following line reads the pixels from a PNG file into the MATLAB variable I:
I = imread('rice.png');
After you run the code above, the Workspace Browser shows you that your variable I is a 256x256 matrix of uint8 (unsigned eight-bit integer) values in the range [40,204].
Although imread uses the file extension (such as .png) to help determine the file format, it can also determine the format automatically.
Suppose, for example, you have a set of files in which the file extension is used to indicate a particular data series, which
I'll simulate here by copying rice.png to another filename:
s = which('rice.png')
s =
C:\MATLABs\R2011b\toolbox\images\imdemos\rice.png
copyfile(s, 'rice.series-x03a')
dir('rice.*')
rice.series-x03a
I = imread('rice.series-x03a');
imshow(I)
Reading from multi-image TIFF files
TIFF files can store multiple independent images. The function imread has a syntax for specifying which image you want. The file mri.tif (download link) contains 27 images. Here is how you would read the 1st, 6th, and 11th images in the file.
For very large image files it can be useful to read in only a subset of the image pixels. The imread function can do this for both TIFF and JPEG 2000 files. Here's how to read rows 50-100 and columns 150-200 of shadow.tif:
I = imread('shadow.tif', 'PixelRegion', {[50 100], [150 200]});
imshow(I)
You can use the MATLAB function imfinfo to read metadata about an image file without reading in all the pixel data. Here's an example:
The display shows that peppers.png is a truecolor image with 24 bits per pixel. The size of the file is 287677 bytes, and
it was last modified in the morning of December 16, 2002, during a period of heavy snowfall. By capturing the output of imfinfo in a variable, you can write code based on this information, such as:
info = imfinfo('peppers.png');
num_rows = info.Height;
num_cols = info.Width;
Sample image files
The files rice.png, shadow.tif, and peppers.png read by the code above are sample images that ships with Image Processing
Toolbox. I use sample images from the toolbox a lot in this blog because I want readers to be able to run the code examples
in my posts. You can see a list of the sample image files in the toolbox with this command:
help imdemos
Image Processing Toolbox --- demos and sample images
iptdemos - Index of Image Processing Toolbox demos.
ipexaerial - Registering an Aerial Photo to an Orthophoto.
ipexangle - Measuring Angle of Intersection.
ipexbatch - Batch Processing ImageFiles in Parallel.
ipexblind - Deblurring Images Using a Blind Deconvolution Filter.
ipexblockprocedge - Block Processing Large Images.
ipexblockprocstats - Computing Statistics for Large Images.
ipexcell - Detecting a Cell Using Image Segmentation.
ipexcheckerboard - Creating a Gallery of Transformed Images.
ipexconformal - Exploring a Conformal Mapping.
ipexcontrast - Contrast Enhancement Techniques.
ipexfabric - Color-based Segmentation Using the L*a*b* Color Space.
ipexhistology - Color-based Segmentation Using K-Means Clustering.
ipexlanstretch - Enhancing Multispectral Color Composite Images.
ipexlucy - Deblurring Images Using a Lucy-Richardson Filter.
ipexndvi - Finding Vegetation in a Multispectral Image.
ipexnormxcorr2 - Registering an Image Using Normalized Cross-Correlation
ipexpendulum - Finding the Length of a Pendulum in Motion.
ipexprops - Measuring Regions in Grayscale Images.
ipexradius - Measuring the Radius of a Roll of Tape.
ipexreconstruct - Reconstructing an Image from Projection Data.
ipexregularized - Deblurring Images Using a Regularized Filter.
ipexrice - Correcting Nonuniform Illumination.
ipexrotate - Finding the Rotation and Scale of a Distorted Image.
ipexroundness - Identifying Round Objects.
ipexshear - Padding and Shearing an Image Simultaneously.
ipexsnow - Granulometry of Snowflakes.
ipextexturefilter - Texture Segmentation Using Texture Filters.
ipextraffic - Detecting Cars in a Video of Traffic.
ipexwatershed - Marker-controlled watershed segmentation.
ipexwiener - Deblurring Images Using the Wiener Filter.
Extended-example helper files.
HistogramAccumulator - Used by blockproc stats example.
batchDetectCells - Used by batch processing example.
batchProcessFiles - Used by batch processing example.
conformalForward1 - Used by conformal mapping example.
conformalForward2 - Used by conformal mapping example.
conformalInverse - Used by conformal mapping example.
conformalInverseClip - Used by conformal mapping example.
conformalSetupInputAxes - Used by conformal mapping example.
conformalSetupOutputAxes - Used by conformal mapping example.
conformalShowLines - Used by conformal mapping example.
conformalShowCircles - Used by conformal mapping example.
conformalShowInput - Used by conformal mapping example.
conformalShowOutput - Used by conformal mapping example.
propsSynthesizeImage - Used by measuring regions example.
LanAdapter - Used by blockproc stats example.
Sample MAT-files.
imdemos.mat - Images used in demos.
pendulum.mat - Used by ipexpendulum.
regioncoordinates.mat - Used by ipexfabric.
trees.mat - Scanned painting.
westconcordpoints.mat - Used by aerial photo registration example.
mristack.mat - Used by help example in IMPLAY.
cellsequence.mat - Used by help example in IMPLAY.
Sample FITS images.
solarspectra.fts
Sample HDR images.
office.hdr
Sample JPEG images.
football.jpg
greens.jpg
Sample PNG images.
bag.png
blobs.png
circles.png
coins.png
concordorthophoto.png
concordaerial.png
fabric.png
gantrycrane.png
glass.png
hestain.png
liftingbody.png
onion.png
pears.png
peppers.png
pillsetc.png
rice.png
saturn.png
snowflakes.png
tape.png
testpat1.png
text.png
tissue.png
westconcordorthophoto.png
westconcordaerial.png
Sample TIFF images.
AT3_1m4_01.tif
AT3_1m4_02.tif
AT3_1m4_03.tif
AT3_1m4_04.tif
AT3_1m4_05.tif
AT3_1m4_06.tif
AT3_1m4_07.tif
AT3_1m4_08.tif
AT3_1m4_09.tif
AT3_1m4_10.tif
autumn.tif
board.tif
cameraman.tif
canoe.tif
cell.tif
circbw.tif
circuit.tif
eight.tif
forest.tif
kids.tif
logo.tif
mandi.tif
m83.tif
moon.tif
mri.tif
paper1.tif
pout.tif
shadow.tif
spine.tif
tire.tif
trees.tif
Sample Landsat images.
littlecoriver.lan
mississippi.lan
montana.lan
paris.lan
rio.lan
tokyo.lan
Sample AVI files.
rhinos.avi
traffic.avi
Sample Analyze 7.5 images.
brainMRI.img
Photo credits
board:
Computer circuit board, courtesy of Alexander V. Panasyuk,
Ph.D., Harvard-Smithsonian Center for Astrophysics.
cameraman:
Copyright Massachusetts Institute of Technology. Used with
permission.
cell:
AT3_1m4_01:
AT3_1m4_02:
AT3_1m4_03:
AT3_1m4_04:
AT3_1m4_05:
AT3_1m4_06:
AT3_1m4_07:
AT3_1m4_08:
AT3_1m4_09:
AT3_1m4_10:
Cancer cells from rat prostates, courtesy of Alan W. Partin, M.D,
Ph.D., Johns Hopkins University School of Medicine.
circuit:
Micrograph of 16-bit A/D converter circuit, courtesy of Steve
Decker and Shujaat Nadeem, MIT, 1993.
concordaerial and westconcordaerial:
Visible color aerial photographs courtesy of mPower3/Emerge.
concordorthophoto and westconcordorthophoto:
Orthoregistered photographs courtesy of Massachusetts Executive Office
of Environmental Affairs, MassGIS.
forest:
Photograph of Carmanah Ancient Forest, British Columbia, Canada,
courtesy of Susan Cohen.
gantrycrane:
Gantry crane used to build a bridge, courtesy of Jeff Mather.
hestain:
Image of tissue stained with hemotoxylin and eosin (H&E) at 40X
magnification, courtesy of Alan W. Partin, M.D., Ph.D., Johns Hopkins
University School of Medicine.
liftingbody:
Public domain image of M2-F1 lifting body in tow, courtesy of NASA,
1964-01-01, Dryden Flight Research Center #E-10962, GRIN database
#GPN-2000-000097.
mandi:
Bayer pattern-encoded image taken by a camera with a sensor
alignment of 'bggr', courtesy of Jeremy Barry.
m83:
M83 spiral galaxy astronomical image courtesy of Anglo-Australian
Observatory, photography by David Malin.
moon:
Copyright Michael Myers. Used with permission.
pears:
Copyright Corel. Used with permission.
tissue:
Cytokeratin CAM 5.2 stain of human prostate tissue, courtesy of
Alan W. Partin, M.D, Ph.D., Johns Hopkins University School
of Medicine.
trees:
Trees with a View, watercolor and ink on paper, copyright Susan
Cohen. Used with permission.
LAN files:
Permission to use Landsat TM data sets provided by Space Imaging,
LLC, Denver, Colorado.
saturn:
Public domain image courtesy of NASA, Voyager 2 image, 1981-08-24,
NASA catalog #PIA01364
solarspectra:
Solar spectra image courtesy of Ann Walker, Boston University.
See also COLORSPACES, IMAGES, IMAGESLIB, IMUITOOLS, IPTFORMATS, IPTUTILS.
The output also shows you the image credit information for the images that are not copyrighted by MathWorks.
I'd like to welcome back guest blogger Brendan Hannigan, for the third in a series of three posts on working with very large
images in MATLAB.
In the previous two blog posts, I've been discussing how to avoid Out of Memory errors while working with large images using
MATLAB and the Image Processing Toolbox. I first showed how to view and explore arbitrarily large images by creating a reduced
resolution data set (R-Set) from an image file. Next, I demonstrated how you can process large images files using a file-to-file
workflow, never loading the entire image into memory at once.
"Right, but my data is not in TIFF, NITF, or JPEG2000, remember?"
Yes, there's the problem. rsetwrite & blockproc support a few file formats "natively", but not everyone works with data in those formats.
There are some issues we face when creating functions like rsetwrite & blockproc which allow incremental processing of files from disc. Here are two:
Not all file formats are amenable to incremental "region-based" I/O.
There are a lot of file formats. Seriously.
That said, we wanted to provide this large data workflow to as many of out customers as possible. So, as of release R2010a,
in addition to our "built-in" file formats, both rsetwrite and blockproc also support "image adapter" objects!
The ImageAdapter is an object-oriented MATLAB class. It is actually an "abstract" class, meaning that by itself it is not very useful. What
it does do it define an interface for reading and writing image data. All you have to do is to tell us how to read and/or write
sub-regions of your particular file format, and then we can do the rest!
"Whoa there buddy!.. Object - "Oriented" ?! That's too complicated for me!"
No, it's not. Don't sweat it, it's really not. I won't go into a full tutorial on how to write MATLAB classes in this blog
as there are excellent videos and tutorials available on our website and in our product documentation that cover that. Instead,
I will walk through a quick example that recently came up in this very blog.
"Doug" was frustrated because he found that there was no way to tell blockproc that he wanted to process the Nth "page" in his TIFF file. blockproc is hard-wired to process the first page of a TIFF file when passed a multi-page TIFF image. We didn't provide a syntactic
option to select which page to process, because we wanted to avoid format-specific syntaxes/parameters in blockproc. Otherwise you can imagine the function interface could get pretty complex pretty fast.
"Ok so, how did Doug solve his problem?"
This was a perfect use case for the ImageAdapter class. Image adapter objects are useful when you want to have more control over the I/O in blockproc and rsetwrite. You may want to just control some specific aspect of how your file is read/written (like Doug) or you might want to read/write
a completely new file format.
I wrote Doug a quick image adapter class to solve his problem which I will share with you, but first let's look at how it
is used in a quick example. The image adapter class is called PagedTiffAdapter. We'll be using it to work with a multi-paged TIFF image, mri.tif (download link).
% Get some image information, we'll need this later.
filename = 'mri.tif';
page = 5;
info = imfinfo(filename);
cmap = info(page).Colormap;
% Create our PagedTiffAdapter object!
my_adapter = PagedTiffAdapter(filename,page);
% Let's not "do" anything to the data, let's just read it and return it
no_op_fun = @(bs) bs.data;
% Call blockproc using our image adapter object as the input source
single_page = blockproc(my_adapter,[100 100],no_op_fun);
% Display our single page from this TIFF file
imshow(single_page,cmap)
Voila. That's pretty simple right? We've now used blockproc to read in the 5th page of our multi-page TIFF file, mri.tif. Granted, this is not a particularly compelling use of block processing, but I'm just trying to show how you can use image
adapter objects in place of "conventional" input images.
Let's have a look at the class now.
classdef PagedTiffAdapter < ImageAdapter
properties
Filename
Info
Page
end
methods
function obj = PagedTiffAdapter(filename, page)
obj.Filename = filename;
obj.Info = imfinfo(filename);
obj.Page = page;
obj.ImageSize = [obj.Info(page).Height obj.Info(page).Width];
end
function result = readRegion(obj, start, count)
result = imread(obj.Filename,'Index',obj.Page,...
'Info',obj.Info,'PixelRegion', ...
{[start(1), start(1) + count(1) - 1], ...
[start(2), start(2) + count(2) - 1]});
end
function result = close(obj) %#ok
end
end
end
"Ok, that's some pretty dense code you have there."
The class is quite straightforward. It begins with a classdef line, which defines the name of the class and also indicates that the class inherits from our base-class, ImageAdapter, using the < symbol.
Next we see 2 sub-sections of our class definition, a properties block which holds important data that we will need over the lifespan of each object...
properties
Filename
Info
Page
end
...and a methods block which defines the behavior of the objects.
methods
function obj = PagedTiffAdapter(filename, page)
obj.Filename = filename;
obj.Info = imfinfo(filename);
obj.Page = page;
obj.ImageSize = [obj.Info(page).Height obj.Info(page).Width];
end
function result = readRegion(obj, start, count)
result = imread(obj.Filename,'Index',obj.Page,...
'Info',obj.Info,'PixelRegion', ...
{[start(1), start(1) + count(1) - 1], ...
[start(2), start(2) + count(2) - 1]});
end
function result = close(obj) %#ok
end
end
"How do you know what properties and methods you need?"
Ahh, good question. Classes which inherit from the ImageAdapter base class are REQUIRED to have:
A class constructor for initialization (all MATLAB classes require this)
a readRegion method (required by the ImageAdapter base class)
a close method (required by the ImageAdapter base class)
a ImageSize property (required by the ImageAdapter base class)
"Hey wait, you don't define ImageSize in your properties block!!"
That is true. The ImageSize property is defined in the base class, so you don't have to redefine it here, you just have to set it. That's why at the end of my class constructor, I make sure to set the ImageSize property to be the size of the page that I am interested in.
"Well that's not super intuitive, but ok. What goes inside the methods?"
This class is going to be used to read data from a TIFF file, so in the class constructor all we do is gather information
about the file that we'll need later and store that information in the appropriate properties.
Let's look the other methods individually. First the close method.
function result = close(obj) %#ok
end
The close method, in this case does nothing. The reason it does nothing is that we are doing our actual file I/O using the imread function, which does not require us to open the file handle directly. If we were writing an image adapter to read say, some
arbitrarily formatted binary image, then we would likely be opening our file handle in the constructor, storing it in a property,
and then in the close method we would close the file handle and do any other necessary clean up tasks. This example class is just very simple,
so we have no "cleaning up" to do when we are done, but if we did, we would put that code in our close method. Regardless of what clean up code you need, you must have a close method. The close method is called by blockproc and rsetwrite only once, after all file I/O has completed.
Now let's look closely at the readRegion method.
function result = readRegion(obj, start, count)
result = imread(obj.Filename,'Index',obj.Page,...
'Info',obj.Info,'PixelRegion', ...
{[start(1), start(1) + count(1) - 1], ...
[start(2), start(2) + count(2) - 1]});
end
This is the real work horse of the class. You will probably never need to call this method (or any method other than the
constructor) yourself. Instead the image adapter clients will call these functions. blockproc is going to call this readRegion method when it wants to read a block of the input image. It's your job to figure out which block it needs, read that data,
and send it back to it.
"How am I supposed to know which block it needs?"
It'll tell you! It's all contained in the 2 input arguments to the readRegion method, start and count. The start argument is a 2-element vector specifying the [row col] of the first pixel we need. The count argument is a 2-element vector specifying the size of the requested region in [rows cols].
For example, let's say start was [5 9] and count was [2 3]. Your method should return the data in rows 5-6 and columns 9-11, just as if you had indexed a variable like this:
return_to_blockproc = myVariable(5:6,9:11);
So your implementation of readRegion needs to take these 2 input arguments and then return the appropriate image data that they specify.
What will that mean for you? Well it depends on what your image adapter is designed for. In this case we are simply reading
a specific page of a TIFF file, so I'm using start and count to construct a 'PixelRegion' argument to the imread function that will fetch only the pixels I am interested in. In your case, your readRegion might be pulling data from some binary formatted file, using a 3rd party mex-file to read a piece of data, or just whatever!
That's the beauty of the image adapters, you can get your data from whereever you want.
"Ok I think I got it, but what about writing to new files?"
You can use your class to write to files as well, by defining a writeRegion method. The writeRegion method is (not surprisingly) almost the exact opposite of the readRegion method. Instead of you returning data from a specified block, you will receive data as an input argument and then write
that data to our image at a specified block location.
After you write a writeRegion method, you can then use objects of your class as 'Destination' parameters to blockproc, allowing you to both read and write to arbitrarily large files of arbitrary format!
Specifying a writeRegion method is completely optional, but you must specify it if you want to use your image adapter object as a 'Destination' in blockproc. Otherwise, your objects will be "read-only".
In the spirit of brevity, I'm not going to do that for our PagedTiffAdapter. I've found that writeRegion methods are often more complex than their readRegion counterparts. I'll leave that as an exercise for the reader.
"Cool, everything turned out better than expected!"
Thanks! Well that about wraps it up. If you want to learn more about writing and using image adapter objects there is a
chapter in the Image Processing Toolbox Users' Guide called "Working with Data in Unsupported Formats". There we walk thought
writing an adapter for a more complex binary formatted file. You can also see the documentation for blockproc, rsetwrite, and ImageAdapter.
Thanks again Steve for letting me talk about this stuff! Have a great day!
Earlier this week I received a text message from my oldest son, who just started his second year at Northeastern University. He wrote to tell me that he’ll be using MATLAB in one his classes. I don’t know if he was excited, but I certainly was! This put me in a reflective mood, thinking about my own career path, engineering education, and how MATLAB has been woven throughout. So I’d like to present a rough timeline of thirty years of engineering education and MATLAB as seen through my own personal perspective.
1982
I arrived at Georgia Tech as an electrical engineering (EE) student. That year I took a required engineering programming course using Fortran, as well as a data structures and algorithms course using Pascal. Cleve’s Fortran-based MATLAB was being distributed informally to his network of academic colleagues on mag tape reels, but learning about MATLAB was still a few years away for me.
1984
Jack started The MathWorks based on the twin ideas that MATLAB would be very useful in controls engineering and that the new IBM PC would eventually catch on among engineers.
I only remember one undergraduate EE course in which we used a software package. It was some flavor of Spice, the circuit simulator.
By that time I had decided to go straight into graduate school as soon as I finished my Bachelor’s degree. I wanted to become a professor.
1986
In a two-week period over the summer, I went from collecting my diploma to starting classes in the Ph.D. program at Georgia Tech. I connected with my thesis advisor and was soon working on active research projects in image processing. I flirted briefly with writing image processing programs in Pascal. Then I switched to C after reading a book about it (while I was supposed to be paying attention in my Calculus of Variations class).
1987
Jim McClellan joined the Georgia Tech faculty in the Digital Signal Processing research group. He was an early MATLAB convert. He got a few grad students together to make a group purchase. I still have my disks:
MATLAB quickly became popular in the DSP group. It did not yet have image display capabilities, though, and I continued to write my image processing code in C.
1988
I wrote a package of “Digital Signal Processing Utilities” for my thesis advisor to use in his undergraduate DSP class. (The fact that I spent a term working enthusiastically on this project instead of my thesis research was an early but unrecognized clue about my eventual career path.) This software was revised by other students later and then incorporated into the book Introduction to Digital Signal Processing: A Computer Laboratory Textbook by Smith and Mersereau. Computer-based textbooks were becoming more common at this time because of the increasing availability of PC labs on campuses, but textbook authors still shied away from relying on commercial software in their textbooks. Home-brew software was common. That started to change rapidly with ...
1989
... the publication of The Student Edition of MATLAB. This book came with a limited-capability version of MATLAB intended for student use. The availability of this inexpensive MATLAB version dramatically accelerated the popularity of MATLAB for use in engineering education. Today there are more than 1400 MATLAB and Simulink based books listed on the MathWorks web site.
1990
I finished my Ph.D. and joined the faculty of the University of Illinois at Chicago. For office equipment I requested a Sun Sparcstation and a copy of MATLAB. MATLAB 4 was released at around this time, although it took a while for MATLAB 4 to be ported to all of the various platforms. MATLAB 4 sported a new graphics system that, among other things, finally had image display. I began to use MATLAB more often for my own research work, and I used the Student Edition in a course.
Around this time I received a letter from MathWorks announcing the imminent release of a new product called “Simulab.” I really wish I had kept this letter. It would have been a collector’s item, because Simulab had to be renamed to Simulink before it was released.
1993
Three years later, academia wasn’t working for me as well as I’d hoped. I tried desperately to think of something I might be good at for a commercial company. MATLAB came to mind.
I eagerly signed up to be a beta tester for the first version of the Image Processing Toolbox. Just after the beta program ended, I interviewed for a job at MathWorks. When the job offer came, I carefully and thoughtfully considered it (for about a minute) and said yes. I left academia and the Chicago cold for software development and the Boston cold. MathWorks at that time had about 190 employees. (Today the number is more than 2200.)
1993–today
Throughout my MathWorks career I’ve continued to be engaged with engineering education. For several years I was on our internal “advisory group” for education. In the mid-1990s I attended the first IEEE International Conference on Image Processing (ICIP). For a special session on education I coauthored (with Mike Orchard) a paper on using MATLAB and C for teaching image processing. I was a cheerleader for the idea of removing limitations from the Student Version of MATLAB so that it could handle image processing problems well. (Today's Student Version includes the complete Image Processing Toolbox!)
Sometime around 2001 or so, Rafael Gonzalez (lead author of Digital Image Processing) visited MathWorks and I had a chance to meet him. He immediately bent my ear about the idea of writing a MATLAB based textbook on image processing. In 2004, the first edition of Digital Image Processing Using MATLAB was finally published. A few years later, when I was visiting universities with my son who was about to graduate high school, I would sometimes sneak away during the campus tour to see if the university library had the book.
In 2006 the ICIP conference was held in Atlanta, my old stomping ground. I was eager to attend, and I arranged to give a special session on software development tools and techniques that I learned at MathWorks and that I thought would be useful to engineering researchers. I was able to reconnect with many of my academic colleagues and had a really great time. Over the next couple of years I traveled around to 7-8 universities to give that talk, mostly to graduate departments.
At one of those university visits, I met up with my thesis advisor. He told me that he felt engineering students knew less about programming today than they used to. Engineering programming was getting squeezed out of the curriculum. Increasingly, students these days receive only a very brief introduction to programming, often using MATLAB. This is consistent with the trends that we have seen at MathWorks.
During the last couple of years I’ve been much more involved in the design of MATLAB, and as a result I’ve been doing somewhat less image processing work. We are trying to understand our users and their needs better at all levels, including our student users who come to MATLAB with relatively little or no programming experience. We are working hard to make the MATLAB experience as good as we can for all of our users.
It’s been a true joy of my MathWorks career that the tools I help to design and build have had (and continue to have) such a beneficial impact on engineering and science education worldwide. To all the students out there just now settling down to your fall classes, I offer my best wishes for your success. Have fun and keep up with the school work!
An experienced user of our products recently told us we got it wrong when we named bwareaopen. This function removes foreground objects from a binary image that are smaller in area than a given threshold. The user said
(paraphrased) that "bwareaopen isn't doing an opening at all, it's just doing connected-component labeling. And I would look
for some variation of 'size filter' in the name or description, and that doesn't appear at all."
This is a good example of a terminology question that arises in toolbox design fairly often. When do we use a term that is
familiar to specialists in the area but might not be familiar to other users? In this case, the term area opening is familiar to those who know a lot about mathematical morphology as applied to image processing.
In morphology, this operation is indeed called an area opening, and that's why I named the function this way about ten years
ago. The morphology folks have a fairly general notion of openings. Area opening is a type of opening they call an "algebraic
opening." Such openings are characterized by the following properties:
Increasing. Operator \( T\{\cdot\} \) is increasing if, for all images \( f \) and \( g \), \( f <= g \) implies \( T\{f\} <= T\{g\} \).
Anti-extensive. Operator \( T\{\cdot\} \) is anti-extensive if, for all images \( f \), \( T\{f\} <= f \).
Idempotent. Operator \( T\{\cdot\} \) is idempotent if, for all images \( f \), \( T\{T\{f\}\} = T\{f\} \).
Morphological opening (erosion followed by dilation), which is the most familiar kind of opening, satisfies these properties,
and so does the area opening.
That said, I appreciate the merit in considering a different term that is less "jargony." We might kick around the idea of
renaming this function.
Readers, do you have any suggestions? I have an idea that I kind of like, but I'd like to hear what you have to say first.
Today I'm starting an regular, occasional series with tutorial material on digital image processing using MATLAB. I'm going
to look at topics in roughly the order used in the book Digital Image Processing Using MATLAB, Gatesmark Publishing, 2009, by Gonzalez, Woods, and Eddins. (Yes, that's me. It was a lot of nights and weekends.)
Let's start with digital image representation. (I do not plan to cover image formation. For that topic, see Chapter 2 of Digital Image Processing, Prentice Hall, 2008, by Gonzalez and Woods.) The common mathematical representation of an image is a function of two continuous
spatial coordinates: \( f(x,y) \). The value \( f(x_0,y_0) \) is often called the image intensity at \( (x_0,y_0) \), although that term is used loosely because it often does not represent actual light intensity. The term gray level is also commonly used.
You can represent a color image mathematically in a couple of different ways. One way is to use a collection of image functions,
one per color component, such as \( r(x,y) \), \( g(x,y) \), and \( b(x,y) \). Another way is to use a vector-valued function, \( {\mathbf f}(x,y) \).
Many functions in the Image Processing Toolbox also support higher-dimensional images. For example, \( f \) might be a function of three spatial variable, as in \( f(x,y,z) \). Image Processing Toolbox documentation calls this a multidimensional image.
Converting the amplitude values of \( f \) to a set of discrete values is called quantization. Capturing those values at discrete spatial coordinates is called sampling. When \( x \), \( y \), and the values of \( f \) are all finite, discrete quantities, we call f a digital image.
Coordinate conventions
It's important to pay attention to different coordinate system conventions that may be used in image processing. Often the
center of the upper-left pixel is considered to be the origin, (0,0). There is more variation in the assignment of \( x \) and \( y \) axes. The coordinate system in the diagram below, with the \( x \)-axis pointing down and the \( y \)-axis pointing to the right, is used in Digital Image Processing Using MATLAB in order to be consistent with the Gonzalez and Woods book, Digital Image Processing.
From Figure 2.1, Digital Image Processing Using MATLAB, 2nd ed. Used with permission.
When displaying images in MATLAB, the usual convention is for the center of the upper-left pixel to be at (1,1), the \( x \)-axis to point to the right, and the \( y \)-axis to point down.
Images as matrices and arrays
Digital images are very conveniently represented as matrices, which happens to be great for working with in MATLAB. A monochrome
image matrix looks like this:
(This happy marriage of a convenient digital image representation and the MATLAB strength at working with matrices is, more
or less, the reason I ended up working at MathWorks.)
We represent color images in MATLAB as multidimensional arrays. For example, an RGB image (with three color components) is
represented as an M-by-N-by-3 array. A CMYK image (with four color components) would use an M-by-N-by-4 array.
Multidimensional images are also represented in MATLAB as multidimensional arrays. Therefore we rely on context to distinguish
between an RGB image (M-by-N-by-3) and a three-dimensional image with three z-plane slices (also M-b-N-by-3). For example,
if you pass an M-by-N-by-3 array to rgb2gray, it is clear from the context that the input should be interpreted as an RGB color image and not as a three-dimensional image.
Steve Eddins is a software development manager in the MATLAB and image processing areas at MathWorks. Steve coauthored Digital Image Processing Using MATLAB. He writes here about image processing concepts, algorithm implementations, and MATLAB.
Recent Comments