Steve on Image Processing

April 24th, 2009

Locating the US continental divide, part 1 - Introduction

Today I'm introducing a series on computing the location of the US continental divide. (I was inspired to tackle this topic by Brian Hayes' blog post on the divide; thanks very much to Mike and Ned for sending me the link.)

The series will start with importing global elevation data for Earth, and it will finish with this visualization of the Atlantic and Pacific catchment basins for the continental US:

My regular blog readers might be happy to know that this time I've planned and written out the whole series in advance. See below for the series summary.

In this first post, I'll tackle data import and display.

The data for this project comes from the Global Land One-km Base Elevation Project, available from the National Geophysical Data Center. The global elevation data set is available online in the form of 16 tiles, labeled A through P. Tile E and and tile F cover the continental United States, Central America, and portions of Canada and South America.

After you've downloaded and uncompressed the files, you have to figure out how to import them. As it turns out, these files are not in a structured format, such as TIFF or HDF, that can be read automatically. Instead, you need to poke around the data documentation on the web to find out how to read the files. This is fairly common with scientific data sources. The documentation for this data set includes a section called "Data Format & Importing GLOBE Data." By perusing this section I found that the files I downloaded contain 16-bit signed integers organized in a raster of 6000 rows and 10800 columns, stored by row. There are no header bytes, and the integers are stored using little-endian byte order.

The appropriate MATLAB function to use is multibandread.

data_size = [6000 10800 1];  % The data has 1 band.
precision = 'int16=>int16';  % Read 16-bit signed ints into a int16 array.
header_bytes = 0;
interleave = 'bsq';          % Band sequential. Not critical for 1 band.
byte_order = 'ieee-le';

E = multibandread('e10g', data_size, precision, header_bytes, ...
    interleave, byte_order);
F = multibandread('f10g', data_size, precision, header_bytes, ...
    interleave, byte_order);

dem = [E, F];  % dem stands for "Digital Elevation Model"

Next I'll use imshow, using some optional inputs, to display the digital elevation model as an image.

imshow(dem, [], 'InitialMagnification', 'fit')

About those optional inputs: When displaying gridded data as an image, it is often useful to control the black-to-white display range. If you supply a vector as the second input to imshow, like this:

   imshow(dem, [a b])

then imshow uses [a b] as the image display range. A value of a gets displayed as black, a value of b gets displayed as white, and values in-between are displayed as shades of gray.

For the special case where the second input is empty:

   imshow(dem, [])

imshow uses the data range as the display range. That is, imshow(dem, []) is equivalent to imshow(dem, [min(dem(:)), max(dem(:))]).

The other optional inputs, 'InitialMagnification', 'fit', tell imshow to magnify or shrink the image display as needed to fit nicely within the current figure window. I'm using that option here so that my image displays will fit in the column for this blog post.

For this data, the autoscaling provided by the imshow(dem, []) syntax has given us an image display that is too dark, particularly on the eastern part of the continent. Let's set a white level that's lower than the data maximum; that will have the effect of brightening the image.

min(dem(:))
ans =

   -500

max(dem(:))
ans =

   6085

display_range = [-500 3000];
imshow(dem, display_range, 'InitialMagnification', 'fit')

Next I'll crop the image. The Image Processing Toolbox has interactive cropping tools (imcrop, imtool), but I can't use those here.

dem_cropped = dem(1:4000, 6000:14500);
imshow(dem_cropped, display_range, 'InitialMagnification', 'fit')

Finally, I'll save dem_cropped to a MAT-file for use in the rest of the series.

save continental_divide dem_cropped

About this Series

This series explores the problem of computing the location of the continental divide for the United States. The divide separates the Atlantic and Pacific Ocean catchment basins for the North American continent.

As an algorithm development problem, computing the divide lets us explore aspects of data import and visualization, manipulating binary image masks, label matrices, regional minima, and the watershed transform.

  • Part 1 - Introduction. Data import and display. multibandread, imshow.
  • Part 2 - Watershed transform. watershed, label2rgb.
  • Part 3 - Regional minima. imerode, imregionalmin.
  • Part 4 - Ocean masks. binary image manipulation, bwselect.
  • Part 5 - Minima imposition. imimposemin.
  • Part 6 - Computing and visualizing the divide. watershed, label2rgb, bwboundaries.
  • Part 7 - Putting it all together. One script that does everything, from data import through computation and visualization of the divide.

Data credit: GLOBE Task Team and others (Hastings, David A., Paula K. Dunbar, Gerald M. Elphingstone, Mark Bootz, Hiroshi Murakami, Hiroshi Maruyama, Hiroshi Masaharu, Peter Holland, John Payne, Nevin A. Bryant, Thomas L. Logan, J.-P. Muller, Gunter Schreier, and John S. MacDonald), eds., 1999. The Global Land One-kilometer Base Elevation (GLOBE) Digital Elevation Model, Version 1.0. National Oceanic and Atmospheric Administration, National Geophysical Data Center, 325 Broadway, Boulder, Colorado 80305-3328, U.S.A. Digital data base on the World Wide Web (URL: http://www.ngdc.noaa.gov/mgg/topo/globe.html) and CD-ROMs.


Get the MATLAB code

Published with MATLAB® 7.8

13 Responses to “Locating the US continental divide, part 1 - Introduction”

  1. Tom Elmer replied on :

    Let’s set a black level that’s lower than the data minimum; that will have the effect of brightening the image.

    Aren’t you doing the opposite? Setting a white level that is lower than the data maximum. (i.e. shifting things towards white to make them visible)

  2. Steve replied on :

    Tom—Sharp eyes! I was in the process of fixing this statement at the same time you were entering your comment. Yes, you’re totally right. In an early draft of this post, I was using display ranges like [-2000 6000]. But as I was making some final edits, I still wasn’t happy with the visual appearance of the display, so I tinkered with the display range to get something better. Unfortunately I forgot to edit the accompanying text to match the code.

  3. Tom Elmer replied on :

    At least it shows people are interested enough to pay attention :)

    I can’t wait for the rest of the series. (I do not think I have ever needed to use the watershed transform before, and I am interested in what it can do.)

  4. Rich Hoeg replied on :

    Working with the continental divide in the Rockies is too easy! Your next assignment … map all three major continental watershed divides found in Minnesota! There is one point in northern Minnesota where all three come together (Hudson Bay, St. Lawrence River, Mississippi River). Cheers!

  5. Thai Hoang replied on :

    My Matlab runs out of memory with this new post.

  6. Steve replied on :

    Thai—Yes, these data sets are pretty big. I used a 64-bit Linux computer with something like 6 GB of RAM.

  7. T replied on :

    What’s the reason for using multibandread instead of fread? I tried it on different (though relatively large) images that also have one “band” and it seems slower than fread…

  8. Steve replied on :

    T—You can use fread if you like. multibandread is more convenient to use for many “raw” files, but these are simple enough that fread is pretty easy.

  9. Kazım replied on :

    Very beatiful work.Thanks for shared.

  10. Matteo replied on :

    Hi Steve, great Series. I am trying to use it on tiles A and B to get Canadian continental divide. But I have problems loading data with fread. When I display with imshow it is not right. Upon inspection I noticed columns are arranged in a series 1 to 8 with 9th column the one that should follow column 1. Here is my code.

    fid = fopen('b10g','r',byte_order);
    [B] = fread(fid,[4800 10800],precision);
    fid1 = fopen('a10g','r', byte_order);
    [A] = fread(fid1,[4800 10800],precision);
    dem = [A,B];
    dem_cropped = dem(1500:4000, 6000:16500);
    imshow(dem_cropped, display_range);
    

    Do you notice anything wrong?
    Thank you so much for any advice.

  11. Steve replied on :

    Matteo‐Values in the data files are stored by row, whereas your fread call is filling the output matrix by column. There’s a mismatch between the number of columns in the input file and the number of rows you specified in the call to fread, so the data is going to wrap around in funny ways. Try something like this:

    B = fread(fid, [10800 4800], precision);
    B = B';
    

    multibandread takes care of such details for you.

  12. Matteo replied on :

    Thank you Steve. Beginner’s mistakes…

  13. Steve replied on :

    Matteo—No problem. That issue gets even experienced users from time to time.

Leave a Reply

Wrap code fragments inside <pre> tags, like this:

<pre class="code">
a = magic(3);
sum(a)
</pre>

If you have a "<" character in your code, either follow it with a space or replace it with "&lt;" (including the semicolon).


Steve Eddins manages the Image & Geospatial development team at The MathWorks and coauthored Digital Image Processing Using MATLAB. He writes here about image processing concepts, algorithm implementations, and MATLAB.

  • Sana: hi steve, could you explain to me how i would be able to use the dir function, to do a loop through a directory...
  • Nishtha: Sir, I have preprocessed the image in following steps: [1] adaptive histogram equalization [2] thresholding...
  • Kristof: I also strongly support the idea. I have just recently bumped into the problem that im2single was not...
  • Steve: David—I’ m glad you found it useful!
  • David Lalejini: I found your example very useful for finding connected nodes in a large set of input pairs. I start...
  • tommy: Dear Steve, I have a question,please if you are kind to help me regarding the accumulator array dimensions of...
  • Steve: Abc—I don’t know how to distinguish the faces. You might try posting your question in the MATLAB...
  • Manju: well if we have a few ovals within each other like in a cell how do we measure the distance from the center...
  • Steve: Manju—What do you mean? How is each region defined?
  • Manju: if we have 2-3 regions within each other how do we measure the regions of each one?

These postings are the author's and don't necessarily represent the opinions of The MathWorks.