Dealing with “Really Big” Images: Image Adapters

Posted by Steve Eddins, September 23, 2011


I'd like to welcome back guest blogger Brendan Hannigan, for the third in a series of three posts on working with very large images in MATLAB.

In the previous two blog posts, I've been discussing how to avoid Out of Memory errors while working with large images using MATLAB and the Image Processing Toolbox. I first showed how to view and explore arbitrarily large images by creating a reduced resolution data set (R-Set) from an image file. Next, I demonstrated how you can process large image files using a file-to-file workflow, never loading the entire image into memory at once.

"Right, but my data is not in TIFF, NITF, or JPEG2000, remember?"
"Whoa there buddy!.. Object - "Oriented" ?! That's too complicated for me!"
"Ok so, how did Doug solve his problem?"
"Ok, that's some pretty dense code you have there."
"How do you know what properties and methods you need?"
"Hey wait, you don't define ImageSize in your properties block!!"
"Well that's not super intuitive, but ok. What goes inside the methods?"
"How am I supposed to know which block it needs?"
"Ok I think I got it, but what about writing to new files?"
"Cool, everything turned out better than expected!"

"Right, but my data is not in TIFF, NITF, or JPEG2000, remember?"

Yes, there's the problem. rsetwrite & blockproc support a few file formats "natively", but not everyone works with data in those formats.

There are some issues we face when creating functions like rsetwrite & blockproc, which allow incremental processing of files from disk. Here are two:

Not all file formats are amenable to incremental "region-based" I/O.
There are a lot of file formats. Seriously.

That said, we wanted to provide this large data workflow to as many of our customers as possible. So, as of release R2010a, in addition to our "built-in" file formats, both rsetwrite and blockproc also support "image adapter" objects!

The ImageAdapter is an object-oriented MATLAB class. It is actually an "abstract" class, meaning that by itself it is not very useful. What it does do is define an interface for reading and writing image data. All you have to do is tell us how to read and/or write sub-regions of your particular file format, and then we can do the rest!

"Whoa there buddy!.. Object - "Oriented" ?! That's too complicated for me!"

No, it's not. Don't sweat it, it's really not. I won't go into a full tutorial on how to write MATLAB classes in this blog as there are excellent videos and tutorials available on our website and in our product documentation that cover that. Instead, I will walk through a quick example that recently came up in this very blog.

In 2009 Steve published a blog post titled "MATLAB R2009a - imread and multipage TIFFs". Over the last 2 years many folks have commented on this post in an ongoing discussion about multi-page TIFF files. One customer, we'll call him "Doug", had a problem using blockproc to process arbitrary pages of a multi-page TIFF file.

"Doug" was frustrated because he found that there was no way to tell blockproc that he wanted to process the Nth "page" in his TIFF file. blockproc is hard-wired to process the first page of a TIFF file when passed a multi-page TIFF image. We didn't provide a syntactic option to select which page to process, because we wanted to avoid format-specific syntaxes/parameters in blockproc. Otherwise you can imagine the function interface could get pretty complex pretty fast.

"Ok so, how did Doug solve his problem?"

This was a perfect use case for the ImageAdapter class. Image adapter objects are useful when you want to have more control over the I/O in blockproc and rsetwrite. You may want to just control some specific aspect of how your file is read/written (like Doug) or you might want to read/write a completely new file format.

I wrote Doug a quick image adapter class to solve his problem, and I will share it with you, but first let's look at how it is used in a quick example. The image adapter class is called PagedTiffAdapter. We'll be using it to work with a multi-page TIFF image, mri.tif (download link).

% Get some image information, we'll need this later.
filename = 'mri.tif';
page = 5;
info = imfinfo(filename);
cmap = info(page).Colormap;

% Create our PagedTiffAdapter object!
my_adapter = PagedTiffAdapter(filename,page);

% Let's not "do" anything to the data, let's just read it and return it
no_op_fun = @(bs) bs.data;

% Call blockproc using our image adapter object as the input source
single_page = blockproc(my_adapter,[100 100],no_op_fun);

% Display our single page from this TIFF file
imshow(single_page,cmap)

Voila. That's pretty simple, right? We've now used blockproc to read in the 5th page of our multi-page TIFF file, mri.tif. Granted, this is not a particularly compelling use of block processing, but I'm just trying to show how you can use image adapter objects in place of "conventional" input images.
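If you want something a little more interesting than a no-op, here is a minimal sketch that returns one number per block instead of the pixels themselves. It assumes PagedTiffAdapter.m (the class shown in this post) is on your path and mri.tif is in the current folder. The 32-by-32 block size and the variable names are arbitrary choices for illustration.

```matlab
% Assumes PagedTiffAdapter.m is on the path and mri.tif is in the
% current folder. Block size and names are illustrative only.
filename = 'mri.tif';
page = 5;
block_size = [32 32];

my_adapter = PagedTiffAdapter(filename, page);

% Return a single value per block: the largest pixel value in the block.
block_max_fun = @(bs) max(bs.data(:));

% The result has one element per block rather than one per pixel.
block_max = blockproc(my_adapter, block_size, block_max_fun);

% With the default options, partial blocks at the edges are still
% processed, so the output size is the page size divided by the block
% size, rounded up.
info = imfinfo(filename);
expected_size = ceil([info(page).Height info(page).Width] ./ block_size);
disp(isequal(size(block_max), expected_size))
```

The adapter is passed exactly where an image matrix or file name would normally go. Nothing else about the blockproc call changes.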

Let's have a look at the class now.

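classdef PagedTiffAdapter < ImageAdapter
    % Image adapter that reads one page of a multi-page TIFF file.
    properties
        Filename   % name of the TIFF file
        Info       % imfinfo output, one struct element per page
        Page       % index of the page to read
    end
    methods
        function obj = PagedTiffAdapter(filename, page)
            % Store everything readRegion will need later.
            obj.Filename = filename;
            obj.Info = imfinfo(filename);
            obj.Page = page;
            % ImageSize is inherited from ImageAdapter and must be set here.
            obj.ImageSize = [obj.Info(page).Height obj.Info(page).Width];
        end
        function result = readRegion(obj, start, count)
            % Read a count(1)-by-count(2) region whose first pixel is start.
            result = imread(obj.Filename,'Index',obj.Page,...
                'Info',obj.Info,'PixelRegion', ...
                {[start(1), start(1) + count(1) - 1], ...
                [start(2), start(2) + count(2) - 1]});
        end
        function result = close(obj) %#ok
            % Nothing to clean up: imread opens and closes the file itself.
        end
    end
end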

"Ok, that's some pretty dense code you have there."

The class is quite straightforward. It begins with a classdef line, which defines the name of the class and also indicates that the class inherits from our base-class, ImageAdapter, using the < symbol.

Next we see 2 sub-sections of our class definition, a properties block which holds important data that we will need over the lifespan of each object...

    properties
        Filename
        Info
        Page
    end

...and a methods block which defines the behavior of the objects.

    methods
        function obj = PagedTiffAdapter(filename, page)
            obj.Filename = filename;
            obj.Info = imfinfo(filename);
            obj.Page = page;
            obj.ImageSize = [obj.Info(page).Height obj.Info(page).Width];
        end
        function result = readRegion(obj, start, count)
            result = imread(obj.Filename,'Index',obj.Page,...
                'Info',obj.Info,'PixelRegion', ...
                {[start(1), start(1) + count(1) - 1], ...
                [start(2), start(2) + count(2) - 1]});
        end
        function result = close(obj) %#ok
        end
    end

"How do you know what properties and methods you need?"

Ahh, good question. Classes which inherit from the ImageAdapter base class are REQUIRED to have:

a class constructor for initialization (MATLAB supplies a default one if you leave it out, but you will want your own here so you can set ImageSize)
a readRegion method (required by the ImageAdapter base class)
a close method (required by the ImageAdapter base class)
an ImageSize property (required by the ImageAdapter base class)

"Hey wait, you don't define ImageSize in your properties block!!"

That is true. The ImageSize property is defined in the base class, so you don't have to redefine it here; you just have to set it. That's why, at the end of my class constructor, I make sure to set the ImageSize property to the size of the page that I am interested in.

"Well that's not super intuitive, but ok. What goes inside the methods?"

This class is going to be used to read data from a TIFF file, so in the class constructor all we do is gather information about the file that we'll need later and store that information in the appropriate properties.

Let's look at the other methods individually. First, the close method.

        function result = close(obj) %#ok
        end

The close method, in this case, does nothing. The reason is that we are doing our actual file I/O using the imread function, which does not require us to open a file handle directly. If we were writing an image adapter to read, say, some arbitrarily formatted binary image, then we would likely open our file handle in the constructor, store it in a property, and then close the file handle and do any other necessary clean-up in the close method. This example class is very simple, so we have no "cleaning up" to do when we are done. Regardless of what clean-up code you need, you must have a close method. The close method is called by blockproc and rsetwrite only once, after all file I/O has completed.
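To make the "open in the constructor, clean up in close" pattern concrete, here is a minimal sketch of an adapter for a made-up format: a headerless file of uint8 pixels stored column by column. The class name RawBinaryAdapter, the file layout, and the requirement that the caller pass in the image size are all assumptions for illustration. None of this ships with the toolbox.

```matlab
classdef RawBinaryAdapter < ImageAdapter
    % Hypothetical adapter for a headerless file of uint8 pixels stored
    % in column-major order. The format and class name are assumptions.
    properties
        Fid   % file handle, opened in the constructor
    end
    methods
        function obj = RawBinaryAdapter(filename, imageSize)
            % Open the file once and keep the handle for later reads.
            obj.Fid = fopen(filename, 'r');
            if obj.Fid == -1
                error('RawBinaryAdapter:openFailed', ...
                    'Could not open %s.', filename);
            end
            % A raw file has no header, so the caller supplies [rows cols].
            obj.ImageSize = imageSize;
        end
        function result = readRegion(obj, start, count)
            result = zeros(count, 'uint8');
            numRows = obj.ImageSize(1);
            for k = 1:count(2)
                col = start(2) + k - 1;
                % Byte offset of pixel (start(1), col); one byte per pixel.
                offset = (col - 1) * numRows + (start(1) - 1);
                fseek(obj.Fid, offset, 'bof');
                result(:, k) = fread(obj.Fid, count(1), 'uint8=>uint8');
            end
        end
        function close(obj)
            % This time there is something to clean up: the open handle.
            fclose(obj.Fid);
        end
    end
end
```

Saved as RawBinaryAdapter.m, an object created with something like RawBinaryAdapter('scan.raw', [4000 6000]) (a hypothetical file and size) could be passed to blockproc in the same way as the PagedTiffAdapter object.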

Now let's look closely at the readRegion method.

        function result = readRegion(obj, start, count)
            result = imread(obj.Filename,'Index',obj.Page,...
                'Info',obj.Info,'PixelRegion', ...
                {[start(1), start(1) + count(1) - 1], ...
                [start(2), start(2) + count(2) - 1]});
        end

This is the real workhorse of the class. You will probably never need to call this method (or any method other than the constructor) yourself. Instead, the image adapter clients will call these methods. blockproc is going to call this readRegion method when it wants to read a block of the input image. It's your job to figure out which block it needs, read that data, and send it back.

"How am I supposed to know which block it needs?"

It'll tell you! It's all contained in the 2 input arguments to the readRegion method, start and count. The start argument is a 2-element vector specifying the [row col] of the first pixel we need. The count argument is a 2-element vector specifying the size of the requested region in [rows cols].

For example, let's say start was [5 9] and count was [2 3]. Your method should return the data in rows 5-6 and columns 9-11, just as if you had indexed a variable like this:

return_to_blockproc = myVariable(5:6,9:11);

So your implementation of readRegion needs to take these 2 input arguments and then return the appropriate image data that they specify.
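The arithmetic is easy to get off by one, so here is a small sketch that spells it out for the request in the example. The variable names and the magic(12) stand-in image are just for illustration.

```matlab
% Example request, using the values from the text above
start = [5 9];    % [row col] of the first pixel
count = [2 3];    % [rows cols] in the requested region

% Inclusive index ranges covered by the request
rows = start(1) : start(1) + count(1) - 1;
cols = start(2) : start(2) + count(2) - 1;

% Stand-in for image data; any matrix of at least 6-by-11 would do
myVariable = magic(12);
return_to_blockproc = myVariable(rows, cols);

% The same ranges written as [first last] pairs, which is the form
% imread's 'PixelRegion' option accepts
pixel_region = {[rows(1) rows(end)], [cols(1) cols(end)]};
```

Note the "- 1" in both ranges: count is a size, not an end index, so the last row is start(1) + count(1) - 1.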

What will that mean for you? Well, it depends on what your image adapter is designed for. In this case we are simply reading a specific page of a TIFF file, so I'm using start and count to construct a 'PixelRegion' argument to the imread function that will fetch only the pixels I am interested in. In your case, your readRegion might be pulling data from some binary formatted file, using a third-party MEX-file to read a piece of data, or just whatever! That's the beauty of the image adapters: you can get your data from wherever you want.
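Although blockproc normally makes these calls for you, calling readRegion by hand is a quick way to check a new adapter before handing it over. The sketch below assumes PagedTiffAdapter.m is on your path and mri.tif is in the current folder, and that the page is at least 6-by-11 pixels. It compares one region against the same pixels indexed out of the full page.

```matlab
% Assumes PagedTiffAdapter.m is on the path and mri.tif is in the
% current folder. The region chosen here is arbitrary.
filename = 'mri.tif';
page = 5;
start = [5 9];
count = [2 3];

adapter = PagedTiffAdapter(filename, page);
region = adapter.readRegion(start, count);
adapter.close();

% Reference answer: read the whole page, then index into it.
full_page = imread(filename, 'Index', page);
expected = full_page(start(1):start(1)+count(1)-1, ...
                     start(2):start(2)+count(2)-1);

% Displays 1 (true) when the adapter returns the right pixels.
disp(isequal(region, expected))
```

Reading the whole page is only sensible for a small test image, of course. For a really big file you would spot-check a few regions some other way.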

"Ok I think I got it, but what about writing to new files?"

You can use your class to write to files as well, by defining a writeRegion method. The writeRegion method is (not surprisingly) almost the exact opposite of the readRegion method. Instead of you returning data from a specified block, you will receive data as an input argument and then write that data to our image at a specified block location.

After you write a writeRegion method, you can then use objects of your class as 'Destination' parameters to blockproc, allowing you to both read and write to arbitrarily large files of arbitrary format!

Specifying a writeRegion method is completely optional, but you must specify it if you want to use your image adapter object as a 'Destination' in blockproc. Otherwise, your objects will be "read-only".

In the spirit of brevity, I'm not going to do that for our PagedTiffAdapter. I've found that writeRegion methods are often more complex than their readRegion counterparts. I'll leave that as an exercise for the reader.
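For a format much simpler than TIFF, though, the idea can be sketched in a few lines. The class below is hypothetical: it writes uint8 pixels to a headerless, column-major binary file, and the name RawBinaryWriter and the file layout are assumptions of mine. The writeRegion(obj, start, data) signature follows the ImageAdapter documentation, where start is the [row col] of the first pixel to write. The constructor fills the file with zeros up front so that later seeks always land inside the file.

```matlab
classdef RawBinaryWriter < ImageAdapter
    % Hypothetical read/write adapter for a headerless, column-major
    % file of uint8 pixels. The format and class name are assumptions.
    properties
        Fid   % file handle, opened in the constructor
    end
    methods
        function obj = RawBinaryWriter(filename, imageSize)
            obj.ImageSize = imageSize;   % [rows cols]
            % 'w+' creates (or truncates) the file for reading and writing.
            obj.Fid = fopen(filename, 'w+');
            if obj.Fid == -1
                error('RawBinaryWriter:openFailed', ...
                    'Could not open %s.', filename);
            end
            % Pre-fill with zeros, one column at a time.
            for k = 1:imageSize(2)
                fwrite(obj.Fid, zeros(imageSize(1), 1, 'uint8'), 'uint8');
            end
        end
        function result = readRegion(obj, start, count)
            result = zeros(count, 'uint8');
            for k = 1:count(2)
                fseek(obj.Fid, obj.offsetOf(start, k), 'bof');
                result(:, k) = fread(obj.Fid, count(1), 'uint8=>uint8');
            end
        end
        function writeRegion(obj, start, data)
            % Mirror image of readRegion: seek to each column, then write.
            % Values are stored as uint8 whatever the class of data.
            for k = 1:size(data, 2)
                fseek(obj.Fid, obj.offsetOf(start, k), 'bof');
                fwrite(obj.Fid, data(:, k), 'uint8');
            end
        end
        function close(obj)
            fclose(obj.Fid);
        end
    end
    methods (Access = private)
        function offset = offsetOf(obj, start, k)
            % Byte offset of the first pixel in the k-th column of a
            % region starting at start; one byte per pixel.
            col = start(2) + k - 1;
            offset = (col - 1) * obj.ImageSize(1) + (start(1) - 1);
        end
    end
end
```

An object of this class could then be passed as the 'Destination' parameter, as in blockproc(src, [100 100], fun, 'Destination', dst), where src, fun, and dst stand for your own input, block function, and RawBinaryWriter object. A real format with headers, compression, or multiple bands would need considerably more care.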

"Cool, everything turned out better than expected!"

Thanks! Well, that about wraps it up. If you want to learn more about writing and using image adapter objects, there is a chapter in the Image Processing Toolbox User's Guide called "Working with Data in Unsupported Formats". There we walk through writing an adapter for a more complex binary formatted file. You can also see the documentation for blockproc, rsetwrite, and ImageAdapter.

Thanks again Steve for letting me talk about this stuff! Have a great day!

Published with MATLAB® 7.13