Steve on Image Processing

Dealing with “Really Big” Images: Block Processing 16

Posted by Steve Eddins,

I'd like to welcome back guest blogger Brendan Hannigan, for the second in a series of three posts on working with very large images in MATLAB.

Contents

Dealing with "Really Big" Images: Block Processing

Hi! This is Brendan Hannigan, back again to continue our discussion on working with very large images in MATLAB. In the previous blog I discussed a couple of different ways to view and explore large images using the Image Processing Toolbox. Today I'll take the next logical step down the "large image workflow" path and we'll explore how to process images that are too large to load into memory.

"Sounds good. let's do this!"

Since the entire image cannot be loaded into memory at one time, we opted for an incremental, file-to-file solution. Basically we want to read a part of your "input" image into memory, process it in some way, and then write the results back to a new file, the "output" image. We continue to do this until the entire image has been processed, avoiding Out of Memory errors.

"This seems familiar to me..."

The flow of data described here is very similar to an existing IPT function, blkproc, which allows for block processing of images, but like most other IPT functions, it only supports in-memory processing. Originally, we considered expanding the scope of blkproc to support file-to-file workflows as well, but there were some syntactic and behavioral issues that I wasn't comfortable with and we would've likely introduced some backwards incompatibilities.

We instead opted for a completely new function in release R2009b. The result? blockproc! Very creative name right? blockproc has all of the capabilities of blkproc and a LOT more!

"Wait, didn't I read on CSSM that blockproc is way slower than blkproc ?"

Ahem... well... Ok fine, yes, it USED to be slower. The initial release of blockproc went out with what I will refer to simply as "performance growth opportunities" (sorry about that).

However, with the release of R2010b, all that has changed! blockproc performance was dramatically improved and is now comparable with blkproc for similar tasks.

Let's have a quick look at how it works!

% read input image
A = imread('peppers.png');
imshow(A);
% define block size and function to run on each block of data
block_size = [64 64];
my_function = @(block_struct) block_struct.data(:,:,[3 2 1]);
% call blockproc!
B = blockproc(A,block_size,my_function);
figure;
imshow(B);

"What just happened?"

What we've done here is swapped the red and blue color channels of our RGB peppers image with some simple indexing. blockproc did this one 64x64 block at a time and assembled the results into a single output image. Things that were "mostly red" became "mostly blue" and vice versa while the purple background remained mostly unchanged.

Basically all blockproc needs is an input image, a block size, and a function handle to run (similar to blkproc);

You may notice that I used an "anonymous function" to create the function handle my_function, but that's not necessary. You can also provide a function handle to a function that's either defined on your path or sitting in your current directory. The only requirement is that the function must accept a "block struct" as it's sole input argument (that we will pass to it, from inside of blockproc), and must return the processed data.

"Uhm... 'block struct'? That seems overly complicated"

The block struct is a MATLAB struct that contains the block's image data (in the .data field) as well as several other pieces of useful information. Most important among these is the .location field, which contains the location (in your input image) where the block came from. This .location information opens up a vast new world of potential uses for blockproc that we won't get into here.

Most of the time you use blockproc you'll probably just use the .data field, but for any operation that changes depending on which block of the image you are processing, the .location field is key, and you'll be thanking me then! You can check out the blockproc doc to learn more about what other information we package into the block struct.

Now for a more "real" example, applying a low-pass filter to an image. First, the conventional way:

% read the original photo into memory
origp = imread('cameraman.tif');
imshow(origp);
% create a Gaussian low-pass filter
h = fspecial('gaussian',5,2);
% compute the derived photo
derp1 = imfilter(origp,h);
imshow(derp1);

Now, with blockproc. We will re-use the original photo, origp, as well as the Gaussian filter.

% create a function handle that we will apply to each block
myFun = @(block_struct) imfilter(block_struct.data,h);
% setup block size
block_size = [64 64];
% compute the new derived photo
derp2 = blockproc(origp,block_size,myFun);
imshow(derp2);

"What's with those lines all over your result?"

What we see here are artifacts from block processing. What happened? Well, as our function, imfilter, processes its input it will require some padding values near the edges of the image. By default, imfilter will use "zero padding" to supply these "pixels" that lie beyond the boundary of the actual image data.

But remember, blockproc processes each block totally independently from its neighbors, so as we processed each block, imfilter was diligently padding each one with zeros, causing these black lines to appear when the results were assembled into our final output image.

"Ok, so how do I fix it?"

Luckily, we thought of that. blockproc has several optional parameters that we can specify to control all aspects of padding and block borders. One in particular is 'BorderSize', which lets us specify some "overlap" between blocks. Since our filter is size 5x5, we need a 2 pixel overlap between all of our blocks. Here's how we do that:

border_size = [2 2];
derp3 = blockproc(origp,block_size,myFun,'BorderSize',border_size);
imshow(derp3);

"Ok great, you fixed it. What does this have to do with large images?"

Right. Here's the "large data" hook: blockproc can take string filenames as its "input image", and can subsequently write the "output" to a file by specifying a new 'Destination' parameter. When you specify a new output 'Destination', blockproc will not return the result image to the MATLAB workspace.

% specify a string filename as our input image
origp_filename = 'cameraman.tif';
% specify a string filename as the 'Destination' of our output data
derp_filename = 'output.tif';
% don't request an output argument this time!
blockproc(origp_filename,block_size,myFun,...
    'BorderSize',border_size,'Destination',derp_filename);
imshow(derp_filename);

Now that the input and output are specified as files instead of workspace variables, you can run this command regardless of image size. Images are read, processed, and written incrementally, one block at a time.

Need to apply a filter on a 3 gigabyte image? No problem! Trying to segment vegetation from a few terabytes of satellite imagery? No problem!

"Hmm, ok yea that's cool. What's the catch?"

There's no catch! Ok, there's a small catch. blockproc only supports reading and writing to TIFF and JPEG2000 format files "natively".

"You're killing me with these file format restrictions! I don't use TIFF!"

Hey don't worry! We have you covered. I said that blockproc only supports TIFF and JPEG2000 "natively". What I meant by that is blockproc has "built-in" support for those file formats, but the function can "adapt" (hint,hint) to many other formats... which I will talk about next time.

Stay tuned!

Thanks, Brendan. -SE


Get the MATLAB code

Published with MATLAB® 7.12

16 CommentsOldest to Newest

WS—No, I don’t know of a general approach. There is some relevant research in the literature on parallel implementations of some global morphological operations.

Regarding the interesting post on blocks and block processing:

Can one skip the processing of a block with little information such as the top left (1,1) block or the top right (1,4) block of the cameraman image (showing only sky) and limit the processing to selected blocks containing image features , e.g. using a colour-gradient criteria or another block-selection threshold or more complex criteria. thank you , gianni

@gianni, you could write a more complicated function to feed into blockproc that decides based on some criteria whether or not to process that block.

If I have a JPEG2000 image that is too large to fit in memory (easily 4GB – 20GB) is there a way I can use the blockproc function to create a single multi-resolution JPEG2000 image ? I realize that I can repeatedly use the blockproc function to create several JPEG2000 images of different resolutions, but the goal is to have a single JPEG2000 image with different resolution levels in the same file.

Alternatively, is there a way to use the rsetwrite function to create a muti-resolution JPEG2000 image instead of an HDF5 file ?

Hi Sid,

I chatted briefly with my co-worker and jpeg2000 wiz Ashish about this. Just to make sure I understand your problem, it sounds like you have a JPEG2000 file with only a single wavelet decomposition level, meaning you only have 1 resolution of data in your file, and you would to convert it to a jpeg2000 file with several resolution levels.

When writing jpeg2000 files, blockproc, by default, writes multi-resolution files. You cannot specify the number of reduction levels in your output image, but blockproc will select that based on some internal criteria.

I think you can just take your original image, and “run it through” blockproc into a new jpeg2000 image, and that resulting image will be multi-resolution.

Try this:

blockproc(‘YOURFILE.jp2′,[1024 1024],…
@(bs) bs.data,’Destination’,'NEWFILE.jp2′);

You can potentially use a larger block size if you want. The larger the block size the faster this operation will take (generally).

See if that works and let us know. Thanks!

-brendan

Brendan -
Thanks for the clarification. I used imfinfo to check one of the JPEG2000 images I had created using blockproc and it does indeed have multiple decomposition levels (8 in this particular case).

I was expecting something different, but I realize now that I did not completely understand how the JPEG2000 format works. I was expecting to see multiple “pages” like one might see in a multi-page tiff, with each “page” having a different decomposition of the original image. I realize now that I already have the image in the format I need along with the ability to read different reduction levels using imread.

By the way, the documentation for blockproc seems to imply that only the TIFF format is supported when using the ‘Destination’ parameter. I did not realize this when I specified “jp2″ as the ‘Destination’ format and so far it’s worked as expected. I am using R2011b.

Thanks again !

great post! i would like to know can we process variable-size overlapping blocks using blockproc. More specifically i would like to can we select for example 16×16 blocks after every 8 pixels horizontally and vertically.

Thank you Steve for your reply. Let me explain query. Suppose i have an image of 64X128 pixels. I want to know that can blockproc be used for processing overlapping blocks. More specifically, if i want to process/divide the image into 16X16 blocks overlapped in a scale of 0.5 i.e. each block overlaps with other by 8 pixels horizontally or vertically. we might say each block selected after every 8 pixels horizontally or vertically. Can this be done using blockproc?

Steve- thanks for the reply. If its possible can you show me how to do it. Does this speedup the computation process than that of two nested for loop.

I’m a little new to the blockproc function, and I was hoping you could answer a question I have about the function handle that you pass to blockproc. I want to be able to use blockproc to cross-correlate (e.g., via normxcorr2) corresponding blocks from two different images. Is this possible with blockproc, and if so, how would I define the input argument for myFun?

These postings are the author's and don't necessarily represent the opinions of MathWorks.