Steve on Image Processing

July 30th, 2008

Reading Massively Multipage TIFFs: An Update

Last fall I wrote about a complaint some MATLAB users had about reading certain kinds of TIFF files. Specifically, some users have TIFF files that each contain tens of thousands of images—or more! One user sent us a single TIFF file containing almost 130,000 individual images. My tongue-in-cheek name for these files is "Massively Multipage TIFF" (MMT).

The problem is that the syntax for reading the k-th image from a TIFF file:

  >> I = imread(filename, k);

takes significantly longer for images at the end of the file. As a result, reading all of the images in such a file takes way too long. I discussed the technical issues in the post "How many images can fit in a TIFF file" last September.
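To see the effect for yourself, a minimal sketch along these lines times each indexed read. The temporary file, page count, and page size are assumptions chosen for illustration, and a file this small will probably not show much slowdown; the problem only becomes obvious with thousands of pages, when the per-read cost grows with k and the total time for N pages grows roughly like N squared.

```matlab
% Build a small multipage TIFF to experiment with (hypothetical test data;
% a real MMT would hold tens of thousands of pages).
filename = [tempname '.tif'];
numPages = 200;
for k = 1:numPages
    page = uint16(k) * ones(64, 64, 'uint16');
    if k == 1
        imwrite(page, filename);
    else
        imwrite(page, filename, 'WriteMode', 'append');
    end
end

% Time the read of each page using the index syntax shown above.
readTime = zeros(1, numPages);
for k = 1:numPages
    tic;
    I = imread(filename, k);
    readTime(k) = toc;
end

% Compare the cost of early pages with the cost of late pages.
fprintf('mean time, first 10 pages: %g s\n', mean(readTime(1:10)));
fprintf('mean time, last 10 pages:  %g s\n', mean(readTime(end-9:end)));
delete(filename);
```

If the last pages are consistently slower than the first ones and the gap widens as numPages increases, you are seeing the behavior described here.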

We now have updated files you can use to improve import performance on these TIFF files. The necessary files and instructions can be downloaded here. The patch is only for the most recent MATLAB release, R2008a. Please do not try to apply the patch to earlier releases.

[Update - August 11, 2008 - The download link above now points to an official MathWorks technical support solution page.]

With the updated files, images at the end of a TIFF file can be read in the same amount of time as images at the beginning of the file. Note that a different syntax is necessary, so please read the instructions in the download carefully.

You should be aware that these updates will NOT be included in the upcoming R2008b release. These changes were implemented very recently, and the relevant development deadline for R2008b changes was a couple of months ago.

Also, we don't have any updates available yet for improving the performance of creating such TIFF files using imwrite.

17 Responses to “Reading Massively Multipage TIFFs: An Update”

  1. Jason Merrill replied on :

    I think aviread suffers the same quadratic time problem for long movies. Are there any plans to address that problem similarly?

  2. Dave Tarkowski replied on :

    Jason,

    AVIREAD generally does not suffer from issues reading most movies on Windows. There are some files that will cause an increase in the time to read frames at the end of files, generally those without any key frames. AVIREAD on non-Windows platforms is more likely to have this problem.

    I would suggest using MMREADER, the replacement to AVIREAD. Note that MMREADER is not currently available for Linux.

  3. Roy replied on :

    Thanks! Nice to see MathWorks responding to its user base. Yet another reason (besides the OOP) to upgrade to R2008a.

  4. Steve replied on :

    Roy—I love using the new object-oriented programming features in R2008a, and I hope many of our users will, too.

  5. Waqas replied on :

    As a new user, I want to ask what the maximum file size is that IMREAD can read. I downloaded a 94 MB TIFF map, but I am not able to import the file into MATLAB.

    Regards
    Waqas

  6. Giles Kingsley replied on :

    Steve,

    You did not reply to my last query, so I will be pleasantly surprised if you can answer this. I have mosaiced several MrSID images into a TIFF, and the resulting image is giving me a memory error. Do I need a badder machine or can I compress the data somehow? -Giles

  7. Giles Kingsley replied on :

    I suppose this has to do with the previous question: the image size is 1.5 GB.

  8. Steve replied on :

    Giles—I do not see a previous query from you on this post. It’s possible I may have deleted your comment; it is my policy to delete comments that are not relevant to the posted topic. Also, I make no promises about when I will reply to relevant comments. Often I answer simple questions immediately, but sometimes it takes a week or so for me to catch up on comments. If you are looking for a reliably quick response, you are better off contacting technical support.

    Your image size is quite large, and if you are running on a 32-bit OS you will likely run out of virtual address space (an OS limitation), even if you do have enough RAM.

    See this product support note.

  9. Steve replied on :

    Waqas—It depends on how much memory you have on your computer. You should be able to load a 94 MB TIFF file into MATLAB on a modern computer, unless perhaps the file is highly compressed.

  10. Siddharth Samsi replied on :

    This is a little off topic, but it relates to large TIFFs:
    I was pleasantly surprised with the new functionality in the imshow function in MATLAB R2008a, namely, the ability to auto-reduce/resize and display large TIFFs. (I have tried this with success on a 500 MB TIFF image on a 32-bit machine with only 1.5 GB RAM)
    Are there any plans to incorporate this type of functionality for TIFF files into the imresize function? While the imread function along with the ‘PixelRegion’ parameter can be used to resize the TIFF and create thumbnails for future use, it would be convenient to simply use the imresize function.
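For readers who have not used the 'PixelRegion' approach mentioned in this comment, here is a rough sketch of it. The test image and the reduction factor are assumptions made up for the example. Note that this simply keeps every step-th row and column without any filtering, so it can alias in a way that imresize would not.

```matlab
% Hypothetical test image standing in for a large TIFF.
filename = [tempname '.tif'];
imwrite(uint8(mod(magic(512), 256)), filename);

% Ask imfinfo for the image dimensions so the full extent can be requested.
info = imfinfo(filename);
step = 8;  % keep every 8th row and column (assumed reduction factor)
rows = [1 step info(1).Height];   % [START INCREMENT STOP]
cols = [1 step info(1).Width];

% Read only the subsampled pixels instead of loading the whole image.
thumb = imread(filename, 'PixelRegion', {rows, cols});
disp(size(thumb))
delete(filename);
```

The point of doing the subsampling inside imread is that the full-resolution array never has to exist in MATLAB memory, which is what matters on a 32-bit machine.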

  11. Steve replied on :

    Siddharth—Thanks for the suggestion. We are currently looking at a variety of workflows involving large TIFFs.

  12. Siddharth replied on :

    Great! I am looking forward to the new functionality.

  13. Vincent replied on :

    Steve- I just had a quick run at the new imread.m patch. It’s much faster than the previous version but not quite as fast as our current solution. Our current solution is to read the TIFF stack using the ImageJ Java library and cast the result to a MATLAB array (which is itself very inefficient). Here are the results of my informal benchmark.

    Hardware: 2x Quad Core Xeon 2.3 GHz, 32 GB RAM
    Software: Windows 2003 Server x64, R2008a, ImageJ 1.38x
    Benchmark: Reading a 900 MB TIFF stack split in four files, 3840 slices total each with 239×256 resolution. All reads done from the cache so the benchmark does not depend on hard disk bandwidth.

    Results: ImageJ alone =< 5 s (or 180 MB/s).

    ImageJ within MATLAB = 8.8 s (or 100 MB/s).

    New imread (see code below) = 11.3 s (or 80 MB/s).

    I think that the difference in performance can be explained by differences in multithreading support. imread is essentially single-threaded and ImageJ has native multithreading support.

    Note that this is a toy data set. Our typical data sets are much larger.

    Here is the code:

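    info = imfinfo(filename);

    % Preallocate one slice per image in the file (assumes uint16 data).
    stack = zeros(info(1).Height, info(1).Width, numel(info), 'uint16');

    for k = 1:numel(info)
        % Passing the imfinfo output lets imread locate image k directly.
        stack(:,:,k) = imread(filename, 'Index', k, 'Info', info);
    end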

  14. Vincent replied on :

    Oops, the numbers were wrong. The data set was 450 MB, so the corrected numbers are:

    Results: ImageJ alone =< 5 s (or 90 MB/s).

    ImageJ within MATLAB = 8.8 s (or 50 MB/s).

    New imread (see code in my previous comment) = 11.3 s (or 40 MB/s).

  15. Steve replied on :

    Vincent—Thanks for giving it a try and reporting back. I’m a bit skeptical that multithreading would explain the performance difference, since I/O-bound ops aren’t likely to benefit from multiple threads. One issue we have to deal with in MATLAB is that array elements are stored in a different order than the way they are stored in every image file format, including TIFF. This results in cache misses that slow things down. An import operation in ImageJ alone wouldn’t have this problem.
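The storage-order point can be illustrated with a toy example. This is only a sketch of the reordering involved, not how imread is implemented: the six-pixel "file" and the variable names are made up for illustration. Image files typically deliver pixels one scanline (row) at a time, while MATLAB stores arrays column by column.

```matlab
% A 2-by-3 image whose pixels arrive in scanline (row-by-row) order,
% as they would from a typical image file.
height = 2;
width = 3;
scanline = uint8([11 12 13 21 22 23]);

% reshape fills columns first, so build a width-by-height array and
% transpose it to recover the height-by-width image.
I = reshape(scanline, width, height).';
disp(I)

% The same pixels in MATLAB's own (column-major) memory order.
disp(I(:).')
```

The transpose is where the memory access pattern becomes non-sequential, and for large images that kind of strided access is what leads to the cache misses mentioned above.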

  16. Vincent replied on :

    My only data point on multithreading is that the Performance tab of the Task Manager shows increased CPU usage on multiple cores in ImageJ but not in MATLAB.

  17. Steve replied on :

    Vincent—OK, thanks for the information.


Steve Eddins is a software development manager in the MATLAB and image processing areas at MathWorks. Steve coauthored Digital Image Processing Using MATLAB. He writes here about image processing concepts, algorithm implementations, and MATLAB.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.