Under the hood of imread
I'm going to play a small trick on you today. Try reading in this JPEG file using imread:
url = 'https://blogs.mathworks.com/images/steve/2014/peppers.jpg';
rgb = imread(url);
imshow(rgb)
So what's the trick? Well, look more closely at this file using imfinfo:
info = imfinfo(url)
info = 
    Filename: 'https://blogs.mathworks.com/images/steve/2014/...'
    FileModDate: '11-Jul-2014 14:46:33'
    FileSize: 287677
    Format: 'png'
    FormatVersion: []
    Width: 512
    Height: 384
    BitDepth: 24
    ColorType: 'truecolor'
    FormatSignature: [137 80 78 71 13 10 26 10]
    Colormap: []
    Histogram: []
    InterlaceType: 'none'
    Transparency: 'none'
    SimpleTransparencyData: []
    BackgroundColor: []
    RenderingIntent: []
    Chromaticities: []
    Gamma: []
    XResolution: []
    YResolution: []
    ResolutionUnit: []
    XOffset: []
    YOffset: []
    OffsetUnit: []
    SignificantBits: []
    ImageModTime: '16 Jul 2002 16:46:41 +0000'
    Title: []
    Author: []
    Description: 'Zesty peppers'
    Copyright: 'Copyright The MathWorks, Inc.'
    CreationTime: []
    Software: []
    Disclaimer: []
    Warning: []
    Source: []
    Comment: []
    OtherText: []
See it yet? No?
Look at the Format field:
info.Format
ans = png
The function imfinfo is claiming that this JPEG file is really a PNG file, which is a completely different image file format!
So what's going on here? Is this a JPEG file or not?
This trick question is really just an excuse to peek under the hood of imread to see how an interesting piece of it works. (Well, it's interesting to me, at least.)
Before opening the hood, though, let's try reading one more file. And notice that this filename has an extension that has nothing to do with any particular image format.
url2 = 'https://blogs.mathworks.com/images/steve/2014/peppers.fruit_study_2014_Jul_11';
rgb2 = imread(url2);
imshow(rgb2)
OK, so imread can successfully read this image file even without an extension indicating its format.
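If you'd like to repeat both experiments without a network connection, here is a minimal sketch. It assumes that the third argument to imwrite sets the file format regardless of the extension in the filename. The synthetic image, the variable names, and the temporary filenames are made up for illustration.

```matlab
% Build a small synthetic RGB image (hypothetical test data, values 0..240).
img = uint8(cat(3, repmat(0:15, 16, 1), repmat((0:15)', 1, 16), zeros(16)) * 16);

% Write PNG data into a file whose name claims to be a JPEG.
misnamed = [tempname '.jpg'];
imwrite(img, misnamed, 'png');

% Write PNG data into a file with a meaningless extension.
oddName = [tempname '.fruit_study'];
imwrite(img, oddName, 'png');

% Ask imfinfo what it thinks each file is.
info1 = imfinfo(misnamed);
info2 = imfinfo(oddName);
disp(info1.Format)
disp(info2.Format)

% PNG is lossless, so imread should hand back the original pixels.
roundTrip1 = imread(misnamed);
roundTrip2 = imread(oddName);
disp(isequal(roundTrip1, img) && isequal(roundTrip2, img))

% Clean up the temporary files.
delete(misnamed);
delete(oddName);
```

If the format detection behaves as described in this post, both Format fields should read png.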
If you're curious, a lot of the imread code that makes all this work is available for you to look at in your installation of MATLAB. (If, on the other hand, you're not curious, then this would be a good time to go over and read Cleve's blog instead.) For example, you can view the source code for imread in the MATLAB Editor by typing edit imread. (Please don't modify the code, though!)
Here's a partial fragment of code near the top:
if (isempty(fmt_s))
    % The format was not specified explicitly.

    ... snip ...

    % Try to determine the file type.
    [format, fmt_s] = imftype(filename);
Hmm, what's that function imftype?
which imftype
'imftype' not found.
It doesn't appear to exist!
It does exist, but it happens to be a private function. The function which will find it if you tell which to look a little harder.
which -all imftype
/Applications/MATLAB_R2014a.app/toolbox/matlab/imagesci/private/imftype.m % Private to imagesci
Even though you can't directly call this function (that's what private means here), you can still look at it in the MATLAB Editor by typing edit private/imftype.
Here's some code from near the beginning of imftype.
idx = find(filename == '.');
if (~isempty(idx))
    extension = lower(filename(idx(end)+1:end));
else
    extension = '';
end

% Try to get useful imformation from the extension.
if (~isempty(extension))

    % Look up the extension in the file format registry.
    fmt_s = imformats(extension);

    if (~isempty(fmt_s))

        if (~isempty(fmt_s.isa))

            % Call the ISA function for this format.
            tf = feval(fmt_s.isa, filename);

            if (tf)

                % The file is of that format.  Return the ext field.
                format = fmt_s.ext{1};
                return;

            end
        end
    end
end
In English: If the filename has an extension on it, use the imformats function to get a function that can test to see whether the file really has that format.
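Here is a rough sketch of that extension-first check, run from outside imftype. The temporary file and the variable names are mine, not from the MATLAB source. It relies on the assumption that the function handle stored in the isa field can be called with feval even though the underlying function is private.

```matlab
% Create a PNG file that carries a .jpg extension (hypothetical test file).
fname = [tempname '.jpg'];
imwrite(uint8(zeros(8, 8, 3)), fname, 'png');

% Pull out the extension the same way imftype does.
idx = find(fname == '.');
extension = lower(fname(idx(end)+1:end));

% Look up the registry entry for that extension.
fmt = imformats(extension);

% Ask the registered ISA function whether the file really has that format.
looksLikeJpeg = feval(fmt.isa, fname);
disp(looksLikeJpeg)

% Clean up the temporary file.
delete(fname);
```

Because the file contains PNG data, the JPEG test should come back false, which is what sends imftype on to its fallback path.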
So what's that new function imformats in the middle there? Well, you can call this one directly. Try it.
s = imformats
s = 
1x19 struct array with fields:
    ext
    isa
    info
    read
    write
    alpha
    description
That output is not very readable. If we were designing this today, we'd probably make imformats return a table. Fortunately we've got an easy way to convert a struct array into a table!
t = struct2table(s)
t = 
    ext           isa        info           read         write        alpha
    __________    _______    ___________    _________    _________    _____
    {1x1 cell}    @isbmp     @imbmpinfo     @readbmp     @writebmp    0
    {1x1 cell}    @iscur     @imcurinfo     @readcur     ''           1
    {1x2 cell}    @isfits    @imfitsinfo    @readfits    ''           0
    {1x1 cell}    @isgif     @imgifinfo     @readgif     @writegif    0
    {1x1 cell}    @ishdf     @imhdfinfo     @readhdf     @writehdf    0
    {1x1 cell}    @isico     @imicoinfo     @readico     ''           1
    {1x2 cell}    @isjp2     @imjp2info     @readjp2     @writej2c    0
    {1x1 cell}    @isjp2     @imjp2info     @readjp2     @writejp2    0
    {1x2 cell}    @isjp2     @imjp2info     @readjp2     ''           0
    {1x2 cell}    @isjpg     @imjpginfo     @readjpg     @writejpg    0
    {1x1 cell}    @ispbm     @impnminfo     @readpnm     @writepnm    0
    {1x1 cell}    @ispcx     @impcxinfo     @readpcx     @writepcx    0
    {1x1 cell}    @ispgm     @impnminfo     @readpnm     @writepnm    0
    {1x1 cell}    @ispng     @impnginfo     @readpng     @writepng    1
    {1x1 cell}    @ispnm     @impnminfo     @readpnm     @writepnm    0
    {1x1 cell}    @isppm     @impnminfo     @readpnm     @writepnm    0
    {1x1 cell}    @isras     @imrasinfo     @readras     @writeras    1
    {1x2 cell}    @istif     @imtifinfo     @readtif     @writetif    0
    {1x1 cell}    @isxwd     @imxwdinfo     @readxwd     @writexwd    0

    description
    __________________________________
    'Windows Bitmap'
    'Windows Cursor resources'
    'Flexible Image Transport System'
    'Graphics Interchange Format'
    'Hierarchical Data Format'
    'Windows Icon resources'
    'JPEG 2000 (raw codestream)'
    'JPEG 2000 (Part 1)'
    'JPEG 2000 (Part 2)'
    'Joint Photographic Experts Group'
    'Portable Bitmap'
    'Windows Paintbrush'
    'Portable Graymap'
    'Portable Network Graphics'
    'Portable Any Map'
    'Portable Pixmap'
    'Sun Raster'
    'Tagged Image File Format'
    'X Window Dump'
Now we've gotten to some interesting stuff! This table represents the guts of how imread, imfinfo, and imwrite know how to deal with the many different image file formats they support.
If you pass an extension to imformats, it looks through file formats it knows about to see if it matches a standard one.
imformats('jpg')
ans = 
    ext: {'jpg' 'jpeg'}
    isa: @isjpg
    info: @imjpginfo
    read: @readjpg
    write: @writejpg
    alpha: 0
    description: 'Joint Photographic Experts Group'
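For contrast, here is a quick sketch of what the lookup returns for a couple of other inputs. The extension 'nosuchformat' is made up for illustration, and the variable names are mine.

```matlab
% Either registered spelling of the JPEG extension leads to the same entry.
a = imformats('jpg');
b = imformats('jpeg');
disp(isequal(a.ext, b.ext))

% An extension that is not in the registry yields an empty struct.
c = imformats('nosuchformat');   % made-up extension, for illustration only
disp(isempty(c))
```

That empty result is what imftype tests with isempty before it decides whether an extension-specific check is possible at all.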
Let's go back to the original file peppers.jpg and consider what happens in the code we've seen so far.
1. We did not specify the format explicitly (with a 2nd argument) when we called imread, so imread called imftype to determine the format type.
2. The function imftype found an extension ('.jpg') at the end of the filename, so it asked the imformats function about the extension, and imformats returned a set of function handles useful for doing things with JPEG files. One of the function handles, @isjpg, tests to see whether a file is a JPEG file or not.
To be completely truthful, @isjpg just does a quick check based only on the first few bytes of the file. Look at the code by typing edit private/isjpg. Here are the key lines.
fid = fopen(filename, 'r', 'ieee-le');
assert(fid ~= -1, message('MATLAB:imagesci:validate:fileOpen', filename));
sig = fread(fid, 2, 'uint8');
fclose(fid);
tf = isequal(sig, [255; 216]);
OK, I have now taught you enough so that you can thoroughly confuse imread if you really want to. But don't you already have enough hobbies?
3. In this case, the file wasn't actually a JPEG file, so the function handle @isjpg returned 0 (false).
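To see why step 3 comes out false, you can compare the first bytes of a PNG file with the two bytes that isjpg expects. The sketch below writes its own small test file. The PNG bytes it compares against are the FormatSignature values that imfinfo printed earlier, and the variable names are hypothetical.

```matlab
% Write PNG data into a file with a .jpg name (hypothetical test file).
fname = [tempname '.jpg'];
imwrite(uint8(zeros(8, 8, 3)), fname, 'png');

% Read the first eight bytes of the file as a row vector.
fid = fopen(fname, 'r');
assert(fid ~= -1, 'Could not open %s', fname);
sig = fread(fid, 8, 'uint8')';
fclose(fid);
delete(fname);

% The two bytes isjpg looks for, and the PNG signature reported by imfinfo.
jpegStart = [255 216];
pngSignature = [137 80 78 71 13 10 26 10];

isJpegLike = isequal(sig(1:2), jpegStart);
isPngLike = isequal(sig, pngSignature);
disp(sig)
```

The file starts with 137 80 rather than 255 216, so a two-byte JPEG test rejects it, while the eight-byte PNG signature matches.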
That brings us to the rest of the excitement in imftype.
% Get all formats from the registry.
fmt_s = imformats;

% Look through each of the possible formats.
for p = 1:length(fmt_s)
    % Call each ISA function until the format is found.
    if (~isempty(fmt_s(p).isa))
        tf = feval(fmt_s(p).isa, filename);
        if (tf)
            % The file is of that format.  Return the ext field.
            format = fmt_s(p).ext{1};
            fmt_s = fmt_s(p);
            return
        end
    else
        warning(message('MATLAB:imagesci:imftype:missingIsaFunction'));
    end
end
In English: For every image file format we know about, run the corresponding isa function handle on the file. If one of the isa functions returns true, then return the corresponding set of information from imformats.
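You can run the same kind of search yourself from the command line. This is a simplified sketch of the loop, not the actual imftype code: it skips the warning for missing isa functions, and the test file and variable names are invented for the example.

```matlab
% Write PNG data into a file with an unrecognized extension (hypothetical).
fname = [tempname '.fruit_study'];
imwrite(uint8(repmat(0:15, 16, 1) * 16), fname, 'png');

% Try each registered ISA function in order until one accepts the file.
formats = imformats;
detected = '';
for p = 1:numel(formats)
    if ~isempty(formats(p).isa) && feval(formats(p).isa, fname)
        detected = formats(p).ext{1};
        break
    end
end
disp(detected)

% Clean up the temporary file.
delete(fname);
```

The loop stops at the first format whose test passes, so the order of the entries in the registry matters when a file could satisfy more than one quick check.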
Back to the story for our misnamed image file peppers.jpg. The @isjpg function handle returned false for it. So imftype then tried the isa function handles for every image file format. One of them, @ispng, returned 1 (true). That information was passed back up to imread, which then read the file successfully as a PNG, which was the file's true format type.
Finally, here's what happened for the image file peppers.fruit_study_2014_Jul_11. When imftype passed the lowercased extension 'fruit_study_2014_jul_11' to imformats, no such image format extension was found, so imformats returned empty. That caused imftype to fall through to the code that simply tries every image format it knows about, which again succeeded when it got to PNG.
Phew! That's the story of the effort imread makes to read your images in correctly.
For the three of you that are still reading along, I'll send a t-shirt to the first one to post a convincingly complete explanation of this line of code from above:
extension = lower(filename(idx(end)+1:end));