Loren on the Art of MATLAB

April 17th, 2008

Processing a Set of Files - Repost

This is a repost of an article I wrote early on in this blog's history. Since there continue to be frequent questions on the MATLAB newsgroup regarding processing a set of files, I thought it would be worthwhile recapping this post.

Setup

I should start with a clean workspace and with no WAV-files in my blog publishing directory.

clear all
delete *.wav

Problem Statement

Suppose I want to convert sounds stored in MATLAB MAT-files to files saved in WAV format for Windows. Without explicitly hardcoding the filenames, here's a way to proceed.

Collect the MAT-files Containing Sounds

matfiles = dir(fullfile(matlabroot,'toolbox','matlab','audiovideo','*.mat'))
matfiles = 
6x1 struct array with fields:
    name
    date
    bytes
    isdir
    datenum
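
The call above returns only file names, and the later load calls work because this folder is on the MATLAB path. For a folder that is not on the path, one possible variation is to build full file names up front. This is a minimal sketch; the variable names folder and fullnames are illustrative and not part of the original post.

```matlab
% Folder to scan; here the same one used in the post.
folder   = fullfile(matlabroot,'toolbox','matlab','audiovideo');
matfiles = dir(fullfile(folder,'*.mat'));

% Drop any directories that happen to match the pattern.
matfiles = matfiles(~[matfiles.isdir]);

% Build full paths so later calls do not depend on the MATLAB path.
fullnames = cell(numel(matfiles),1);
for k = 1:numel(matfiles)
    fullnames{k} = fullfile(folder, matfiles(k).name);
end
```

Each entry of fullnames can then be passed directly to load.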

Check out the Files

We can see from the length of matfiles that I have 6 MAT-files.

load(matfiles(1).name)
whos
  Name              Size             Bytes  Class     Attributes

  Fs                1x1                  8  double              
  matfiles          6x1               2576  struct              
  y             13129x1             105032  double              

Loading the data with an output argument, as in data = load(filename), places the MAT-file contents into a structure from which I can extract the information I need and write it back out. The sound files that MATLAB ships with store the data in y and the sampling frequency in Fs. Let's look at the first signal.
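
If it is not certain that every MAT-file in a folder holds y and Fs, the contents can be checked before use. The following is a sketch under that assumption; fname and hasSound are names introduced here for illustration.

```matlab
folder   = fullfile(matlabroot,'toolbox','matlab','audiovideo');
matfiles = dir(fullfile(folder,'*.mat'));
fname    = fullfile(folder, matfiles(1).name);   % assumes at least one match

% List the variables stored in the file without loading them.
info = whos('-file', fname);
disp({info.name})

% Load into a structure and confirm the expected fields are present.
data     = load(fname);
hasSound = isfield(data,'y') && isfield(data,'Fs');
```

A file for which hasSound is false can simply be skipped in the processing loop.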

N = length(y);
plot((0:N-1)/Fs,y), title(matfiles(1).name(1:end-4))

Loop over the Files

for ind = 2:length(matfiles)
    data = load(matfiles(ind).name);
    wavwrite(data.y,data.Fs,matfiles(ind).name(1:end-4));
end
Warning: Data clipped during write to file:gong
Warning: Data clipped during write to file:splat
Warning: Data clipped during write to file:train
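
The clipping warnings appear because some of the signals have samples outside the range that the WAV format can represent for double data, roughly -1 to 1. One way to avoid them, at the cost of changing the amplitude, is to rescale before writing. The sketch below is not the approach used in this post; it assumes a later MATLAB release in which audiowrite is available (it takes the file name first, with its extension), and it assumes that rescaling the signal is acceptable.

```matlab
folder   = fullfile(matlabroot,'toolbox','matlab','audiovideo');
matfiles = dir(fullfile(folder,'*.mat'));

for ind = 1:length(matfiles)
    data = load(fullfile(folder, matfiles(ind).name));
    if ~(isfield(data,'y') && isfield(data,'Fs'))
        continue   % not a sound file in the expected layout
    end
    [~, base] = fileparts(matfiles(ind).name);   % name without extension
    y    = data.y;
    peak = max(abs(y(:)));
    if peak >= 1
        % Assumption: amplitude may be changed. Scale the largest
        % sample to just inside the representable 16-bit range.
        y = y / peak * (1 - 2^-15);
    end
    audiowrite([base '.wav'], y, data.Fs);
end
```

Using fileparts instead of name(1:end-4) also avoids relying on the extension being exactly four characters long.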

What about Those Files?

dir *.wav
gong.wav      laughter.wav  train.wav     
handel.wav    splat.wav     

Read a File Back for Verification

I'll double-check the last file I just read in and wrote out. ind is still set despite no longer being in the for loop. Let's check both the frequencies and the signals themselves.

[ywav, Fswav] = wavread(matfiles(ind).name(1:end-4));
eqFreqs = isequal(Fswav, data.Fs)
datadiff = norm(ywav-data.y)
eqFreqs =
     1
datadiff =
    0.0010

The data stored in the WAV-file is NOT exactly the same as that stored in the MAT-file. Reading the help for wavwrite gives some insight; the data in the WAV-file, by default, is stored as 16-bit data vs. MATLAB's standard 64-bit double.
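
To get a feel for the size of this effect without touching any files, 16-bit rounding can be simulated directly. This is a rough sketch that assumes a signal inside the range -1 to 1 and a uniform level spacing of 2^-15. The test tone and the names step, yq, maxErr and normErr are made up for illustration, and the exact scaling used by wavwrite may differ in detail.

```matlab
% Simulate 16-bit quantization of a double-precision signal.
Fs = 8192;                     % assumed sampling frequency, Hz
t  = (0:Fs-1)'/Fs;             % one second of samples
y  = 0.5*sin(2*pi*440*t);      % test tone, well inside [-1, 1)

step = 2^-15;                  % spacing between adjacent 16-bit levels
yq   = round(y/step)*step;     % nearest representable value

maxErr  = max(abs(yq - y));    % per-sample error, at most step/2
normErr = norm(yq - y);        % same kind of measure as datadiff above
bound   = sqrt(numel(y))*step/2;   % upper bound on normErr
```

Since each sample is off by at most step/2, the norm of the difference over N samples is at most sqrt(N)*step/2, so a small nonzero datadiff is expected rather than a sign of a bug.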

Thoughts?

Does it confuse people here that I don't worry about vectorization? Any other thoughts on this topic?


Published with MATLAB® 7.6

7 Responses to “Processing a Set of Files - Repost”

  1. Roy replied on :

    Hi Loren,

    How about this ‘one liner’ solution instead of the loop?
    f=@(name) wavwrite(getfield(load(name),'y'),getfield(load(name),'Fs'),name(1:end-4))
    cellfun(f,{matfiles(:).name});

    Where matfiles is the same as in your post.

    In this case, since you need several variables from the mat file (e.g. I have to use load twice per file) it's probably less efficient, but I use a similar one liner to rescale lots of images in one go:

    imgfiles=dir('*.png');
    cellfun(@(name) imwrite(imresize(imread(name),0.5),cat(2,'sml_',name)),{imgfiles(:).name});

    Don’t know if it’s really better, just following the rule of thumb that as far as matlab is concerned usually less code is better…

  2. Loren replied on :

    Roy-

    You can certainly use your code. It definitely is jam-packed with many MATLAB features!

    It’s a little harder to read, in my opinion, but it does get the job done. I’m not sure you gain any speed either, even if you could trim out the second file loading.

    In this case, I think better/worse is really an individual preference.

    –Loren

  3. Markus replied on :

    Aaaaargh!

    Roy, do you think you will be able to decipher these cryptic lines after two months? Or one of your colleagues?

    I read a good piece of advice for programming somewhere (I don’t remember the exact words):

    Always code as if the programmer having to maintain your code were a raving lunatic who knows where you live.

    Yours
    Markus

  4. User replied on :

    yes, that’s all very nice, but matlab file processing is still very immature and not qa’ed well enough. just for example, here are some critical bugs i found in the ‘dir’ command, that have not been solved in the last 3 matlab releases:

    1. the ‘dir’ command returns filesize=0 for files which contain unicode characters (such as hebrew or chinese).

    2. in some timezones during DST, the datenum returned by ‘dir’ command is offset by 1 *extra* hour, in addition to the DST offset. in other words, it is (wrongly) corrected twice for DST. very annoying.

  5. Loren replied on :

    User-

    Thanks for the information. Could I convince you to report these issues to technical support: http://www.mathworks.com/support/service_requests/contact_support.do
    That’s the best way to get the information back to us so it can be handled appropriately. Thanks.

    –Loren

  6. John McElroy replied on :

    Hey, awesome! I’m going to start digging through these blog entries. I’m glad I found them!!

  7. the_milad replied on :

    “Always code as if the programmer having to maintain your code were a raving lunatic who knows where you live.”
    So true!
    Currently I’m the raving lunatic and the programmer before ran off to Egypt.

    Thank you loren, this is exactly what I was looking for.


Loren Shure works on design of the MATLAB language at The MathWorks. She writes here about once a week on MATLAB programming and related topics.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.