Loren on the Art of MATLAB

August 15th, 2007

Iterating over Non-Numeric Values

Recently a colleague was hoping to write some code to iterate over fields in a structure. There are at least three different ways I can think of, some more straightforward than others. Here's what I've come up with.

Create Data for an Example

I'm only going to consider scalar structures for this post. The contents of each field in this case will also be scalars, for easy illustration. I'd like to create output that increases the value of each entry by 1.

s.a = 1;
s.b = 2;
s.c = 3;

Let me get the field names so I can have the code be reasonably generic.

fn = fieldnames(s)
fn = 
    'a'
    'b'
    'c'

First Method - Loop Over Number of Field Names

In the first case, I find out how many fields are in the struct, and then loop over each of them, using dynamic field referencing to access the value in each field.

out1 = zeros(1, length(fn));
for n = 1:length(fn)
    out1(n) = s.(fn{n}) + 1;
end
out1
out1 =
     2     3     4
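
Dynamic field referencing also works on the left-hand side of an assignment. As a rough sketch (the output struct t is a hypothetical name, not something used elsewhere in this post), the same loop could build a struct of results instead of a numeric vector:

```matlab
% Sketch: dynamic field names work for assignment as well as for reading.
s = struct('a', 1, 'b', 2, 'c', 3);
fn = fieldnames(s);

t = struct();                      % hypothetical output struct, starts with no fields
for n = 1:numel(fn)
    name = fn{n};                  % character array such as 'a'
    t.(name) = s.(name) + 1;       % read s.a, write t.a, and so on
end
t
```

This keeps the field names attached to the results, which may matter when the fields do not all mean the same thing.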

Second Method - Loop Over Field Names Themselves

In the second case, I am going to bypass the numeric indexing and loop through the fields themselves. Since the for loop in MATLAB processes the indices a column at a time, I have to turn my column of field names into a row vector. It happens to be a cell array, so I also have to convert the field names to character arrays to use dynamic field referencing. I won't preallocate the output out2 this time, but allow this small array to grow, so I don't need to add a counter to the loop.

out2 = [];
for str = fn'
    out2(end+1) = s.(char(str)) + 1;
end
out2
out2 =
     2     3     4

Check that the results of the first two methods agree.

isequal(out1,out2)
ans =
     1
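
To see why the transpose matters, here is a small sketch under the same setup. The names out2b and count are made up for illustration. It uses fn(:).' so the loop gets a row whatever the orientation of fn, and str{1} as an alternative to char(str) for unwrapping the 1-by-1 cell:

```matlab
s = struct('a', 1, 'b', 2, 'c', 3);
fn = fieldnames(s);                 % 3-by-1 cell array

out2b = [];
for str = fn(:).'                   % force a 1-by-N row so the loop runs N times
    % str is a 1-by-1 cell here, not a character array
    out2b(end+1) = s.(str{1}) + 1;  % str{1} unwraps the cell, much like char(str)
end

% Without the transpose, fn has a single column, so the loop body runs once
% and str holds the whole 3-by-1 cell array.
count = 0;
for str = fn
    count = count + 1;
end
```

After this runs, out2b matches out2 and count is 1.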

Third Method - Use structfun

Instead of looping through fields at all, use structfun, which iterates over the fields of a scalar structure.

out3 = structfun(@(x)x+1,s);
out3 = out3.'
out3 =
     2     3     4

Since the results are returned as a column vector, I transpose them so I can compare the output to that of the other methods. Check that the results of the last two methods agree.

isequal(out2,out3)
ans =
     1
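
By default structfun expects the function to return a scalar for each field, which is why this example works as is. When the fields hold arrays of different sizes, one option is the 'UniformOutput' flag set to false, in which case the result is a struct with the same field names. A minimal sketch, with s2 and out4 as made-up names and data:

```matlab
% Hypothetical data: fields are vectors of different lengths.
s2 = struct('a', [1 2 3], 'b', [10 20]);

% With 'UniformOutput' false, structfun returns a struct, one field per input field.
out4 = structfun(@(x) x + 1, s2, 'UniformOutput', false);

out4.a    % the values of s2.a, each increased by 1
out4.b    % the values of s2.b, each increased by 1
```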

Which Way is Clearest?

I've shown you three ways to iterate over fields in a structure. Which method(s) do you currently use? Which one seems clearest to you? Let me know here.


Published with MATLAB® 7.4

22 Responses to “Iterating over Non-Numeric Values”

  1. Urs (us) Schwarz replied on :

    STRUCTFUN - only
    us

  2. Markus replied on :

    Hi Loren!

    Nice article! I have to check some of my code if I can simplify it using a cell as loop index.

    I personally don’t like using anonymous functions, as they are cryptic to read and painfully slow. Or did the latter change significantly in Matlab 2007a or b?

    And another comment: I would like you to encourage people to use longer, meaningful variable names, like “fieldNames” instead of “fn”, which may be the abbreviation for many other things, like “function number” or so.

    – Markus

  3. Loren replied on :

    Markus-

    Thanks for the comments. We are always attacking performance for MATLAB, and function calling and dispatching and related overhead are always in our sights. So you should see improvements over time. Are you seeing something puzzling?

    The reason I used fn here instead of fieldNames was so I didn’t confuse the variable with the function fieldnames. MATLAB certainly allows that, but it just seems easy to make a mistake if they are so similar. But I do hear you on making the names more expressive.

    –Loren

  4. Daniel Armyr replied on :

    Personally, whenever I did things like this, I always took the long way around and used setfield within a loop.

    The structfun looks like a nice one-liner, but I would say it is a bit cryptic compared to method 2. Is there a reason to believe the structfun method is faster or uses less memory than the second method?

  5. OkinawaDolphin replied on :

    I would use structfun. If an anonymous function is really too slow (in this case arrayfun and cellfun would be slower than for loops) or too cryptic, I write a separate function that is called by structfun/ arrayfun/ cellfun. A for loop is necessary if the index of an element or the sequence of the processed elements is important.

    By the way, components of structs usually have different meaning such as voltage and current or temperature and pressure. Are there many cases where the components of structs are treated in the same way?

  6. Jessee replied on :

    Okinawa, I agree. There aren’t many cases where the components of a struct are treated the same — this was a basic example. The need to loop through the fields of a struct arises more often when manipulating data structures, e.g., converting from a struct to an array, or transposing all the fields from row vectors to column vectors.

    I ran a quick comparison between the first and third methods but used a struct with 1000 fields. Using R2006a, the third method using an anonymous function with structfun appears to be roughly 9 to 10 percent faster than the for loop.

    I’ve pasted my code below (no guarantee it will format correctly).

    %% Create the N-field struct
    % Does anyone know of a more efficient way to do this?
    N = 10000;
    fields = eval(['{' sprintf('''f%u'',',1:N) '}']);
    data = num2cell(1:N);
    c = [fields;data];
    s = struct(c{:});

    %% First Method
    tic;
    fn = fieldnames(s);
    out1 = zeros(1,length(fn));
    for i = 1:length(fn)
        out1(i) = s.(fn{i}) + 1;
    end
    toc

    %% Third Method
    tic;
    out2 = structfun(@(x)x+1,s).';
    toc

  7. Dan K replied on :

    Hi Loren,
    Personally I find the contortions involved in the second method to be worrisome, particularly the fact that you have to transpose your cell array to use it for indexing (which was a surprise to me), as well as the need to convert back with char in order to use the dynamic field naming. Were I actually doing it, I would either go with method 1, on the premise that it is bomb-proof, or method 3 but with the added step of predefining an increment anonymous function, to make it more readable.

    Dan

  8. Loren replied on :

    Daniel-

    As with many things in MATLAB, the answer is “it depends”. structfun may be faster sometimes and not others vs. iterating over fields. It really depends on the details of what’s being calculated. Over time, both the JIT (which helps for loop performance) and structfun/arrayfun/cellfun continue to improve.

    I recommend you write code you can come back to and read in the future and only worry about the differences in speed if that is the true bottleneck in your code.

    –Loren

  9. lehalle replied on :

    %% Structured assoc lists

    %% First way: direct MATLAB structure
    % I do not think such kind of code is clear, because
    % performing operations on fields with different names is
    % not simple to understand. It
    % seems that’s like adding carrots and cabbages.

    s = struct('a',1,'b',2,'c',3)

    %% Explicit assoc list
    % I prefer to define such values in a structure like that:
    %
    % you can access the keys like this:
    %
    % {s.key}
    %
    % and the value like that
    %
    % [s.value]

    build_elem = @(k,v)struct('key',k,'value',v)

    s = [ build_elem('a',1), build_elem('b',2), build_elem('c',3)]

    %% Tools
    % The elementary tool needed is this one:

    get_value = @(s,k)[s(strmatch(k,{s.key},'exact')).value]

    get_value(s,'b')

    %% Simple operations

    [s.value]+1

    %% Regards
    % Charles

  10. Hussein N replied on :

    Thanks Loren for your insightful article on how to improve the performance in MATLAB.

    I am a green user of MATLAB and unfortunately used for…end looping. Now that I have to loop over a large set, where in each loop I solve an optimization problem, I can notice that it’s time consuming.

    I tried the “structfun” method and could not get it to work when the output of the called function (.m file, not an anonymous function!) is a structure by itself.

    Here is what I do:
    funhn=@opt1
    v=1:10;
    a=structfun(funhn(arg1,arg2,…,v),v,'UniformOutput',false)

    Here opt1 is an m file:
    function res=opt1(arg1,arg2,…,v)
    .
    .
    res=struct('f1',f1)

    I expected the vector a to be a vector array: a(i) is of type structure, and is a result of opt1(arg1,arg2,…,v(i)).

    Isn’t that the way it works?

    Thanks,
    Hussein

  11. Loren replied on :

    Hussein-

    structfun is meant to operate on the fields of an *input* structure. You might try arrayfun instead since your inputs are arrays. However, given the JIT in MATLAB, a for loop might be your fastest solution here, as long as you preallocate space for the output.

    –Loren

  12. Hussein N replied on :

    Thanks Loren for your input.
    What is JIT?

    Let me ask the following:
    1. Does vectorization help when you are looping over an optimization function (linear or conic)?
    2. If yes, what time-saving factor are we looking at?
    3. Is it possible that I call a function with input vectors (of the same size) and the returned output is a vector of structures, each element being a structure corresponding to the vectors’ elements?
    4. Is it possible that the two input vectors are of different size?

    Thanks,

    Hussein

  13. Loren replied on :

    Hussein-

    Your questions do not have simple answers and are simply too general to give you good answers. They depend very much on circumstances. Vectorization sometimes helps a lot, and sometimes not. The amount of savings depends on all the details. You can call a function and return a vector of structures. See the documentation. Input vectors don’t have to be the same size for all functions, but it depends what function(s) you are referring to.

    –Loren

  14. Andrew replied on :

    I am interested in the structfun and anonymous function solution, but my struct is a struct of arrays of mixed type as follows:
    s.a={'first', 'second'};
    s.b=[1.1 1.2];
    s.c=[30.1 33.2];
    I am looking to gather the output into a single cell array like so:
    {'first', 1.1, 30.1; 'second', 1.2, 33.2}
    I have tried structfun and arrayfun also without success so far.

  15. Loren replied on :

    Andrew-

    I think num2cell is what you need to spread your numeric array out into a cell. Try this:

    s.a={'first', 'second'};
    s.b=[1.1 1.2];
    s.c=[30.1 33.2];
    
    mycell= [s.a ; num2cell(s.b); num2cell(s.c)]'
    
    mycell =
        'first'     [1.1000]    [30.1000]
        'second'    [1.2000]    [33.2000]
    
  16. Andrew replied on :

    Thanks Loren for the reply, but I am looking for a solution that takes the structure and places its contents into a cell array, without having to specify the individual fieldnames. That is why I was looking at structfun. In the end I think struct2cell will do the trick with the use of num2cell, but for the “fun” of it, could it be done with structfun?

  17. Loren replied on :

    Andrew-

    I don’t see how you can do it simply for your set up since you have both cells and numeric arrays and the same functions don’t operate on both and structfun wants one function (though I guess it could be a complicated one). Besides, for non-uniform output, structfun outputs a struct so I can’t see how you’re better off.

    –Loren

  18. Ljubomir Josifovski replied on :

    More generally, Matlab lacks “foreach” construct where one can iterate through a collection without the need to introduce an index.**

    I have settled for a variant of solution 2. I’d use “for str = fn(:)'”. The colon operator is there to make sure I get a row regardless of whether “fn” is a row or a column (i.e. context independent; maybe there is a single “make row” operator instead of “(:)'”?).

    - Ljubomir

    **Unneeded indices are bad. Not needing them for anything else but to get to the value, one tends to choose the obvious i,j,k without sufficient attention so may well overwrite the same named loop vars in an outer loop. The editor warns so is of great help here.

  19. Loren replied on :

    Ljubomir-

    Depending on the collection, there’s cellfun, arrayfun, and structfun. And yes, you can use A(:).' to make a single row. Or you can use reshape(A,1,[]).

    I’m glad the editor warning on loop variables is helpful to you.

    –Loren

  20. per isakson replied on :

    Ljubomir-

    The foreach-functionality is included in Matlab’s FOR, isn’t it? More often than not, my for-loops are through cell and structured arrays. The arrays may hold anything, even function handles of nested functions. Are cell arrays less powerful than collections?

    It took me some time to appreciate the full power of the for-loops in Matlab.

    - per

  21. Ljubomir Josifovski replied on :

    Got caught by this today - iterate through strings list:

    >> for a = {'b', 'c'}; a, end;
    a = 'b'
    a = 'c'

    But actually

    >> whos a
    Name Size Bytes Class Attributes
    a 1x1 62 cell

    is not a string - one needs to

    >> for a = {'b', 'c'}; a=a{1}, end;

    which I find impractical + ugly. Am I missing something/is there a better way to iterate through strings list?

    thanks,
    Ljubomir

  22. Loren replied on :

    Ljubomir-

    That’s the same way I iterated above. It’s a bit clunky getting the contents of the scalar cell, but it is the way to iterate on a cell array.

    –Loren


Loren Shure works on design of the MATLAB language at The MathWorks. She writes here about once a week on MATLAB programming and related topics.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.