Loren on the Art of MATLAB

Turn ideas into MATLAB

Vectorizing Access to an Array of Structures

Posted by Loren Shure,

Though I have written about this topic before, I continue to get questions on working with arrays of structures. So I thought I would focus on that alone today.

Recent Sample Question

Recently there was a question on the newsgroup about how to vectorize access to elements in a structure array. I also got an email from a customer on this topic. Here is one customer's description of his problem:

    In a nutshell, what I am trying to do (in as few lines of code as
    possible) is:
    state = an array of structs, say N items each with (for example)
    components x and y.
    In my case 'state' is reasonably complicated and hence does not
    warrant the use of a simple 2 x N matrix.
    for i=2:N
       state(i)=markovmodel(state(i-1)); % 1. Access individual structs
    end
    plot(state.x) % 2. Access all of the entries of one element as a vector.

The Answer

Let's create an array of structures first.

states(1).x = 3;
states(2).x = 4;
states(3).x = 5;
states(1).y = 42;
states(2).y = 99;
states(3).y = 0;
states
states = 
1x3 struct array with fields:
    x
    y

Let's see what's in states.x

states.x
ans =
     3
ans =
     4
ans =
     5

With an array of structs, you can gather the x values using

allxvals = [states.x]
allxvals =
     3     4     5

This is because states.x produces a comma-separated list. How do I know this? Because, if I leave the semi-colon off, I see a bunch of output with ans =; in other words, I am getting multiple output values. How do I collect multiple values into an array? I can use either square brackets or curly braces.

Using the square brackets, [], just places that list between the brackets, creating an array of one type, in this case, doubles. You can do likewise with struct arrays whose fields are more suitable for a cell array by using curly braces, {}, in place of the square brackets.
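
As a minimal sketch of the curly-brace case (the struct array `people` and its fields `name` and `age` are made up for illustration, not taken from the customer's code), character fields of different lengths are better gathered into a cell array, while numeric scalars still go into square brackets:

```matlab
% Hypothetical struct array with one text field and one numeric field.
people = struct('name', {'Ann', 'Bob', 'Cyril'}, 'age', {31, 45, 27});

names = {people.name};   % 1x3 cell array, one string per element
ages  = [people.age];    % 1x3 double array

% Square brackets on the text field would instead concatenate the characters.
joined = [people.name];  % 'AnnBobCyril'
```

Which bracket to use depends only on what you want the comma-separated list collected into.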

User Solution

If instead of using

    plot(state.x)

above, the user replaced this with

    plot([state.x])

the information gets plotted correctly.
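
Here is a rough sketch of the whole pattern, assuming a hypothetical anonymous function `step` standing in for the customer's `markovmodel` (which isn't shown in the email), and made-up starting values:

```matlab
% Hypothetical stand-in for markovmodel: advance x by one, double y.
step = @(s) struct('x', s.x + 1, 'y', 2 * s.y);

N = 5;
% Preallocate a 1-by-N struct array with fields x and y.
state = repmat(struct('x', 0, 'y', 1), 1, N);
for i = 2:N
    state(i) = step(state(i-1));   % 1. access individual structs
end

xs = [state.x];   % 2. gather one field from every element into a vector
ys = [state.y];
```

Passing `xs` (or `[state.x]` directly) to plot then draws a single line through all N values.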

Comments?

My question to you is why this question keeps getting asked. Is the concept unusual enough that people don't know how to even look for the information on-line? What can we do at The MathWorks to make this more visible? Please pass along your thoughts here.


Published with MATLAB® 7.4

40 Comments

Darik Gamble replied on : 1 of 40

I was wondering a while ago if there was a way to extend this technique to further nested levels of structures.

E.g.

a(1).x = 10;
a(2).x = 25;
a(1).y.m = 1000;
a(2).y.m = 3500;
a(2).y.n = 45;
a(1).y.n = 1;

Now, if we look at the y field:

>> a.y

ans =

m: 1000
n: 1

ans =

m: 3500
n: 45

What I’d like to do is:

>> [a.y.m]

Which would ideally yield [1000 3500], but instead gives:

??? Dot name reference on non-scalar structure.

What I end up doing is something like:

temp = [a.y];
vals = [temp.m];

Not a big deal, I know, but I’m still curious.

Clemens replied on : 2 of 40

I’m also curious about something similar.

I got a vector of Objects with subsref defined:
function ref = subsref(r,s)
ref = builtin('subsref', r,s);

If I now do [c.x] I get the error:
??? Error using ==> xObject.subsref
Too many output arguments

What would I need to change to get the behaviour from above?

Dan K replied on : 3 of 40

Loren,
I am also interested in the question of structures of structures. As for the reason why people have such a problem with this, I think it is because a comma separated list when provided as output doesn’t look like it does when you are using it for input. For input, it looks like a cell array (well, ok, as far as I understand it is) but when provided as output you just get a collection of ans= statements. Just my two cents.

Dan

F. Moisy replied on : 4 of 40

Dear Loren,
In line with your last post, I would like to know how to extract the element x(2) of the structure array states.
For instance,
states(1).x = [1 5 8];
states(2).x = [3 0 4];
It would be very nice to produce the result [5 0] with a syntax like
m = [states.x(2)]
or
m = [states(:).x(2)]
without a for loop.
Thanks!
Frederic.
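
One possible workaround, sketched here under the assumption that `arrayfun` is available in your release, is to apply the inner index to each element with an anonymous function. When every `x` is a row of the same length, stacking the rows first and indexing a column is an alternative:

```matlab
states = struct('x', {[1 5 8], [3 0 4]});

% Apply the inner index x(2) to each element of the struct array.
m = arrayfun(@(s) s.x(2), states);   % [5 0]

% Alternative when all x are rows of equal length: stack, then take a column.
X  = vertcat(states.x);              % 2x3 matrix
m2 = X(:, 2).';                      % [5 0]
```

Neither form is the `[states.x(2)]` syntax asked for, but both avoid an explicit for loop.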

john B replied on : 5 of 40

Hi Loren
I think Dan K has hit the problem on the head.

If you know that you are “looking” for a comma-separated list or a list then
1. help []
2. help paren
leads you to lists
lists mainly gives input use and a hint in the last para, “Comma separated list are very useful when dealing with variable input or output argument lists…”, and examples A= && B=
3. doc [] gets you nowhere
4. doc [
5. Operators and Special Characters
then choosing []: no reference to use of [] on the input side.

So I have no problem with the syntax and I would like to see answers to Darik G & F Moisy questions.
Thanks from downunder

Loren replied on : 6 of 40

This is in answer to Darik G, F Moisy, and John B:

Consider this output:

>> a(1).x = 10;
a(2).x = 25;
a(1).y.m = 1000;
a(2).y.m = 3500;
a(2).y.n = 45;
a(1).y.n = 1;
>> a
a = 
1x2 struct array with fields:
    x
    y
>> a.y
ans = 
    m: 1000
    n: 1
ans = 
    m: 3500
    n: 45
>> a(3).y.q = 17
a = 
1x3 struct array with fields:
    x
    y
>> a(1).y
ans = 
    m: 1000
    n: 1
>> a(3).y
ans = 
    q: 17

Now if I try to gather things together for, let’s say a.y, I don’t even have consistent contents for them. Different fields, etc. So MATLAB can’t know a priori that the operation you want to do will work.

My own belief is that we might solve this by adding something more structured to MATLAB, something like a “uniform” structure, and when the user choses this, s/he guarantees that the fields all exist the same throughout their array. Then we can nest the access and gathering process.

–Loren
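
A hedged sketch of how the nested gather can be done today, assuming `arrayfun` is available: check each element for the field first, then gather only from the elements that have it. The names `hasM` and `mvals` are illustrative only.

```matlab
% Rebuild the non-uniform example: a(3).y has field q but no m or n.
a = struct('x', {10, 25, []});
a(1).y = struct('m', 1000, 'n', 1);
a(2).y = struct('m', 3500, 'n', 45);
a(3).y = struct('q', 17);

% Which elements have a nested field m?
hasM = arrayfun(@(s) isfield(s.y, 'm'), a);

% Gather y.m from just those elements.
mvals = arrayfun(@(s) s.y.m, a(hasM));   % [1000 3500]
```

Without the `isfield` guard, the second `arrayfun` call would error on `a(3)`, which is exactly the inconsistency described above.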

Loren replied on : 7 of 40

Clemens-

I think you need to allow your subsref to return multiple outputs but I am not sure. You might want to contact technical support to help you out.

–Loren

Wolfgang replied on : 8 of 40

Hi Loren,

Sometimes I recommend structures to colleagues or friends as the preferred data storage for their particular programming task. Many of them have been using MATLAB for years and never heard the term “comma-separated list” (so actually they have been abusing MATLAB for years, I guess).

As commented by john B earlier, I think the “comma-separated list” is not too easy to find in the MATLAB help browser. I know that the comma-separated list isn’t a dedicated data type such as cell arrays, for example, but I would consider including it in a prominent position in the help tree, such as MATLAB -> Programming -> Data Types -> Comma-separated List. Hopefully, by making the comma-separated list more popular, you will receive fewer questions on vectorized access to arrays of structures.

When I asked the people why the command states.x generates a bunch of ans= statements (multiple output) in the command window, I got answers like: “Well, I’ve seen this before, but I never really cared about it.” Maybe it’s just a lack of caution and critical faculty then.

Loren replied on : 9 of 40

Wolfgang, john B and others,

Thanks for the insights. I am passing them along to our writers so they can benefit from your insight and perhaps find ways to make this information more visible in the documentation.

–Loren

Oliver A. Chapman, P.E. replied on : 10 of 40

I agree with Wolfgang’s comments regarding “comma-separated list.” This is an important concept in MATLAB, and users new to MATLAB are likely not to have encountered it. Therefore, MATLAB must have focused documentation on this.

I also support Loren’s consideration of a standard design of data structure. This concept could also apply for various types of data common to technical computing: Time history data & tables used for interpolation. The current TimeSeries tends to hide the data from the user. With a standard, well thought out design of a data structure, the user would have easy access to the data but still have the advantage of a single “variable” name to reference all the data necessary to properly operate on the data.

Pete S replied on : 11 of 40

Loren,

This follows Dan K’s comments, I do find the comma separated lists confusing. As an experienced Matlab programmer I expect this example to return a column vector, and it does:

a=[1,2,3],b=[4;5;6],c=[a(:);b(:)]

but transpose it to cell arrays and comma separated lists and I get a 2×3 array:

a={1,2,3},b={4;5;6},c={a{:};b{:}}

logical for lists but maybe not intuitive, and try this:

if you type a{:}, what appears on the screen looks like it’ll be a column vector,

>> a{:}
ans =
1
ans =
2
ans =
3

but put curly braces round it and you get a row vector:

>> {a{:}}
ans =
[1] [2] [3]

to get a column vector you need to add a transpose (outside the braces, inside will be an error):

>> {a{:}}'
ans =
[1]
[2]
[3]

I don’t find this behaviour intuitive, and I often get lost in it when manipulating cell arrays. I don’t use arrays of structures much in what I do, so I don’t know if these problems read across.

Regards,
Pete

PS Uniform structures sound like an interesting idea; I have written scripts to check that a structure is uniform, and to report missing or “incorrect” fields.

Jan K. J. replied on : 12 of 40

Hi Loren,

Great blog!

I totally agree with Pete S: Sometimes receiving a row vector, sometimes a column vector when doing basic manipulation is a bad thing.

As for how to vectorize into an array of structures, this will only work if the elements are scalars or 1-by-m vectors. Example:

states(1).x = [ 1 2 3];
states(2).x = 4;
[states.x]
ans =
1 2 3 4

Using n-by-1 vectors will only work if the elements have identical size:

states(1).x = rot90([ 1 2 3]);
states(2).x = 4;
[states.x]
??? Error using ==> horzcat
CAT arguments dimensions are not consistent.

states(2).x = rot90([5 6 7]);
[states.x]
ans =
3 7
2 6
1 5

In one of my data reading routines I use cell array operations to transfer large chunks of data into appropriate structure array fields. I just realized why I _*never*_ have been able to use the alldatavector=[struct.vector] syntax on the resulting data set. Time to re-write my data reading routines…

Regards,
Jan
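
For column-vector fields of different lengths, one option (a sketch, not the only way) is to pass the comma-separated list to `vertcat` instead of wrapping it in square brackets, which always concatenate horizontally:

```matlab
% Column vectors of different lengths in the same field.
s = struct('x', {[1; 2; 3], 4, [5; 6]});

% [s.x] would call horzcat and fail here; vertcat stacks the columns.
allx = vertcat(s.x);                 % 6x1 column: 1 2 3 4 5 6

% If orientations are mixed, force each piece to a column first.
t = struct('x', {[1 2 3], [4; 5]});
cols = arrayfun(@(e) e.x(:), t, 'UniformOutput', false);
allt = vertcat(cols{:});             % 5x1 column: 1 2 3 4 5
```

The second form costs an extra pass over the array but does not depend on how each field happened to be stored.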

Dean Redelinghuys replied on : 13 of 40

In response to Pete S’s confusion about row-vs-column output:

Think of the “comma separated list” (CSL) as MATLAB providing a short-cut that is equivalent to you typing instructions at the command-line. I use this concept when I’m teaching people about CSL’s in our MATLAB training courses.

Try this in MATLAB:
>> 1, 2, 3
ans =
1
ans =
2
ans =
3

You’ll notice that this output is the same as the following:
>> a = {1, 2, 3};
>> a{:}
ans =
1
ans =
2
ans =
3

Typing (or programming with)
>> a{:}

using CELL indexing (indexing with {}) is not the same as ARRAY indexing (indexing with ()). Using ARRAY indexing, a(:) will re-arrange the cell array into a column vector. Using CELL indexing is literally saying to MATLAB: “Return the following elements to me as a comma separated list.” Note that it’s a COMMA separated list, not a SEMI-COLON separated list.

For instance, try this (and figure out why the list comes back in this order):
>> a = {1, 2, 3; 4, 5, 6}
a =
[1] [2] [3]
[4] [5] [6]
>> a{1:2,2:3}
ans =
2
ans =
5
ans =
3
ans =
6

I think it’s the fact that the cell indexing is so similar to array indexing that confuses people.

Now using syntax like:
>> {a{:}}

or
>> [a{:}]

is just telling MATLAB to take the CSL and use that result in constructing a cell array, or a “native” array (the contents of a{} could be strings, doubles, ints, etc.) just like you would type it:
>> {a{1}, a{2}, a{3}}

CSL’s in MATLAB, and the ability to use that as an array constructor, are a brilliant piece of somewhat obscure (at first) programming syntax, as long as you remember that a{…} means “make a CSL” and not “make a sub-array”.

+ Dean
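
A small sketch of the points above, using the same 2-by-3 cell array: the comma-separated list comes out in column-major order, square brackets or curly braces collect it into a row, and paren indexing is what produces a column.

```matlab
a = {1, 2, 3; 4, 5, 6};

v   = [a{:}];            % list in column-major order, collected into a row
sub = [a{1:2, 2:3}];     % columns 2 and 3, again walked down each column
row = {a{:}};            % 1x6 cell: the list used to build a new cell array
col = a(:);              % 6x1 cell: array indexing, not a list
```

Because the list itself has no shape, the row orientation comes entirely from the commas that the brackets see.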

Mike Plonski replied on : 14 of 40

The key point in this entire discussion that I did not find anywhere in the MATLAB documentations was that structure arrays are not “uniform”. After reading this I did a simple test (see below) to figure out what “uniform / non-uniform” means. Apparently, MATLAB struct arrays require each element in the array to have the same fieldnames (see the phantom x(2).y below), but that the datatype for the field can vary (non-uniform) for each structure in the array (see x.z).

My concern is not extracting x.z from an array of structs (x), but how do I take an array of data and put it into a single field (e.g. x.z = ?? or x(:).z = ??). I have had zero luck trying to do this without incrementing through each struct in the struct array.

struct2cell will turn the struct array into a higher dim cell array. What is the overhead associated with these operations? I have about 240 fields allocated among multiple nested struct arrays. It seems there would be a lot of overhead and fancy bookkeeping associated with trying to use the struct2cell / cell2struct approach to assign a vector to a substructure array in a parent struct array.

A uniform struct array (same field and fieldtypes) is probably required to solve my problem. struct arrays are almost useless without the ability of assigning a vector of data to a specific field in the struct array. I inherited this 240 field nested struct array to maintain, as I would never have designed this in MATLAB.

Sample code below:
x(1).z = 5
x(2).z = 'string'
x(1).y = 10

Then to test
>> x(1)

ans =

z: 5
y: 10

K>> x(2)

ans =

z: 'string'
y: []

K>> r = [x.z]

r =

string

K>> r = {x.z}

r =

[5] 'string'

K>> x.z = r
??? Incorrect number of right hand side elements in dot name assignment. Missing [] around left hand side is a likely cause.

K>> x.z = deal(r)
??? Incorrect number of right hand side elements in dot name assignment. Missing [] around left hand side is a likely cause.

K>> x(:).z = r
??? Insufficient outputs from right hand side to satisfy comma separated
list expansion on left hand side. Missing [] are the most likely cause.

K>> a = x

a =

1x2 struct array with fields:
z
y

K>> a.z = x.z
??? Illegal right hand side in assignment. Too many elements.

K>>

Valentin Kuklin replied on : 15 of 40

Hi Mike,

I have the same problem – I want to initialize elements in the struct array with an array.

So, as you said: (e.g. x.z = ?? or x(:).z = ??).

I’m still trying to spell something judicious, but the maximum I have reached is this:

>> x(1).y=1;
x(2).y=2;
x(3).y=3;
x(4).y=4;
>> x.y
ans =
1
ans =
2
ans =
3
ans =
4
>> [x(1:2).y]=x(3:4).y;
>> x.y
ans =
3
ans =
4
ans =
3
ans =
4
So, I can (! :)) initialize all of the elements in the struct array but only with elements in the struct array… It seems I’ll bite myself in a tail :)…

Valentin

Valentin Kuklin replied on : 16 of 40

Hi again,

Harry Potter is a newcomer! Here you are….

>> x(1).y=1;
x(2).y=2;
x(3).y=3;
x(4).y=4;
>> a = {1, 2, 3; 4, 5, 6}
a =
[1] [2] [3]
[4] [5] [6]
>> [x(1:2).y]=a{1:2,2:2};
>> x.y
ans =
2
ans =
5
ans =
3
ans =
4

Unbelievable but it does not work with
>> b={7;8}
b =
[7]
[8]
>> [x(1:2).y]=b
??? Too many output arguments.

only [x(1:2).y]=a{1:2,2:2}!

As you can see

>> a{1:2,2:2}
ans =
2
ans =
5

and
>> b
b =
[7]
[8]
>>

give different results. It’s not so easy to understand for me.

Valentin

Loren replied on : 17 of 40

Valentin-

b is an array and doesn’t have outputs (or, if you want to think about it as having them, it has one, the whole array). [x(1:2).y] is expecting to assign outputs to 2 distinct entities, hence the error message you see. Instead, try this:

[x(1:2).y]=b{:}

–Loren
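
To make the distinction concrete, here is a short sketch (variable names and values are illustrative): a cell array expanded with `{:}` supplies one value per element on the left, while `deal` with a single input copies that same value into every element.

```matlab
x = struct('y', {1, 2, 3, 4});
b = {7; 8};

% b{:} expands to two values, one for each element on the left-hand side.
[x(1:2).y] = b{:};

% deal with one input hands the same value to every output.
[x(3:4).y] = deal(0);

yvals = [x.y];   % [7 8 0 0]
```

The number of values on the right has to match the number of elements named on the left, unless `deal` is given a single input to replicate.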

Valentin Kuklin replied on : 18 of 40

Hi Loren,

thanks a lot! Your alternative is surely better than mine. But I have to say it’s not very intuitive.
>> c=[1;1]
c =
1
1
>> c
c =
1
1
>> c(:)
ans =
1
1
In this syntax it is the same with or without (:).
And
>> d=[1 2 3; 4 5 6]
d =
1 2 3
4 5 6
>> d(1:2,2:2)
ans =
2
5
>> e=[2;5]
e =
2
5
>>
“e” is equal to d(1:2,2:2), but not with {}.

Anyway, thanks a lot!

Valentin

Loren replied on : 19 of 40

Valentin-

You are mixing up indexing into the contents of a cell array via {} (curly braces) with regular array indexing with parens.

In your example, c is not a cell array but a column vector. In general in MATLAB, on the right hand side, A(:) turns A into a column vector. So c and c(:) are identical in your case.

–Loren

Valentin Kuklin replied on : 20 of 40

Hi Loren,

I see that the behaviours of {}-curly braces and () are different. I expected it would be similar. I was wrong :)

Valentin

Valentin Kuklin replied on : 21 of 40

Hi Loren,

I have the next question :). Here is the situation:
>> b
b =
1x3 struct array with fields:
c
>> a
a =
1 2 3

I can use the syntax to initialize the struct

>> d=num2cell(a);
>> [b.c]=d{:}
b =
1x3 struct array with fields:
c

Is there a way to avoid the additional variable d and extra line?

Thanks a lot,

Valentin.

Johan H replied on : 23 of 40

Hi Loren,

It seems that
[b.c] = deal(num2cell(a))
and
d=num2cell(a);
[b.c]=d{:}
yield different results. In your example you copy the entire vector a into “each” c-field, ie. b(1).c => [1 2 3] and b(2).c => [1 2 3] etc.

I’m also looking for a simple and nice *one-liner* to put the elements of the vector a into the struct-field c “one by one”, ie. a(1) goes into b(1).c, a(2) into b(2).c etc. I think this is a rather common task, but I have not yet found a simple solution for it.

You could use b = struct(‘c’,num2cell(a)), but this gets rather messy when you have many fields. I’m also building my struct by adding fields one at a time (due to data processing), and you can’t(?) use this for adding fields afterwards.

In a matlab-thread somewhere, someone suggested a function split(a), seems like a good idea to me. Eg.
[b(1:length(a)).c] = split(a); % a is vector
This solution will be my christmas wish!

Best Regards & Merry Christmas!
Johan

Jon replied on : 26 of 40

Loren,

I encountered this same problem and I attempted to find the answer by looking at the documentation for struct. In the documentation, example three is describing this very use of the struct but does not mention how to make the array.

This example in the documentation seems like an excellent place to mention how to get the arrays with the square brackets.

-Jon

From the documentation:
========================================================
========================================================
Example 3

This example initializes one field f1 using a cell array, and the other f2 using a scalar value:

s = struct('f1', {1 3; 2 4}, 'f2', 25)
s =
2x2 struct array with fields:
f1
f2

Field f1 in each element of s is assigned the corresponding value from the cell array {1 3; 2 4}:

s.f1
ans =
1
ans =
2
ans =
3
ans =
4

Field f2 for all elements of s is assigned one common value because the values input for this field was specified as a scalar:

s.f2
ans =
25
ans =
25
ans =
25
ans =
25
========================================================
========================================================

Moritz replied on : 27 of 40

One aspect of structures hasn’t been mentioned much here: speed. My impression is they are awfully slow. In the following function I define a coordinate structure with x and y field. The silly function I want to execute on this data is the element-wise difference between the x and the y coordinate. I calculate this with two arrays X and Y, then with a structure, but I convert to array before the calculation, and then in full structure notation. Turns out that the full structure way is about 5000 times slower than array, and the hybrid is still 80 times slower. Am I doing something wrong?

Here’s code and output

function Structureslowness
ArraySize = 1e5;
AX = rand(1, ArraySize);
AY = rand(1, ArraySize);
A = struct('x',num2cell(AX),'y',num2cell(AY));

tic
Distance = XYfun(AX,AY);
XYTime = toc;

tic
Distance = Structfun(A);
HybridTime= toc/XYTime

tic
Distance = EvenWorse(A);
StructureTime = toc/XYTime

function d = XYfun(AX,AY)
d = AX-AY;

function d = Structfun(A)
X = [A.x];
Y = [A.y];
d = X-Y;

function d = EvenWorse(A)
d = [A.x] - [A.y];

## output:

HybridTime = 80.7715

StructureTime = 5.5581e+003

Loren replied on : 28 of 40

Moritz-

With the JIT working in MATLAB, a for loop is about twice as fast as your hybrid time. Add these lines to your main function to try it out:

tic
Distance= ForLoop(A);
loopTime = toc/XYTime

and this subfunction to the file:

function d= ForLoop(A)
sz=numel(A);
d=zeros(sz,1);
for ind=1:sz
    d(ind)=A(ind).x-A(ind).y;
end

Personally, I find it less than ideal that none of the vectorized solutions behave as well. I have reported this to the development team.

–Loren

Steve L replied on : 29 of 40

Moritz —

Another consideration is that there’s a big difference (both performancewise and memorywise) between an array of structs and a struct containing an array.


N = 1e5;
x = rand(1, N);
y = rand(1, N);
A1 = struct('x', num2cell(x), 'y', num2cell(y));
A2 = struct('x', x, 'y', y);
whos A1 A2
tic; d1 = [A1.x] + [A1.y]; toc,
tic; d2 = A2.x + A2.y; toc,
isequal(d1, d2)

If you want to interact with each point individually more often, then the array of structs would probably be the right approach.

point1 = A1(1);

Getting the point1 matrix from A2 would be messier. You could do it with the FIELDNAMES function, but I don’t think it would be a nice one-liner.

If instead you want to interact with the X coordinate of all the points more often, the struct containing an array would be better suited for your application.

allX = A2.x;

Getting the allX matrix from A1 isn’t messier (like point1 from A2 above) but it would be slower as you posted.

So which data layout you should use to get better performance depends somewhat on what tasks you want to carry out most often.
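
As a hedged sketch of pulling a single point out of the struct-containing-arrays layout, `structfun` with 'UniformOutput' set to false returns a struct with the same field names, so it can stand in for a loop over FIELDNAMES. The values and the index `k` below are made up for illustration.

```matlab
% Struct containing arrays: one scalar struct, each field holds all the points.
A2 = struct('x', [0.1 0.2 0.3], 'y', [10 20 30]);

k = 2;
% Take the k-th entry of every field; the result has the same field names.
pointk = structfun(@(v) v(k), A2, 'UniformOutput', false);
```

Here `pointk.x` is 0.2 and `pointk.y` is 20, which is the same shape of result as `A1(k)` in the array-of-structs layout.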

Moritz replied on : 30 of 40

Loren,

yes, your loop version is more than twice as fast as the hybrid code, but still rather slow compared to the raw array version. I agree this is not ideal. Thanks for bringing it up with the developers.

Steve,

indeed, it makes more SENSE in my application to have an array of structures. On the other hand, you show that not only they are slow, but memory-hungry, too. I guess I will go for the not-so-elegant, but fast and low-mem simple separate arrays implementation.

Thanks for the help!

KenA replied on : 31 of 40

It may be worth pointing out that, in R2008a at least, the properties of an array of objects can be indexed as discussed in this blog posting. Regarding Moritz’s post (27), this is also not super-speedy, performing between the example’s “HybridTime” and “StructureTime”. However, according to “whos”, an array of classes is far more memory efficient than an equivalent array of structs.

So, if you find yourself creating a large array of structs, where those structs all have the same field names (homogeneous), an array of objects may be something to consider.

Jaime X. Ramos replied on : 32 of 40

I have the same question as ‘F. Moisy’ who replied on April 19th, 2007 at 15:01 UTC, except I have set of m x n matrices as field contents of a structure array.

I’m gathering that I can only access say the (2,3) element for all structures in the array by iterating with a for loop, ie there’s no compact access code like he wrote with
[struct.data(2,3)]

for i = 1:k
struct(i).mat(2,3)
end

I’m reading a tiff stack file using this matlab community user code and they store the frames as structures in a structure array. I’m trying to compute the mode of a given pixel across all the frames but I can’t see any way other than iteratively accessing that pixel.

Any ideas?

thanks for being such a champ! =)

Jaime
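
One possible approach, sketched under the assumption that every frame has the same size: stack the frames along a third dimension with `cat`, then index the pixel across all frames at once. The struct array `frames`, its field `mat`, and the small matrices below are stand-ins for the real tiff data.

```matlab
% Hypothetical stack of three 2x3 frames stored in a struct array.
frames = struct('mat', {[1 2 3; 4 5 6], [7 8 9; 1 2 6], [0 0 0; 0 0 4]});

% The comma-separated list frames.mat feeds cat, giving a 2x3x3 array.
stack = cat(3, frames.mat);

% Pixel (2,3) across all frames, as a column vector.
pix = squeeze(stack(2, 3, :));   % [6; 6; 4]
m   = mode(pix);                 % 6
```

This copies all the frames into one array, so for a large stack the memory cost should be weighed against the convenience.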

Tom replied on : 34 of 40

I think the biggest problem is the matlab help tells you how to BUILD a matrix/array/structure/string/cell but does not ON THE SAME PAGE tell you the syntax to get the data in such a type BACK OUT!! Finding out how to get data OUT of these various storage types is not easy to find.

I am a new user of matlab, and my biggest problem is getting the words I use in the help search function to relate to the words used as commands in the program…

ie “How do I get data out of a structure” does not give me an answer about structures
“structure” tells me how to build one, but not how to USE it, or how to get data inside a field out and into a plot.

Cheers,
Tomas

Vince replied on : 37 of 40

Hi Loren,
I’m new to using arrays of structures. I have an array of structures called “list_uwv”. Each structure has a field called “cord”, in which I’ve placed a vector like [a b c], where a, b, c are three numbers. My question is: how can I ask the array of structures where the vector [A B C] is located? Is there any function, like “find”, that I can use for this?
i.e.

find(list_uwv.cord(:) == [0 3.5 0.8])

Up to now I’ve placed the vectors in a matrix (array of Nx3 elements) and I make the query to the matrix by using find function.
TNX a lot Loren

Loren replied on : 38 of 40

Vince-

That’s similar to the way I would solve the problem as well – except I would not look for exact floating point numbers but within a range since there are numeric roundoff considerations to account for when working with floating point. I also would probably use ismember with the rows option after getting the data out of the struct.

–Loren
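
A rough sketch of the tolerance-based search, using the field name from the question and a made-up tolerance `tol` that would need tuning for real data: stack the vectors into an N-by-3 matrix, then compare each row against the target.

```matlab
% Hypothetical data in the layout described in the question.
list_uwv = struct('cord', {[0 1 2], [0 3.5 0.8], [4 4 4]});
target = [0 3.5 0.8];
tol = 1e-9;                      % assumed tolerance, pick to suit the data

% Gather every cord vector into one N-by-3 matrix.
C = vertcat(list_uwv.cord);

% A row matches when all three components are within tol of the target.
hit = all(abs(C - repmat(target, size(C, 1), 1)) <= tol, 2);
idx = find(hit);                 % index of the matching struct(s)
```

When an exact match is acceptable, ismember(C, target, 'rows') on the same stacked matrix is the shorter form.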

liuao replied on : 39 of 40

hi, now I have a question about assignment, as follows:
I have a structure array named c of size 3*1, whose fields are x and y, and an array d of size 3*1. I want to assign
d to field x of c; how should I do it?
I have tried typing [c.x] = d, [c.x] = deal(d), c.x = d, c.x = deal(d), but none of them work.

Thank you and sorry for my poor english.

Loren replied on : 40 of 40

Liuao-

The problem you are encountering is because your right-hand side is not a call that can be dealt out. Try this instead:

c(3).x = []
[c.y] = c.x
d = 1:3
dc = num2cell(d)
[c.x] = dc{:}
c.x

–Loren