Though I have written about this topic before, I continue to get questions on working with arrays of structures. So I thought I would focus on that alone today.
Recent Sample Question
Recently there was a question on the newsgroup about how to vectorize the access to elements in a structure array. I also got an email from a customer on this topic. Here is one customer's description of his problem:
In a nutshell, what I am trying to do (in as few lines of code as
possible) is:
state = an array of structs, say N items each with (for example)
components x and y. In my case 'state' is reasonably complicated and hence does not
warrant the use of a simple 2 x N matrix.
for i=2:N
state(i)=markovmodel(state(i-1)); % 1. Access individual structs
end
plot(state.x) % 2. Access all of the entries of one element as vector.
The Answer
Let's create an array of structures first.
states(1).x = 3;
states(2).x = 4;
states(3).x = 5;
states(1).y = 42;
states(2).y = 99;
states(3).y = 0;
states
states =
1x3 struct array with fields:
x
y
Let's see what's in states.x
states.x
ans =
3
ans =
4
ans =
5
With an array of structs, you can gather the x values using
allxvals = [states.x]
allxvals =
3 4 5
This is because states.x produces a comma-separated list. How do I know this? Because, if I leave the semi-colon off, I see a bunch of output with ans =; in other words, I am getting multiple output values. How do I collect multiple values into an array? I can use either square brackets or curly braces.
Using the square brackets, [], just places that list between the brackets, creating an array of one type, in this case, doubles. You can do likewise with struct arrays whose fields are more suitable for a cell array by using curly braces, {}, in place of the square brackets.
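As a minimal sketch of the curly-brace variant (the struct array people and its fields are made up for illustration), text fields of different lengths are better gathered into a cell array, while numeric scalars still fit in square brackets:

```matlab
% Hypothetical struct array with one text field and one numeric field
people(1).name = 'Ann';      people(1).age = 31;
people(2).name = 'Roberto';  people(2).age = 47;

% Curly braces collect the comma-separated list into a cell array
allNames = {people.name};    % 1x2 cell: {'Ann', 'Roberto'}

% Square brackets concatenate the list into an ordinary array
allAges = [people.age];      % [31 47]
```

Note that [people.name] would also run, but it would concatenate the characters into the single string 'AnnRoberto', which is rarely what is wanted.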
User Solution
If, instead of using plot(state.x) above, the user writes
plot([state.x])
the information gets plotted correctly.
Comments?
My question to you is why this question keeps getting asked. Is the concept unusual enough that people don't know how to even look for the information on-line? What can we do at The MathWorks to make this more visible? Please pass along your thoughts here.
Published with MATLAB® 7.4



I was wondering a while ago if there was a way to extend this technique to further nested levels of structures.
E.g.
a(1).x = 10;
a(2).x = 25;
a(1).y.m = 1000;
a(2).y.m = 3500;
a(2).y.n = 45;
a(1).y.n = 1;
Now, if we look at the y field:
>> a.y
ans =
m: 1000
n: 1
ans =
m: 3500
n: 45
What I’d like to do is:
>> [a.y.m]
Which would ideally yield [1000 3500], but instead gives:
??? Dot name reference on non-scalar structure.
What I end up doing is something like:
temp = [a.y];
vals = [temp.m];
Not a big deal, I know, but I’m still curious.
I’m also curious about something similar.
I got a vector of Objects with subsref defined:
function ref = subsref(r,s)
ref = builtin('subsref', r,s);
If I now do [c.x] I get the error:
??? Error using ==> xObject.subsref
Too many output arguments
What would I need to change to get the behaviour from above?
Loren,
I am also interested in the question of structures of structures. As for the reason why people have such a problem with this, I think it is because a comma separated list when provided as output doesn’t look like it does when you are using it for input. For input, it looks like a cell array (well, ok, as far as I understand it is) but when provided as output you just get a collection of ans= statements. Just my two cents.
Dan
Dear Loren,
In line with your last post, I would like to know how to extract the element x(2) of the structure array states.
For instance,
states(1).x = [1 5 8];
states(2).x = [3 0 4];
It would be very nice to produce the result [5 0] with a syntax like
m = [states.x(2)]
or
m = [states(:).x(2)]
without a for loop.
Thanks!
Frederic.
Hi Loren
I think Dan K has hit the problem on the head.
If you know that you are “looking” for a comma-separated list or a list then
1. help []
2. help paren
leads you to lists
lists mainly gives input use and a hint in the last para “Comma separated list are very useful when dealing with variable input or output argument lists…” and examples A= && B=
3. doc [] gets you nowhere
4. doc [
5. Operators and Special Characters
then choosing []: no reference to use of [] on the input side.
So I have no problem with the syntax and I would like to see answers to Darik G & F Moisy questions.
Thanks from downunder
This is in answer to Darik G, F Moisy, and John B:
Consider this output:
>> a(1).x = 10;
a(2).x = 25;
a(1).y.m = 1000;
a(2).y.m = 3500;
a(2).y.n = 45;
a(1).y.n = 1;
>> a
a =
1x2 struct array with fields:
x
y
>> a.y
ans =
m: 1000
n: 1
ans =
m: 3500
n: 45
>> a(3).y.q = 17
a =
1x3 struct array with fields:
x
y
>> a(1).y
ans =
m: 1000
n: 1
>> a(3).y
ans =
q: 17
Now if I try to gather things together for, let’s say a.y, I don’t even have consistent contents for them. Different fields, etc. So MATLAB can’t know a priori that the operation you want to do will work.
My own belief is that we might solve this by adding something more structured to MATLAB, something like a “uniform” structure, and when the user chooses this, s/he guarantees that the same fields exist throughout their array. Then we can nest the access and gathering process.
–Loren
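For the cases where every element does hold the requested data, one way to avoid the temporary variable is arrayfun, which calls a function once per struct element. This is only a sketch under that assumption; it would error for an element such as a(3) above, whose y has no field m.

```matlab
% Nested fields: every a(k).y has a field m
a(1).y.m = 1000;
a(2).y.m = 3500;
mvals = arrayfun(@(s) s.y.m, a);          % [1000 3500]

% Indexing into a field: second entry of each states(k).x
states(1).x = [1 5 8];
states(2).x = [3 0 4];
second = arrayfun(@(s) s.x(2), states);   % [5 0]
```

When the per-element results are not scalars, arrayfun needs 'UniformOutput', false and then returns a cell array instead.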
Clemens-
I think you need to allow your subsref to return multiple outputs but I am not sure. You might want to contact technical support to help you out.
–Loren
Hi Loren,
Sometimes I recommend to colleagues or friends structures as the preferred data storage for their particular programming task. Many of them have been using MATLAB for years and never heard the term “comma-separated list” (so actually they have been abusing MATLAB for years, I guess).
As commented by john B earlier, I think the “comma-separated list” is not too easy to find in the MATLAB help browser. I know that the comma-separated list isn’t a dedicated data type such as cell arrays for example, but I would consider including it in a prominent position in the help tree such as MATLAB -> Programming -> Data Types -> Comma-separated List. Hopefully, by making the comma-separated list more popular, you will receive fewer questions on vectorized access to arrays of structures.
When I asked the people why the command states.x generates a bunch of ans= statements (multiple output) in the command window, I got answers like: “Well, I’ve seen this before, but I never really cared about it.” Maybe it’s just a lack of caution and critical faculty then.
Wolfgang, john B and others,
Thanks for the insights. I am passing them along to our writers so they can benefit from your insight and perhaps find ways to make this information more visible in the documentation.
–Loren
I agree with Wolfgang’s comments regarding “comma-separated list.” This is an important concept in MatLab and users new to MatLab are likely not to have encountered it. Therefore, MatLab must have focused documentation on this.
I also support Loren’s consideration of a standard design of data structure. This concept could also apply for various types of data common to technical computing: Time history data & tables used for interpolation. The current TimeSeries tends to hide the data from the user. With a standard, well thought out design of a data structure, the user would have easy access to the data but still have the advantage of a single “variable” name to reference all the data necessary to properly operate on the data.
Loren,
This follows Dan K’s comments, I do find the comma separated lists confusing. As an experienced Matlab programmer I expect this example to return a column vector, and it does:
a=[1,2,3],b=[4;5;6],c=[a(:);b(:)]
but transpose it to cell arrays and comma separated lists and I get a 2×3 cell array:
a={1,2,3},b={4;5;6},c={a{:};b{:}}
logical for lists but maybe not intuitive, and try this:
if you type a{:}, what appears on the screen looks like it’ll be a column vector,
>> a{:}
ans =
1
ans =
2
ans =
3
but put curly braces round it and you get a row vector:
>> {a{:}}
ans =
[1] [2] [3]
to get a column vector you need to add a transpose (outside the braces, inside will be an error):
>> {a{:}}'
ans =
[1]
[2]
[3]
I don’t find this behaviour intuitive, and I often get lost in it when manipulating cell arrays. I don’t use arrays of structures much in what I do, so I don’t know if these problems read across.
Regards,
Pete
PS Uniform structures sound an interesting idea; I have written scripts to check that a structure is uniform, and to report missing or “incorrect” fields.
Hi Loren,
Great blog!
I totally agree with Pete S: Sometimes receiving a row vector, sometimes a column vector when doing basic manipulation is a bad thing.
As for how to vectorize into an array of structures, this will only work if the elements are singular or 1-by-m vectors. Example:
states(1).x = [ 1 2 3];
states(2).x = 4;
[states.x]
ans =
1 2 3 4
Using n-by-1 vectors will only work if the elements have identical size:
states(1).x = rot90([ 1 2 3]);
states(2).x = 4;
[states.x]
??? Error using ==> horzcat
CAT arguments dimensions are not consistent.
states(2).x = rot90([5 6 7]);
[states.x]
ans =
3 7
2 6
1 5
In one of my data reading routines I use cell array operations to transfer large chunks of data into appropriate structure array fields. I just realized why I _*never*_ have been able to use the alldatavector=[struct.vector] syntax on the resulting data set. Time to re-write my data reading routines…
Regards,
Jan
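If the field holds a mix of row and column vectors, one possible workaround (a sketch, not the only option) is to gather the values into a cell array first, reshape each one into a row, and only then concatenate:

```matlab
states(1).x = [1; 2; 3];   % column vector
states(2).x = 4;           % scalar

vals = {states.x};                                          % a cell array has no size constraints
rows = cellfun(@(v) v(:).', vals, 'UniformOutput', false);  % force each entry to 1-by-m
allx = [rows{:}];                                           % [1 2 3 4]
```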
In response to Pete S’s confusion about row-vs-column output:
Think of the “comma separated list” (CSL) as MATLAB providing a short-cut that is equivalent to you typing instructions at the command-line. I use this concept when I’m teaching people about CSL’s in our MATLAB training courses.
Try this in MATLAB:
>> 1, 2, 3
ans =
1
ans =
2
ans =
3
You’ll notice that this output is the same as the following:
>> a = {1, 2, 3};
>> a{:}
ans =
1
ans =
2
ans =
3
Typing (or programming with)
>> a{:}
using CELL indexing (indexing with {}) is not the same as ARRAY indexing (indexing with ()). Using ARRAY indexing, a(:) will re-arrange the cell array into a column vector. Using CELL indexing is literally saying to MATLAB: “Return the following elements to me as a comma separated list.” Note that it’s a COMMA separated list, not a SEMI-COLON separated list.
For instance, try this (and figure out why the list comes back in this order):
>> a = {1, 2, 3; 4, 5, 6}
a =
[1] [2] [3]
[4] [5] [6]
>> a{1:2,2:3}
ans =
2
ans =
5
ans =
3
ans =
6
I think it’s the fact that the cell indexing is so similar to array indexing that confuses people.
Now using syntax like:
>> {a{:}}
or
>> [a{:}]
is just telling MATLAB to take the CSL and use that result in constructing a cell array, or a “native” array (the contents of a{} could be strings, doubles, ints, etc.) just like you would type it:
>> {a{1}, a{2}, a{3}}
CSL’s in MATLAB, and the ability to use that as an array constructor, are a brilliant piece of somewhat obscure (at first) programming syntax, as long as you remember that a{…} means “make a CSL” and not “make a sub-array”.
+ Dean
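The common ways of re-packaging such a list can be put side by side. In this sketch the variable names are illustrative; vertcat is the function form of separating the elements with semicolons:

```matlab
a = {1, 2, 3};

rowCell = {a{:}};          % 1x3 cell, same as {a{1}, a{2}, a{3}}
colCell = a(:);            % 3x1 cell, array indexing reshapes the cell itself
rowNum  = [a{:}];          % 1x3 double: [1 2 3]
colNum  = vertcat(a{:});   % 3x1 double: [1; 2; 3]
```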
The key point in this entire discussion that I did not find anywhere in the MATLAB documentations was that structure arrays are not “uniform”. After reading this I did a simple test (see below) to figure out what “uniform / non-uniform” means. Apparently, MATLAB struct arrays require each element in the array to have the same fieldnames (see the phantom x(2).y below), but that the datatype for the field can vary (non-uniform) for each structure in the array (see x.z).
My concern is not extracting x.z from an array of structs (x), but how do I take an array of data and put it into a single field (e.g. x.z = ?? or x(:).z = ??). I have had zero luck trying to do this without incrementing through each struct in the struct array.
struct2cell will turn the struct array into a higher dim cell array. What is the overhead associated with these operations? I have about 240 fields allocated among multiple nested struct arrays. It seems there would be a lot of overhead and fancy bookkeeping associated with trying to use the struct2cell / cell2struct approach to assign a vector to a substructure array in a parent struct array.
A uniform struct array (same fields and field types) is probably required to solve my problem. struct arrays are almost useless without the ability to assign a vector of data to a specific field in the struct array. I inherited this 240 field nested struct array to maintain, as I would never have designed this in MATLAB.
Sample code below:
x(1).z = 5
x(2).z = 'string'
x(1).y = 10
Then to test
>> x(1)
ans =
z: 5
y: 10
K>> x(2)
ans =
z: 'string'
y: []
K>> r = [x.z]
r =
string
K>> r = {x.z}
r =
[5] 'string'
K>> x.z = r
??? Incorrect number of right hand side elements in dot name assignment. Missing [] around left hand side is a likely cause.
K>> x.z = deal(r)
??? Incorrect number of right hand side elements in dot name assignment. Missing [] around left hand side is a likely cause.
K>> x(:).z = r
??? Insufficient outputs from right hand side to satisfy comma separated
list expansion on left hand side. Missing [] are the most likely cause.
K>> a = x
a =
1×2 struct array with fields:
z
y
K>> a.z = x.z
??? Illegal right hand side in assignment. Too many elements.
K>>
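The assignments above fail because the right-hand side supplies a single value (one cell array), while the left-hand side needs one value per struct element. A sketch of a form that does run: expand the cell array into a comma-separated list on the right, and wrap the left-hand side in square brackets.

```matlab
x(1).z = 5;
x(2).z = 'string';

r = {x.z};       % gather: 1x2 cell {5, 'string'}
r{1} = 50;       % change something in the gathered copy
[x.z] = r{:};    % scatter: x(1).z becomes 50, x(2).z stays 'string'
```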
Hi Mike,
I have the same problem - I want to initialize elements in the struct array with an array.
So, as you said: (e.g. x.z = ?? or x(:).z = ??).
I’m still trying to spell something judicious, but the maximum I have reached is this:
>> x(1).y=1;
x(2).y=2;
x(3).y=3;
x(4).y=4;
>> x.y
ans =
1
ans =
2
ans =
3
ans =
4
>> [x(1:2).y]=x(3:4).y;
>> x.y
ans =
3
ans =
4
ans =
3
ans =
4
So, I can (! :)) initialize all of the elements in the struct array but only with elements in the struct array… It seems I’ll bite myself in a tail :)…
Valentin
Hi again,
Harry Potter is a newcomer! Here you are….
>> x(1).y=1;
x(2).y=2;
x(3).y=3;
x(4).y=4;
>> a = {1, 2, 3; 4, 5, 6}
a =
[1] [2] [3]
[4] [5] [6]
>> [x(1:2).y]=a{1:2,2:2};
>> x.y
ans =
2
ans =
5
ans =
3
ans =
4
Unbelievable but it does not work with
>> b={7;8}
b =
[7]
[8]
>> [x(1:2).y]=b
??? Too many output arguments.
only [x(1:2).y]=a{1:2,2:2}!
As you can see
>> a{1:2,2:2}
ans =
2
ans =
5
and
>> b
b =
[7]
[8]
>>
give different results. It’s not so easy to understand for me.
Valentin
Valentin-
b is an array and doesn’t have outputs (or, if you want to think about it as having them, it has one, the whole array). [x(1:2).y] is expecting to assign outputs to 2 distinct entities, hence the error message you see. Instead, try this:
[x(1:2).y]=b{:}
–Loren
Hi Loren,
thanks a lot! Your alternative is surely better than mine. But I have to say it’s not very intuitive.
>> c=[1;1]
c =
1
1
>> c
c =
1
1
>> c(:)
ans =
1
1
In this syntax it is the same with or without (:).
And
>> d=[1 2 3; 4 5 6]
d =
1 2 3
4 5 6
>> d(1:2,2:2)
ans =
2
5
>> e=[2;5]
e =
2
5
>>
“e” is equal to d(1:2,2:2), but not with {}.
Anyway, thanks a lot!
Valentin
Valentin-
You are mixing up indexing into the contents of a cell array via {} (curly braces) with regular array indexing with parens.
In your example, c is not a cell array but a column vector. In general in MATLAB, on the right hand side, A(:) turns A into a column vector. So c and c(:) are identical in your case.
–Loren
Hi Loren,
I see that the behaviours of {}-curly braces and () are different. I expected it would be similar. I was wrong :)
Valentin
Hi Loren,
I have the next question :). Here is the situation:
>> b
b =
1×3 struct array with fields:
c
>> a
a =
1 2 3
I can use the syntax to initialize the struct
>> d=num2cell(a);
>> [b.c]=d{:}
b =
1×3 struct array with fields:
c
Is there a way to avoid the additional variable d and extra line?
Thanks a lot,
Valentin.
Valentin,
How about this?
[b.c] = deal(num2cell(a))
–Loren
Hi Loren,
It seems that
[b.c] = deal(num2cell(a))
and
d=num2cell(a);
[b.c]=d{:}
yield different results. In your example you copy the entire vector a into “each” c-field, ie. b(1).c => [1 2 3] and b(2).c => [1 2 3] etc.
I’m also looking for a simple and nice *one-liner* to put the elements of the vector a into the struct-field c “one by one”, ie. a(1) goes into b(1).c, a(2) into b(2).c etc. I think this is a rather common task, but I have not yet found a simple solution for it.
You could use b = struct('c',num2cell(a)), but this gets rather messy when you have many fields. I’m also building my struct by adding fields one at a time (due to data processing), and you can’t(?) use this for adding fields afterwards.
In a matlab-thread somewhere, someone suggested a function split(a), seems like a good idea to me. Eg.
[b(1:length(a).c] = split(a); % a is vector
This solution will be my christmas wish!
Best Regards & Merry Christmas!
Johan
Missing parenthesis..
[b(1:length(a)).c] = split(a); % a is vector
-Johan
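Until something like the suggested split exists, the two behaviours discussed here can be compared directly. The sketch below assumes b is a 1-by-3 struct array: deal copies its single input to every output, whereas expanding a cell array hands out one element per output. The same bracket syntax also adds a field (here the made-up name d) to an existing struct array.

```matlab
a = [1 2 3];
b = struct('c', cell(1, 3));   % 1x3 struct array, field c empty

[b.c] = deal(0);               % same value everywhere: 0, 0, 0
allSame = [b.c];

vals = num2cell(a);            % {1, 2, 3}
[b.c] = vals{:};               % one by one: b(k).c equals a(k)

[b.d] = vals{:};               % adds a new field d, filled the same way
```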
Johan-
I recommend you suggest this enhancement via tech support.
Thanks for the suggestion.
–Loren
Loren,
I encountered this same problem and I attempted to find the answer by looking at the documentation for struct. In the documentation, example three is describing this very use of the struct but does not mention how to make the array.
This example in the documentation seems like an excellent place to mention how to get the arrays with the square brackets.
-Jon
From the documentation:
========================================================
========================================================
Example 3
This example initializes one field f1 using a cell array, and the other f2 using a scalar value:
s = struct('f1', {1 3; 2 4}, 'f2', 25)
s =
2×2 struct array with fields:
f1
f2
Field f1 in each element of s is assigned the corresponding value from the cell array {1 3; 2 4}:
s.f1
ans =
1
ans =
2
ans =
3
ans =
4
Field f2 for all elements of s is assigned one common value because the values input for this field was specified as a scalar:
s.f2
ans =
25
ans =
25
ans =
25
ans =
25
========================================================
========================================================
One aspect of structures hasn’t been mentioned much here: speed. My impression is they are awfully slow. In the following function I define a coordinate structure with x and y field. The silly function I want to execute on this data is the element-wise difference between the x and the y coordinate. I calculate this with two arrays X and Y, then with a structure, but I convert to array before the calculation, and then in full structure notation. Turns out that the full structure way is about 5000 times slower than array, and the hybrid is still 80 times slower. Am I doing something wrong?
Here’s code and output
function Structureslowness
ArraySize = 1e5;
AX = rand(1, ArraySize);
AY = rand(1, ArraySize);
A = struct('x',num2cell(AX),'y',num2cell(AY));
tic
Distance = XYfun(AX,AY);
XYTime = toc;
tic
Distance = Structfun(A);
HybridTime= toc/XYTime
tic
Distance = EvenWorse(A);
StructureTime = toc/XYTime
function d = XYfun(AX,AY)
d = AX-AY;
function d = Structfun(A)
X = [A.x];
Y = [A.y];
d = X-Y;
function d = EvenWorse(A)
d = [A.x] - [A.y];
## output:
HybridTime = 80.7715
StructureTime = 5.5581e+003
Moritz-
With the JIT working in MATLAB, a for loop is about twice as fast as your hybrid time. Add these lines to your main function to try it out:
and this subfunction to the file:
function d= ForLoop(A)
sz=numel(A);
d=zeros(sz,1); % preallocate the result
for ind=1:sz
d(ind)=A(ind).x-A(ind).y; % index with the loop variable, not sz
end
Personally, I find it less than ideal that none of the vectorized solutions behave as well. I have reported this to the development team.
–Loren
Moritz –
Another consideration is that there’s a big difference (both performancewise and memorywise) between an array of structs and a struct containing an array.
N = 1e5;
x = rand(1, N);
y = rand(1, N);
A1 = struct('x', num2cell(x), 'y', num2cell(y));
A2 = struct('x', x, 'y', y);
whos A1 A2
tic; d1 = [A1.x] + [A1.y]; toc,
tic; d2 = A2.x + A2.y; toc,
isequal(d1, d2)
If you want to interact with each point individually more often, then the array of structs would probably be the right approach.
point1 = A1(1);
Getting the point1 matrix from A2 would be messier. You could do it with the FIELDNAMES function, but I don’t think it would be a nice one-liner.
If instead you want to interact with the X coordinate of all the points more often, the struct containing an array would be better suited for your application.
allX = A2.x;
Getting the allX matrix from A1 isn’t messier (like point1 from A2 above) but it would be slower as you posted.
So which data layout you should use to get better performance depends somewhat on what tasks you want to carry out most often.
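Moving between the two layouts is possible with functions already used in this thread. A sketch, with a small N for readability; structfun (applied to the scalar struct A2) is one option for the "messier" direction of pulling a single point out of the struct of arrays:

```matlab
x = [1 2 3];
y = [10 20 30];

A1 = struct('x', num2cell(x), 'y', num2cell(y));   % 1x3 array of structs
A2 = struct('x', x, 'y', y);                       % one struct holding arrays

% Array of structs -> struct of arrays
B2.x = [A1.x];
B2.y = [A1.y];

% One point out of the struct of arrays: take element 1 of every field
point1 = structfun(@(v) v(1), A2, 'UniformOutput', false);   % point1.x = 1, point1.y = 10
```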
Loren,
yes, your loop version is more than twice as fast as the hybrid code, but still rather slow compared to the raw array version. I agree this is not ideal. Thanks for bringing it up with the developers.
Steve,
indeed, it makes more SENSE in my application to have an array of structures. On the other hand, you show that not only they are slow, but memory-hungry, too. I guess I will go for the not-so-elegant, but fast and low-mem simple separate arrays implementation.
Thanks for the help!
It may be worth pointing out that, in R2008a at least, the properties of an array of objects can be indexed as discussed in this blog posting. Regarding Moritz’s post (27), this is also not super-speedy, performing between the example’s “HybridTime” and “StructureTime”. However, according to “whos”, an array of classes is far more memory efficient than an equivalent array of structs.
So, if you find yourself creating a large array of structs, where those structs all have the same field names (homogeneous), an array of objects may be something to consider.
I have the same question as ‘F. Moisy’ who replied on April 19th, 2007 at 15:01 UTC, except I have a set of m x n matrices as field contents of a structure array.
I’m gathering that I can only access say the (2,3) element for all structures in the array by iterating with a for loop, ie there’s no compact access code like he wrote with
[struct.data(2,3)]
for i = 1:k
struct(i).mat(2,3)
end
I’m reading a tiff stack file using this matlab community user code and they store the frames as structures in a structure array. I’m trying to compute the mode of a given pixel across all the frames but I can’t see any way other than iteratively accessing that pixel.
Any ideas?
thanks for being such a champ! =)
Jaime
Jaime-
Look at arrayfun. It might well be able to help you out here.
–Loren
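A sketch of what the arrayfun suggestion might look like for this case. The struct array is called frames here (struct itself is a function name, so it is best avoided as a variable name), the field name mat is taken from the loop above, and the data is made up.

```matlab
% Three hypothetical 2x3 frames stored in a struct array
frames(1).mat = [1 2 3; 4 5 6];
frames(2).mat = [1 2 3; 4 5 9];
frames(3).mat = [1 2 3; 4 5 6];

% Value of pixel (2,3) in every frame, as a 1x3 vector
pix = arrayfun(@(f) f.mat(2, 3), frames);   % [6 9 6]

% Most frequent value of that pixel across the frames
m = mode(pix);                              % 6
```

arrayfun still visits each frame in turn internally, so this is mainly a more compact notation rather than a guaranteed speed-up.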
I think the biggest problem is the matlab help tells you how to BUILD a matrix/array/structure/string/cell but does not ON THE SAME PAGE tell you the syntax to get the data in such a type BACK OUT!! Finding out how to get data OUT of these various storage types is not easy to find.
I am a new user of matlab, and my biggest problem is getting the words I use in the help search function to relate to the words used as commands in the program…
ie “How do I get data out of a structure” does not give me an answer about structures
“structure” tells me how to build one, but not how to USE it, or how to get data inside a field out and into a plot.
Cheers,
Tomas
Sorry, the above is related to WHY THERE ARE SO MANY QUESTIONS IN THIS AREA.
Tom-
Thanks for the suggestion about the doc. I will pass it on.
–Loren