Loren on the Art of MATLAB

Structures and Comma-Separated Lists

Posted by Loren Shure

I have seen an increasing number of questions on structures and extracting information from them in a vectorized way. Though I've already covered portions of this topic in earlier posts, I'll try to bring together a coherent view here.

Simple Structure Array

Let's work with a simple structure array. By simple here, I mean one that does not have nested structures inside (though I don't believe there's any such defined term in MATLAB). We can use the function struct to create one or we can use direct notation. The following constructs for s1 and s2 are equivalent.

clear
s1 = struct('name','Loren','FavoriteNumber',17)
s2.name = 'Loren';
s2.FavoriteNumber = 17
s1 = 
              name: 'Loren'
    FavoriteNumber: 17
s2 = 
              name: 'Loren'
    FavoriteNumber: 17

Are they the same?

isequal(s1,s2)
ans =
     1

Now let's add more people to the "database," along with their favorite numbers. See this reference for illumination on the structure inputs.

s1(end+1).name = 'Douglas';
s1(end).FavoriteNumber = 42;
s2(end+1) = struct('name','Douglas','FavoriteNumber',42);
isequal(s1,s2)
ans =
     1

What happens when we look at one of these, say s1?

s1
s1 = 
1x2 struct array with fields:
    name
    FavoriteNumber

We see that it's a struct with size 1x2, and fields named name and FavoriteNumber. We can also see the first element, s1(1), and the associated values, since they are not too large to display.

s1(1)
ans = 
              name: 'Loren'
    FavoriteNumber: 17

Let's try digging a little deeper now and see if we can group the data differently and collect it into some other MATLAB arrays. Start with the names.

s1.name
ans =
Loren
ans =
Douglas

When I display s1.name, I see ans = displayed twice, once for each element in the struct array. It's as if I executed this code:

s1(1).name,s1(2).name
ans =
Loren
ans =
Douglas

See how I wrote that last expression? It was really 2 expressions in this case, separated by one of the usual MATLAB statement separators, the comma, hence the term you see in the documentation "comma-separated list."
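
As a minimal sketch of what "comma-separated list" means in practice, the expansion can be used anywhere MATLAB accepts a list of arguments or array elements. The struct array below mirrors the one in this post; the variable names biggest, names, and nums are only for illustration. Note that max(s.FavoriteNumber) is meaningful here only because the array has exactly two elements, so the call reads as max(17, 42).

```matlab
% Two-element struct array like the one in the post
s = struct('name', {'Loren','Douglas'}, 'FavoriteNumber', {17, 42});

% s.FavoriteNumber expands to two arguments, as if typed max(17, 42)
biggest = max(s.FavoriteNumber);

% s.name expands to two arguments; fprintf reuses the format for each one
fprintf('%s\n', s.name);

% The same list inside {} or [] supplies the elements of a new array
names = {s.name};
nums  = [s.FavoriteNumber];
```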

Converting Numeric Structure Fields to a Variable

Now I want to take the output from a specific field in the structure array and place the values together in one MATLAB variable. I can do this with

  • a for loop,
  • using the function deal,
  • by taking advantage of notation new in Release 14.

I'll show all 3; my preference is for the newest notation, in part because of its compactness and clarity of intent, but it should also be a bit faster in general.

ls1 = length(s1);

% for
numFor = zeros(1,ls1);
for ind=1:ls1
    numFor(ind) = s1(ind).FavoriteNumber;
end

% deal
[numDeal(1), numDeal(2)] = deal(s1.FavoriteNumber);

% R14 notation, Loren's preferred method!
numFavorite = [s1.FavoriteNumber];

Check for correct answers.

isequal(numFor, numDeal, numFavorite)
ans =
     1

The solution using deal requires that you know how many outputs you want, which can be awkward to write in an automated way. Instead, I can place the results into a cell array and then convert that to a numeric array. Here, I create a comma-separated list of output values on the left-hand side, and place it inside [] to collect the outputs into the cell array numDealC. Then I convert this cell array to a numeric one using cell2mat.

[numDealC{1:ls1}] = deal(s1.FavoriteNumber);
numDeal2 = cell2mat(numDealC);
isequal(numFavorite, numDeal2)
ans =
     1
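
One caveat worth sketching: concatenating with [] drops empty values, so the result can end up shorter than the struct array when a field is unset. The example below is an illustration with a made-up three-element array, and substituting NaN is just one possible way to keep positions aligned.

```matlab
s = struct('FavoriteNumber', {17, [], 42});   % middle element has no value

nums = [s.FavoriteNumber];                    % the empty entry vanishes
sameLength = numel(nums) == numel(s);         % false: positions no longer line up

% One way to keep positions: replace empties with NaN before concatenating
vals = {s.FavoriteNumber};
vals(cellfun(@isempty, vals)) = {NaN};
numsPadded = [vals{:}];
```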

Converting String Structure Fields to a Cell Array

Now let's work on the names. Since they are different lengths, the names probably belong in a cell array of strings. Using the same idea of having a comma-separated list for the output values, I place each output into a cell. The right-hand side is also a comma-separated list, and, with Release 14 (MATLAB 7), I can parcel out these values to multiple output values without using deal or some other means of distribution. Remember, each cell in a cell array is itself a MATLAB array.

[names{1:ls1}] = s1.name
names = 
    'Loren'    'Douglas'

I could have easily done this instead (thanks to John's comment for the reminder).

names = {s1.name}
names = 
    'Loren'    'Douglas'

I now have a cell array of string names and each name corresponds to the associated value in the numFavorite array. I encourage you to check out the references here, and to post any follow-up thoughts.
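
To check that the two extracted arrays really do carry the same information as the struct array, here is a small round-trip sketch. It rebuilds the struct array from names and numFavorite with struct, which creates one element per cell when given cell array inputs; the variable name rebuilt is only for illustration.

```matlab
s1 = struct('name', {'Loren','Douglas'}, 'FavoriteNumber', {17, 42});

names       = {s1.name};            % cell array of strings
numFavorite = [s1.FavoriteNumber];  % numeric row vector

% num2cell turns [17 42] into {17, 42} so struct makes a 1x2 array
rebuilt = struct('name', names, 'FavoriteNumber', num2cell(numFavorite));
same = isequal(s1, rebuilt);
```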

References

Documentation

Blog Articles

Newsletter and Digest Articles

These articles are older, and, in some cases, the information is not quite up to date. For example, the examples that use deal work fine, but deal is no longer necessary in as many instances as it used to be. See the R14 Release Notes for more information on this.


Published with MATLAB® 7.2

52 Comments

Instead of [names{1:ls1}] = s1.name;, I tend to do

names = {s1.name};

It saves you from having to know the size of s1. I didn’t know about numFavorite = [s1.FavoriteNumber];, though. That’s helpful. I’ve been doing

numFavorite = cat(1,s1.FavoriteNumber);,

which is less clear.

John-

Great comments. I forgot to mention cat since I don’t tend to use it for this sort of work. It adds too much noise.

And thanks for the easy: names = {s1.name};. I was so focused on the idea about the comma-separated list, that I neglected another great method.
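
For readers comparing the two forms, a short sketch of how they differ when a field holds a vector rather than a scalar: [] concatenates horizontally, while cat(1,...) stacks the values as rows. The field name v is made up for this example.

```matlab
s = struct('v', {[1 2], [3 4]});

rowwise = [s.v];          % same as horzcat(s.v): 1x4
stacked = cat(1, s.v);    % same as vertcat(s.v): 2x2, one row per element
```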

First of all, thanks for the blog, I’ve learned a lot about matlab already.

I’m interested in the inverse transform of “Converting Numeric Structure Fields to a Variable”, i.e. Converting a variable into a Structure(or cell array)…

The reason this comes up is, say i’ve got a matrix:
M = magic(3)

M =

8 1 6
3 5 7
4 9 2

And say i have a vector:
ind = [3 , 2];

M(3,2) evaluates as [9].
while
M(ind) evaluates as [4,3].

I’d like to be able to have something like
M({ind}) evaluate as [9], (which it does not).

More generally, i’d like to be able to pass the elements of a vector as separate arguments to a function. To do this, I think I need to turn a vector into a comma separated list (probably via some sort of cell array, or deal(), but I haven’t been able to figure out how…) .

Have any ideas? Thanks.

Well, i can get it to work via 2 lines of code.
indCell = num2cell(ind);

then

M(indCell{:})

ans =

9

but then I can’t directly pass the output of one function into another…

Any ideas of how to do this within the same bit of code?

Thanks!

Adam-

I think you want to look into sub2ind for your situation. Converting to a cell array might be more costly than you want. The function sub2ind will convert the subscripts (3,2) into the linear index 6 and allow you to choose the elements you want without converting to a cell array. It should be pretty efficient and the code should be quite short and readable.

–Loren
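
A minimal sketch of the sub2ind suggestion, using the same M and ind as in the question; the names lin and val are only for illustration.

```matlab
M = magic(3);
ind = [3, 2];                               % row 3, column 2

lin = sub2ind(size(M), ind(1), ind(2));     % subscripts -> linear index
val = M(lin);                               % same element as M(3,2)
```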

Hi,
We’ve seen here that we can get the values of the favorites numbers of the people by typing [s1.FavoriteNumber].
On the other hand, imagine the favorite numbers change. How can I add the new values? For example, newFavoriteNumbers = [12,5];
Imagine I want to add a new field with the favorite colours? How can I do that? FavoriteColors = {'red','blue'};
Thanks in advance,
Marie.

Marie-

All you need to do is this:

FavoriteColors = {'purple','blue'};
[s1.FavoriteColors] = FavoriteColors{:};

This makes the new field FavoriteColors for each element in s1. The right-hand side is a comma-separated list that can be assigned to the multiple left-hand sides because s1 is an array.

For the numbers, you’d need to put them into a cell array and use the same idea.

newNums = [19, 2];
newNumsc = num2cell(newNums);
[s1.FavoriteNumber] = newNumsc{:};

–Loren

I have an array arr of structs, each of which has as a field a java object obj (and many other fields). Suppose obj has a public double member x, so that eg arr(1).obj.x returns a double.

Is there a way to extract all the values of x?
You’d think [arr.obj.x] and/or [arr(:).obj.x] would work, but both fail.

OK, the first step is
u=struct2cell(arr);
v=[u{1,:}]; % suppose the java object is the first field

then it’s just an array of Java objects. But when I try to do

w=[v.x]

that fails

??? No appropriate method or public field err for class obj[].

Is there any way of getting around that short of a for-loop?
Thanks a lot…

[arr.obj.x] and/or [arr(:).obj.x] result in error when used with Java objects, due to the fact that they result in errors when used with MATLAB values:

>> for i=1:10; darr(i).x.y = i; end
>> [darr.x.y]
??? Dot name reference on non-scalar structure.

>> [darr(:).x.y]
??? Scalar index required for this type of multi-level indexing.

The case of an array of Java objects is more subtle, since at first glance one might expect it to work as it does with MATLAB values, such as this example:

>> u=struct2cell(darr);
>> dv=[u{1,:}];
>> [dv.y]

ans =

1 2 3 4 5 6 7 8 9 10

Whereas this errors:

>> for i=1:10; jarr(i) = JrandomJava; end
>> fieldnames(jarr(1))

ans =

'foo'
'bar'

>> jarr(1).foo

ans =

1

>> [jarr.foo]
??? No appropriate method or public field foo for class JrandomJava[].

However, this error is actually expected. The reason for the error is that an array of Java objects is not a plural MATLAB array of Java objects in the same sense that a MATLAB structure array is a plural array of structures — the Java array is actually a singular Java object, with a Java array class:

>> class(dv)

ans =

struct

>> class(jarr)

ans =

JrandomJava[]

>> getClass(jarr)

ans =

class [LJrandomJava;

>>

Although MATLAB exposes some of the “arrayness” of the Java array object (such as SIZE exposing the number of elements in the Java array), the actual Java class of the Java object in the MATLAB workspace does not have the fields of the underlying Java objects.

Another important reason that this syntax does not work with arrays of Java objects is that Java object arrays can be heterogeneous. Arrays of MATLAB structures always have exactly the same fields in each member. This is not the case with arrays of Java objects:

>> x(1) = JrandomJava;
>> x(2) = java.lang.Object;
>> x(3) = java.awt.Frame;
>> fieldnames(x(1))

ans =

'foo'
'bar'

>> fieldnames(x(2))

ans =

Empty cell array: 0-by-1

>> fieldnames(x(3))

ans =

'DEFAULT_CURSOR'
'CROSSHAIR_CURSOR'
'TEXT_CURSOR'
.
.
.

So, a FOR loop must be used with arrays of Java objects.

Two issues here: one, that a(:).b.c does not expand to a comma-separated list, and similarly for an array of java objects. Thanks for stating that definitively.

However, is there a good reason for that, or is that just that nobody cared enough to code it up yet? Why can’t an array of structs that have sub-structs just naively expand to a comma-separated list? Likewise, why can’t [jarr.foo] in your example just expand to a comma-separated list, _then_ throw errors if there are fields missing?

That would make yet another thing possible in matlab that pure Java can’t do (array handling in Java sucks anyway) – wouldn’t be the first time;)

Egor-

The way I think about it, it’s an indeterminate operation. Do I expand the whole way down the struct or do I expand each piece as I go. I don’t think you’d necessarily get what you were asking for, esp. since, as you nest structs, there is no guarantee that the substructs are similar to each other and can be rationally laid out end to end.

We have no precedent in MATLAB that I am aware of for simultaneously returning a result and throwing an error. It seems like a mind-bending way to go. How would that work with try-catch if nothing was in the catch? Just get the result? It seems like a solution that might be fraught with problems.

–Loren

Loren-

Thanks for the reply. I don’t mean you should return both a result and an error – just _try_ to expand into a comma-separated list, and if it’s not feasible (eg a required field is missing in a substruct, or a member is missing in a Java object, etc.), then throw an error instead.

Expansion order among substructs does not matter very much as long as it’s well defined (recursive seems a reasonable choice); the depth of expansion is known from the expression (eg [a.b.foo] means we try to expand down 2 levels, [a.b] means down one level).

Best,
E.

ok, say one of the fields in a struct array is a vector.

s(1).px=[1 2]
s(2).px=[3 4]

then, s(2).px(1) works, you get 3. so s.px(1) should give me the vector [1 3]. but it gives an error.

i feel like something along the lines of {s.px} should help, because it preserves the structure that is lost in [s.px]. but i don’t know where to go from there.

Erik-

s(1).px is not guaranteed to be the same type, size, etc. as s(2).px so MATLAB doesn’t let you collect them together in an automatic way. I realize your example is uniform, but there is no guarantee.

–Loren

there’s no guarantee that w(3) exists either. handle it the same way.

i wind up needing this kind of thing all the time. an example: i am collecting samples from some data acquisition hardware. i get structs back from it for each time point, one of the fields is an [x y] vector. i’d like to have x over time as a vector. you don’t really want to require users to write a for loop in this situation do you?

re the type objection, whoa! i didn’t realize fields didn’t enforce type. but you still implemented [s.px], despite the fact that i can:

load handel;
s(3).px = audioplayer(y, Fs);
[s.px] %error!

my suggestion is in this spirit.

from sai avula at matlab support:

tmp=cell2mat({s.px}');
tmp(:,1)

even cleaner, but requires temp variable…
be nice to have the octave syntax [1:10](3) in this case…
-e
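
For comparison, here is a sketch of two ways to pull the first element of each px without writing an explicit for loop. It assumes, as in Erik's example, that every px is a numeric row vector of the same length; the names firstsA and firstsB are only for illustration.

```matlab
s = struct('px', {[1 2], [3 4]});

% Option 1: stack the vectors as rows, then take a column
allPx   = vertcat(s.px);                  % 2x2
firstsA = allPx(:,1).';                   % row vector of first elements

% Option 2: arrayfun visits each struct element and indexes into its field
firstsB = arrayfun(@(e) e.px(1), s);
```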

hello,

I wanna build a file with the following format, using MATLAB
structures:

1st Line: ID and Name;
2nd Line: Address;
3rd Line: Phone Number and e-mail;

example:

24 Marc Darwin
Next Street, 100, Paris
555037475 maria@newmail.com

I don’t know how to build this
kind of structure. I appreciate any kind of help.

thanks a lot..

Marc

Marc-

You should read the “Getting Started with MATLAB” documentation. But here are some hints.

s.idname = 'Marc Alany';
s.address = 'Main St.';
s.contact = {'800-555-1212' 'm@x.y'};
s(2).idname = 'Loren';

OR

s = struct('idname',{cell array of names}, 'address', {cell array of addresses},...
     'contact',{cell array of contact info});
% to do this, the arrays need to agree in size -- see the doc

–Loren
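
A runnable version of the second hint might look like the sketch below. All of the values are made up for illustration. Because struct treats a cell array input as "one element per cell", each contact entry is itself wrapped in a cell so that the field ends up holding a cell array.

```matlab
% Hypothetical sample data: one cell per person in each list
idnames   = {'Marc Alany', 'Loren'};
addresses = {'Main St.', 'Elm St.'};
contacts  = {{'800-555-1212' 'm@x.y'}, {'800-555-1313' 'l@x.y'}};

% The three cell arrays agree in size, so s is a 1x2 struct array
s = struct('idname', idnames, 'address', addresses, 'contact', contacts);

% Print three lines per person; contact{:} expands to phone and e-mail
for k = 1:numel(s)
    fprintf('%s\n%s\n%s %s\n', s(k).idname, s(k).address, s(k).contact{:});
end
```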

Loren thanks a lot, I’m gonna do that. I will give you some feedback
I’m very grateful, really :),
Marc

dear Loren;

small problem with dynamic matlab object creation and
storage. i want to create objects dynamically and store
them in a cell array in a manner not too dissimilar to
the code below.

however – i run into problems. i can’t seem to
reset the value of the object in the cell array.

can I use assignin here, in the same manner it is
used when setting the content of an object’s fields,
to achieve the desired result – ie: ability to
change the field of an object stored within a cell?

>> BEGIN SNIPPET
% set here uses assignin to assign the passed in string
% to the member name in the object obj, and returns the
% value of the field
% get here gets the value of the name field of the object.

beforetags = {'world', 'of', 'the', 'mules' };
aftertags = {'there', 'are', 'no', 'rules' };
objcatcher = {};

for i=1:length(beforetags)
    tag = beforetags{i};
    obj = newobj();
    retname = set(obj, 'name', tag);
    objcatcher = [ objcatcher obj ];
end

for i=1:length(objcatcher)
    newtag = aftertags{i};
    obj = objcatcher{i};
    retname = set(obj, 'name', newtag);
    % retname *seems* to be whatever newtag is
end

for i=1:length(objcatcher)
    obj = objcatcher{i};
    name = get(obj, 'name');
    % however, retname is just the same *original*
    % name assigned to the object.
end

Kerr-

assignin is only for variables, not portions of them. The second argument must be the full variable name, not a piece, so not a cell in a cell array, nor a field from a structure.

–Loren

Hello Loren,

I’m glad to see this vivid discussion here on the blog. I’m having problems with a structure the field names of which I save in a variable in a loop, but it fails of course.

Here’s an example:

s = struct('filename', '-regerxp', '^apple');
names = fieldnames(s);

for i=1:last
apple = s.names(1);
some_function(apple);
end

What should I do? I have pictures of apples as matrices in the struct. And the struct’s fields contain the different pictures.

Thanks in advance,

Lassi Tani

Lassi-

You give incorrect and incomplete information here. I recommend you write some code that someone could actually run. Technical support should be able to help you out.

–Loren

Loren-

I apologize for the incorrect and incomplete information I gave. Here’s a simple code I tried to run without success. I understand if this is not the right place for a noob like me. :)

s = struct('name', 'Anna', 'age', 28);
names = fieldnames(s);
name = names(1);
s.(name);

Thanks for your time anyway!

-Lassi

Lassi-

name is still a string in a cell array. To get its contents, you must use curly braces {} instead of smooth parens ().

In your example here, either of these will work.

s.(names{1})
s.(name{1})
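
Extending that to every field, a minimal sketch of looping over the field names with dynamic field references; the variable values is only for illustration.

```matlab
s = struct('name', 'Anna', 'age', 28);
names = fieldnames(s);                 % cell array of strings

values = cell(size(names));
for k = 1:numel(names)
    % names{k} (curly braces) is the string itself, so s.(...) accepts it
    values{k} = s.(names{k});
end
```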

Hello,

Thanks very much to Erik for updating this thread with the tricks for accessing specific elements of a vector in a struct array… I’ve been having the same problem!

However, does anyone know if a similar trick could be applied for assignments… i.e., from Erik’s example
s(1).px=[1 2]
s(2).px=[3 4]

I want to do something like
[s.px(1)] = [2 4]
resulting in s(1).px=[2 2],s(2).px=[4 4]

Thanks
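
One possible workaround, sketched under the assumption that every px is a row vector of the same length: pull the field out into a matrix, edit the column, and deal the rows back with a comma-separated list. The names px, newFirst, and rows are only for illustration.

```matlab
s = struct('px', {[1 2], [3 4]});
newFirst = [2 4];

px = vertcat(s.px);          % one row per struct element
px(:,1) = newFirst(:);       % overwrite the first entry of every row
rows = num2cell(px, 2);      % each cell holds one full row
[s.px] = rows{:};            % assign the rows back, one per element
```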

Many thanks to Erik for posting work-arounds for structure indexing… but I find the cellfun and arrayfun solutions awfully clunky, and it would be great if MATLAB included the ability to index these directly as Erik mentioned. I deal with this issue everyday, and it would be great if it were resolved in a future version. Perhaps I’m using structures incorrectly if I find I need to access them this way all the time?

I would like to know how to ‘create subsystem’ using scripts. Any hints would be appreciated.

Also, when I select multiple blocks, how can I access them?

Thanks

I often want to just go through a list of constants, as if

>> for a = {'b', 'c'}, a, end
a =
'b'
a =
'c'

but then get an error when using ‘a’ as it is actually a cell

>> whos a
Name Size Bytes Class Attributes
a 1×1 62 cell

I.e., one has to use a{1} to get to the contents.

Is there maybe a better way to traverse a strings list?

Thanks,

Ljubomir-

No, you need to know if it’s a cell array or not to use the string appropriately. You are doing it correctly.

–Loren
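
A common alternative, sketched here, is to loop over an index and use curly braces once at the top of the loop body, so the rest of the body works with a plain string. The variable seen is only there to show what each iteration received.

```matlab
list = {'b', 'c'};
seen = '';

for k = 1:numel(list)
    item = list{k};          % char array, not a 1x1 cell
    seen = [seen item];      % use item like any other string
end
```

Keeping the for a = {...} form also works if the first statement in the body unwraps the cell, for example item = a{1};.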

i want to create a structure from several variables ..
each time i have a different number of variables and different types
let’s say
x1 240*240 matrix
x2 3.132
x3 [ 1 2 3]

mystruct =funct(x1,x2,x3)
will be

mystruct.x1
mystruct.x2
mystruct.x3

and if
mystruct =funct(x1,x3)
will be
mystruct.x1
mystruct.x3

how i can create such dynamic struct?

i have finally created general function for all users !
it creates structure very easily now !!!
================
function [theStruct] = createStruct(varargin)

theStruct = struct;

for ind=1:nargin
    % use each caller-side variable name as the field name
    theStruct = setfield(theStruct,inputname(ind),varargin{ind});
end
=========================
examples :

a=1;
b=2;
theStruct=createStruct(a,b);
theStruct

theStruct =

a: 1
b: 2
———————
c=3
d=[ 1 2 3]
e= [ 1 2; 3 4]
thestruct=createStruct(a,b,d,e)
thestruct =

a: 1
b: 2
d: [1 2 3]
e: [2x2 double]

Mic-

The function who can help you find the variable names.

varnames = who

inputname will return an empty string for temporary variables that are passed (e.g., x(3))

You may be better off constructing an array of the names and passing that in as well.

–Loren
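
As a sketch of the "pass the names in as well" idea, cell2struct builds the structure directly from a cell array of values and a matching cell array of field names. The names and values below are placeholders.

```matlab
names  = {'x1', 'x3'};                 % field names chosen by the caller
values = {magic(3), [1 2 3]};          % one value per name, any type

% Columns in, dimension 1 maps to fields: the result is a 1x1 struct
mystruct = cell2struct(values(:), names(:), 1);
```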

Something weird about [s.field] concatenation: if a large structure array is created using repmat, the above operation is extremely slow. However if one saves the same structure array to a mat file and reloads it, the above [s.field] syntax is ~400X faster.

s=repmat(struct('a',0),5e4,1);
tic;t=[s.a];toc

save s s
clear all
load s
tic;t=[s.a];toc

The [s.a] command takes ~30 seconds on my XP32 system and ~90sec on my XP64 system.

I wrote a simple function in place of the built-in cat:

function out = fcopy(in,fname)
out = repmat(in(1).(fname),size(in));
for i=1:numel(in)
    out(i) = in(i).(fname);
end

Like others above, I also would like to have a homogenous structure array where all structure elements are identical, as in C. It would certainly be more EML-compliant than Matlab’s current heterogenous structure array. I assume we would then be able to access substructures directly. In Erik’s example above, I would hope that [s.px(1)] would work for a homogenous structure. I currently have to copy out fields before I can access elements of them: spx = vertcat(s.px), spx(:,1)

Todd-

Last things first. It would be great if you gave your suggestion via the link on the right of the blog. In the meantime, you could make your own struct-like class that would ensure whatever characteristics you want.

As for the performance, what’s happening is that in the original struct, all the fields share the same data. And MATLAB is carefully keeping track of all the references. When you save and then load it, they are each saved as separate arrays so no extra work (reference counting, etc.) needs to happen. Is your test case where the fields share the same data really typical? If not, try the same experiment replacing the repmat array with some random matrices.

–Loren

Along the lines of the discussion from replies 10-20 above, I have the following complex array of structures. I want to sum the arrays in a structure element along one dimension of the top-level array, which I can only do with a for loop.

PD(1:5,1:64,1:4)= struct('in',struct('arrayX',zeros(1,maxx),'arrayY',zeros(1,maxy)),...
       'out',struct('arrayX',zeros(1,maxx),'arrayY',zeros(1,maxy)) );

Referencing PD(1,:,1).all.arrayX returns
“?? Scalar index required for this type of multi-level indexing.”
I believe I understand the reasoning above as to why this won’t work. But I can’t find any alternate method mentioned here or elsewhere that will work.
The following for loop works correctly:

for i=1:numel(PD(1,1,1).in.arrayX))
   PDsum=PDsum+ PD(1,i,1).in.arrayX;
end

This returns the array where each element of PDsum is sum of the respective elements of PD(1,:,1).in.arrayX.

Is there anyway to sum these arrays without the for loop?

Correction: (oops, multiple errors in for loop above)

PDsum=0;
for i=1:size(PD,2)
PDsum = PDsum + PD(1,i,1).in.arrayX;
end;
This for loop correctly sums the arrayX arrays.

Charles-

I am having trouble understanding what you actually want to do. I used maxx = 3 and maxy = 5, and your array has all zeros. My suggestion is to convert your struct to a cell array and try to use cellfun or accumarray to achieve your results.

–Loren
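
One loop-free possibility, sketched under the assumption that every in.arrayX has the same length: expand one level at a time, collecting the in substructures into their own struct array first. The sizes and the use of ones (so the sum is visibly nonzero) are choices made for this example only.

```matlab
maxx = 3; maxy = 5;
inner = struct('arrayX', ones(1,maxx), 'arrayY', zeros(1,maxy));
PD = repmat(struct('in', inner, 'out', inner), [5 64 4]);

% Step 1: the 64 'in' substructs along dimension 2 become a 1x64 struct array
ins = [PD(1,:,1).in];

% Step 2: stack their arrayX rows and sum down the columns
PDsum = sum(vertcat(ins.arrayX), 1);
```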

@Todd & Loren,

about reply 40/41. I have experienced the same behaviour as Todd. I have a very large structure R (~200 MB, output from a database query, random numbers) with 8 fields and length ~1.3e6.

If I try to extract a single field with

omega = [R.omega]'

it takes 10 minutes. If I do what Todd suggests and loop over the structure it takes 2 seconds or so. But the fastest is to do

omega = [R(1:end).omega]'

which takes 0.6 seconds.

It is of course great that I have gained a 600 times speed improvement but it is not logical why I need to include the indices?

-Allan

Allan-

If the fields have shared data, MATLAB may do some extra work. In a real example, where each field has different data that aren’t linked, I don’t think you should see this disparity in performance. Do you? Can you write a very small code snippet to demonstrate the issue?

–Loren

Dear Loren:

Why is multi-level indexing forbidden when extracting information for structure arrays? For instance:

politicians = [struct( 'name', 'McCain', 'assets', struct('house_num',8,'boatnum',1)),...
    struct( 'name', 'Palin', 'assets', struct('house_num',1,'boatnum',1))];

% we can access fields at the top level
politicians(:).name

% but a scalar index is required for this type of multi-level indexing.
politicians(:).assets.house_num  

Thanks,

Paul

Paul,

I believe I’ve covered this before – there is no guarantee that the next level down in structures is at all uniform. One might contain another struct, another a string, some other ones numeric values with different dimensions. It’s not clear how to concatenate those and we didn’t want to design something that worked sometimes but not always.

–Loren
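
When the nested level is known to be uniform, as in Paul's example, the values can still be collected by expanding one level at a time. This is a sketch of that workaround, with arrayfun shown as an alternative; the names assets, houses, and housesAlt are only for illustration.

```matlab
politicians = [struct('name','McCain','assets',struct('house_num',8,'boatnum',1)), ...
               struct('name','Palin','assets',struct('house_num',1,'boatnum',1))];

% Level 1: the substructs share the same fields, so they concatenate
assets = [politicians.assets];
% Level 2: an ordinary comma-separated list on the new struct array
houses = [assets.house_num];

% Alternative: index each element with a scalar, which multi-level indexing allows
housesAlt = arrayfun(@(p) p.assets.house_num, politicians);
```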

Hi Loren,

I have been using the sort of code given below (taken from response 7) to write values to specific field of the struct array, and it has been working great.

newNums = [19, 2];
newNumsc = num2cell(newNums);
[s1.FavoriteNumber] = newNumsc{:};

I am now trying to convert my functions into C/C++ using MATLAB Codegen. I get an error stating that “Code generation only supports cell operations for varargin and varargout”. Is there a way to do the same without using cell operations? Especially something that is similar (i.e., a vectorized operation/code).

Basawaraj-

I don’t know if this would work, but you could store each string in the row of an array, padding with blanks at the right end if need be. Then loop through the rows to get each string and work with it.

–Loren
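
For the numeric case in the question, the cell-free equivalent is a plain for loop over the struct array. The sketch below shows only that the loop produces the same result in MATLAB; it has not been checked against any particular code generation release, and the setup line that builds s1 from cell arrays is for the example only.

```matlab
s1 = struct('name', {'Loren','Douglas'}, 'FavoriteNumber', {17, 42});
newNums = [19, 2];

% Element-by-element assignment: no cell arrays or comma-separated lists
for k = 1:numel(s1)
    s1(k).FavoriteNumber = newNums(k);
end
```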

These postings are the author's and don't necessarily represent the opinions of MathWorks.