Loren on the Art of MATLAB

May 13th, 2010

Rename a Field in a Structure Array

I'm Matthew Simoneau, a software developer at MathWorks focusing on technical communication and social computing.

My friend Bryan May, an occasional MATLAB programmer, called me with a question the other day. He was working with a structure array and wanted to rename one of the fields. My scan of the documentation came up empty. MATLAB has a setfield and a rmfield, but not a "rename field". This started me thinking about the best way to implement this in MATLAB.

Create a Sample Structure Array

First, let's create a simple structure array.

clear a
a(1).foo = 1;
a(1).bar = 'one';
a(2).foo = 2;
a(2).bar = 'two';
a(3).foo = 3;
a(3).bar = 'three';
disp(a)
1x3 struct array with fields:
    foo
    bar
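
As an aside, an equivalent array can be built in one call to struct, which spreads cell-array arguments across the elements of the result. This is only a sketch; the variable name b is an assumption chosen to avoid clobbering a.

```matlab
% Build a 1x3 struct array in a single call. Each cell-array argument
% supplies one value per element of the resulting array.
b = struct('foo', {1, 2, 3}, 'bar', {'one', 'two', 'three'});

disp(size(b))     % dimensions of the struct array
disp(b(2).bar)    % value of field bar in the second element
```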

Using STRUCT2CELL and CELL2STRUCT

The first technique that came to mind was to use the combination of struct2cell and cell2struct. Here we split the structure into two cell arrays, one containing the field names f and one containing the values v. We find the field in f and rename it, then put the structure back together.

f = fieldnames(a);
v = struct2cell(a);
f{strmatch('bar',f,'exact')} = 'baz';
a = cell2struct(v,f);
disp(a)
1x3 struct array with fields:
    foo
    baz
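
To see what the two cell arrays look like, here is a small sketch on a fresh array. The names s and idx are assumptions for illustration, and strcmp is used to locate the field instead of strmatch.

```matlab
% Sample 1x3 struct array with two fields.
s = struct('foo', {1, 2, 3}, 'bar', {'one', 'two', 'three'});

f = fieldnames(s);     % 2x1 cell array of field names
v = struct2cell(s);    % 2x1x3 cell array: fields along dimension 1,
                       % followed by the dimensions of s

idx = strcmp('bar', f);            % logical mask for the row holding 'bar'
barValues = squeeze(v(idx, 1, :)); % 3x1 cell array of the 'bar' values

disp(size(v))
disp(barValues)
```

Because the values in v never move, renaming an entry of f and calling cell2struct keeps the original field order.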

Using List Expansions and DEAL

Thinking a bit more, I came up with a way to do this a bit more "in place". Comma-separated list expansion is a powerful concept in MATLAB. I knew I could generate such a list with the a(:).baz notation, and that I could use deal to assign its values back into another comma-separated list.

[a(:).qux] = deal(a(:).baz);
a = rmfield(a,'baz');
disp(a)
1x3 struct array with fields:
    foo
    qux
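
For readers new to comma-separated lists, the following sketch shows the expansion on its own. The variable names s, nums, and names are assumptions for illustration.

```matlab
% Sample 1x3 struct array with two fields.
s = struct('foo', {1, 2, 3}, 'bar', {'one', 'two', 'three'});

% s.foo expands to the list s(1).foo, s(2).foo, s(3).foo.
nums  = [s.foo];   % concatenate the list into a numeric row vector
names = {s.bar};   % collect the list into a 1x3 cell array

% On the left-hand side the same list acts as multiple outputs.
% With a single input, deal copies that input to every output.
[s.foo] = deal(0);
```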

No DEAL Required

Scott French pointed out to me that, as of MATLAB 7, the deal was no longer necessary.

[a.quux] = a.qux;
a = rmfield(a,'qux');
disp(a)
1x3 struct array with fields:
    foo
    quux

Generalization

Further, Kenneth Eaton commented that this technique generalizes nicely using dynamic field names, introduced in MATLAB 6.5.

oldField = 'quux';
newField = 'corge';
[a.(newField)] = a.(oldField);
a = rmfield(a,oldField);
disp(a)
1x3 struct array with fields:
    foo
    corge
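
The generalized form lends itself to a small helper function. What follows is a minimal sketch, not an official function: the name renameFieldSketch, its error identifiers, and the decision to refuse overwriting an existing field are all assumptions. It also assumes a non-empty struct array.

```matlab
function s = renameFieldSketch(s, oldField, newField)
% renameFieldSketch  Rename a field in a struct array (hypothetical helper).
%   s = renameFieldSketch(s, oldField, newField) copies oldField into
%   newField using the no-deal technique, then removes oldField.

    if ~isfield(s, oldField)
        error('renameFieldSketch:missingField', ...
              'No field named "%s".', oldField);
    end
    if strcmp(oldField, newField)
        return   % same name: nothing to do
    end
    if isfield(s, newField)
        error('renameFieldSketch:fieldExists', ...
              'Field "%s" already exists.', newField);
    end

    [s.(newField)] = s.(oldField);   % copy values element by element
    s = rmfield(s, oldField);        % drop the old field
end
```

One design point to keep in mind: the renamed field ends up last in the field order, whereas the struct2cell and cell2struct approach keeps the field in its original position.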

Conclusion

My guess is that the no-deal technique used in the last two sections is the most efficient in most circumstances, though I haven't done any profiling. Those two versions are certainly the cleanest code. What do you think? Is there a better way? Let me know here.
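
For anyone who wants to check the efficiency guess, here is a rough timing sketch using tic and toc. The array size n and the variable names are assumptions, a single tic/toc measurement is noisy, and results will vary with MATLAB version and data, so treat it as a starting point only.

```matlab
% Build a larger test array (the size n is an arbitrary assumption).
n = 1e4;
s = struct('foo', num2cell(1:n), 'bar', repmat({'x'}, 1, n));

% Approach 1: struct2cell / cell2struct.
tic
f = fieldnames(s);
v = struct2cell(s);
f{strcmp('bar', f)} = 'baz';
b = cell2struct(v, f);
tCell = toc;

% Approach 2: no-deal assignment followed by rmfield.
tic
c = s;
[c.baz] = c.bar;
c = rmfield(c, 'bar');
tNoDeal = toc;

fprintf('struct2cell/cell2struct: %.4f s\n', tCell);
fprintf('no-deal + rmfield:       %.4f s\n', tNoDeal);
```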


Published with MATLAB® 7.10

12 Responses to “Rename a Field in a Structure Array”

  1. Kent Conover replied on :

    Thanks for explaining so clearly how to rename a field in a structure array!
    I would appreciate it tremendously if you could arrange to have my poorly conceived attempt to do this removed from the MathWorks file exchange:

    http://www.mathworks.com/matlabcentral/fileexchange/9786-renamefield

    Many thanks,
    -Kent

  2. Pete Scotson replied on :

    I find this very interesting. I believe I’ve mastered the :, but manipulating the lists and using square brackets on the lhs in this way is new to me. Is there a web-cast that demonstrates this kind of thing?

    Off the point I know, but I’d also like to be able to address a structure array by its contents, i.e. using one of the fields as an index, e.g. the field “bar” in your example, so that I can retrieve the contents of the field “foo”. This works:

    >> tmp=[a(strcmp('two',{a.bar})).foo]
    tmp =
    2
    but is there a better way? The [] are needed to prevent output argument errors and return an empty matrix instead.

    Regards,
    Pete

  3. Catherine replied on :

    Hi Loren,

    Thank you so much for such a clear and useful explanation. I prefer the last one, the generalized no-DEAL version.

    If possible, would you please explain why there is a [] in the statement [a.(newField)] = a.(oldField), and why we need it on the left-hand side but not on the right-hand side of the statement? Thank you very much and I am looking forward to hearing from you.

    Many thanks,

    Catherine

  4. Loren replied on :

    Pete-

    I don’t know if there’s a video on handling comma-separated lists, but there are posts in this blog on it (e.g., look at other ones using deal, and ones using cell arrays).

    As for your other question, I don’t understand it – but perhaps this will help.

    S.one = 1;
    S.two = 2;
    fn = fieldnames(S)
    tmp = S.(fn{1})
    

    Note the mixture of curly braces and smooth parens… They matter. tmp should now be S.one, i.e. the value 1.

    –loren

  5. Loren replied on :

    Catherine,

    Here’s the code and explanation:

    oldField = 'quux';
    newField = 'corge';
    [a.(newField)] = a.(oldField);
    a = rmfield(a,oldField);
    

    Since a is an array, a.corge produces a comma-separated list. Those values need to get placed into an array of variables. In general this is true of structs:

    S.f produces S(1).f, S(2).f, etc.

    To return multiple outputs on the left side, you need to put the variables inside [] in MATLAB, e.g.,

    [rows, columns] = size(ones(2,3))

    So [S.f] is equivalent to [S(1).f, S(2).f, ..., S(end).f], which means the number of output arguments is the length of S. That is needed because the right side has the same number of elements: an expression such as S.corge expands to a comma-separated list of length(S) values.

    You might look up structs and comma-separated lists in the doc and look at more posts here tagged structures (and some with cell arrays) for more explanation and examples.

    –Loren

  6. Pete Scotson replied on :

    Loren,

    Thanks for your reply. I’ll take a look at the blog and your answer to Catherine’s query is good. The example you show is what I currently do, where the field name itself is the “index”, but this is not an array of structures (it’s a single structure and so will not return a comma-separated list?) and so has to be accessed by cycling through the fieldnames, eg. when searching through or checking the consistency of the individual sub-fields, rather than by a “for i=1:length(a)” construct. I guess it all comes down to how you want to store and access the data, and when to use a structure or cell array, or array of structures etc.

    Regards,
    Pete

  7. Loren replied on :

    Pete-

    For a scalar struct, you might try structfun to cycle through the fields for you.

    –Loren

  8. Catherine replied on :

    Dear Loren,

    Thank you so much for your detailed answer to my question and your suggestions for further reading.

    Many thanks,

    Catherine

  9. Aslak Grinsted replied on :

    Hi Loren

    I recently had this problem and was not really satisfied with the options that MATLAB gave me. In my case the structs are huge and I want to avoid having any copies at all costs. In the end I opted for the No-Deal approach. However, instead of using rmfield(a,'baz') I simply did a.baz=[]. I guess I could also have solved it by holding the data in a custom class rather than a struct. However, that was not easily feasible due to compatibility issues.

    It would be great to have more tools available to modify structs. Like a rename field function, and a rmfield which does not copy the whole struct.

  10. Duane Hanselman replied on :

    I concur with the suggestion to have struct functions such as rnfield, rmfield, etc. that do not involve memory reallocation. The solutions posted above all require memory reallocation. The struct2cell and cell2struct approach above involves less memory reallocation, since rmfield uses these two functions internally. rmfield is not an ‘in place’ operation, unless it is made into a built-in function.

  11. Jan Simon replied on :

    The “Using STRUCT2CELL and CELL2STRUCT” approach is faster with STRCMP instead of the lame STRMATCH:

      f{strmatch('bar',f,'exact')} = 'baz';
    ==>
      f{strcmp('bar',f)} = 'baz';

    I’ve tried hard, but without success, to create a Mex version of RMFIELD or RenameField using shared data copies. Although this technique is not documented, it promises a great speed-up compared to RMFIELD. It is described, e.g., here: http://www.mk.tu-berlin.de/Members/Benjamin/mex_sharedArrays
    The Mex I’ve created can rename the field very fast, but unfortunately this affects the input struct also…

    In Matlab 2009a, there are some possible improvements to accelerate RMFIELD (I think this is fixed in 2010a already?):

      idxremove = [];
      for i = 1:length(field)
        ...
        idxremove = [idxremove;j]
      end
    

    ==> Better pre-allocate and logical indexing instead of FIND.

      j = find(strcmp(field{i}, f) == true)
      % ==> no need for "== true"

    Instead of creating idxremove *and* idxkeep, it would be more efficient to treat the list of field names “f” exactly as the list of data “c”:

      t = cell2struct(reshape(c(idxkeep, :), newsizeofarray), f(idxkeep));
    

    It would be really helpful, if the shared data copy functions could be documented. Wasting time with deep copies in Mex functions means a loss of energy equivalent to pouring pure oil in the poor sea.
    Thanks for the interesting discussion, Jan

  12. Jan Simon replied on :

    I’ve programmed a C-Mex, which creates shared data copies of all fields and replaces the field name on the fly:
    http://www.mathworks.com/matlabcentral/fileexchange/28516
    For a single field, it is twice as fast as the STRUCT2CELL approach, while it is 20 times faster for a struct with 1000 fields.
    If the copy of the matching fields is omitted, the resulting function is equivalent to RMFIELD, but 5 to 10 times faster:
    http://www.mathworks.com/matlabcentral/fileexchange/28517

    Perhaps this helps Bryan May. Kind regards, Jan


Loren Shure works on design of the MATLAB language at MathWorks. She writes here about once a week on MATLAB programming and related topics.

These postings are the author's and don't necessarily represent the opinions of The MathWorks.