# Memory Management for Functions and Variables

Posted by Loren Shure

People have different ideas about what costs a lot, in terms of memory, in MATLAB. And sometimes people don't know the details. Today I am going to talk about when MATLAB makes copies of data, both when calling functions and for data stored in variables. MATLAB makes copies of arrays it passes only when the data referred to changes (this is called copy on write or lazy copying).

### Passing Arrays to Functions

The question is: when does MATLAB copy memory when passing arrays to functions? Some users think that because MATLAB behaves as if data are passed by value (as opposed to by reference), MATLAB always makes copies of the inputs when calling a function. This is not necessarily true. Take a look at this function.

type fred1
function y = fred1(x,a,b)
a(1) = a(1) + 12;
y = a * x + b;


In fred1, the first and third inputs, x and b, are not altered inside. MATLAB recognizes this and passes both these variables in without making any extra copies. This can be a big memory savings, for example, if x is a large dataset. However, in fred1, we can see that the second input, a, gets modified inside. MATLAB recognizes when this is happening to a variable and makes a copy of it to work with so that the original variable in the calling workspace is not modified.
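A minimal sketch of what this looks like from the caller's side. The local function `fred1Sketch` is a stand-in with the same body as fred1 (its name and the test values are assumptions for illustration), and local functions in scripts need R2016b or later. The sketch only demonstrates the value semantics; whether MATLAB actually avoids copying x and b cannot be observed from this code.

```matlab
% Caller's workspace: three inputs, only 'a' is modified inside the function.
x = rand(200);           % stand-in for a large, read-only input
a = 2;
b = 5;

y = fred1Sketch(x, a, b);

% The caller's 'a' is untouched even though the function changed its own 'a'.
assert(a == 2);
% The function computed with a(1) + 12, i.e. 14.
assert(isequal(y, 14 * x + 5));

function y = fred1Sketch(x, a, b)
% Same body as fred1: modifies its local 'a', only reads x and b.
a(1) = a(1) + 12;
y = a * x + b;
end
```

Because the change to a stays inside the function, the caller never has to defend its variables against a callee.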

### Structures and Memory

Each structure member is treated as a separate array in MATLAB. This means that if you modify one member of a structure, the other members, which are unchanged, are not copied. It's time for an illustration here.

### Create some rgb image data with 3 planes: red, green, and blue.

im1.r = rand(300,300);
im1.g = rand(300,300);
im1.b = rand(300,300);

Instead, rearrange the same data so that we have an array of structs each element containing an [r g b]-triplet.

im2(300,300).rgb = [0 0 0];  % preallocate the array
for r = 1:300
    for c = 1:300
        im2(r,c).rgb = [im1.r(r,c) im1.g(r,c) im1.b(r,c)];
    end
end

Let’s compare im1 and im2.

clear c r  % tidy up the workspace
whos
  Name       Size                    Bytes  Class

im1        1x1                   2160372  struct array
im2      300x300                 7560064  struct array
s          1x1                       392  struct array
sNew       1x1                       392  struct array

Grand total is 630043 elements using 9721220 bytes



im1 is a scalar structure with members that hold m x n arrays.

• im1.r = imageRedPlane --- size m x n
• im1.g = imageGreenPlane --- size m x n
• im1.b = imageBluePlane --- size m x n

im1 is size 1 x 1; total # of arrays inside im1: 3

im2 is an m x n structure array with fields containing 3-element vectors.

• im2(i,j).rgb = imageOneRGBPixel --- size 1 x 3

im2 is size m x n; total # of arrays inside im2: m x n
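To see how the number of arrays affects storage, here is a small sketch that builds both layouts at a reduced size and reads the byte counts back from whos. The sizes m and n are arbitrary choices for illustration, and the exact byte counts depend on the MATLAB version and platform, so none are quoted here.

```matlab
clear im1 im2            % start from a clean slate for the preallocation
m = 20; n = 30;          % small sizes so the loop runs quickly

% Scalar struct: three m-by-n arrays.
im1.r = rand(m, n);
im1.g = rand(m, n);
im1.b = rand(m, n);

% Struct array: one 1-by-3 array per pixel, m*n arrays in total.
im2(m, n).rgb = [0 0 0];             % preallocate the struct array
for r = 1:m
    for c = 1:n
        im2(r, c).rgb = [im1.r(r, c) im1.g(r, c) im1.b(r, c)];
    end
end

% Called with an output, whos returns a struct that has a 'bytes' field.
info1 = whos('im1');
info2 = whos('im2');
fprintf('im1: %d bytes, im2: %d bytes\n', info1.bytes, info2.bytes);
```

Both variables hold the same 3*m*n numbers; any difference in the reported bytes comes from how those numbers are split into arrays.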

Notes: Every MATLAB array allocates a header with information. This makes im1 more memory-efficient than im2 (more generally, scalar structs containing arrays are more memory-efficient than a struct array). When one field in a structure is changed, and possibly copied, the other fields are left intact.

s.A = rand(3);
s.B = magic(3);
sNew = s;
sNew.A(3) = 14;

Since s and sNew have unaltered copies of B, the B fields share memory, but the A fields do not. See the documentation section titled Using Memory Efficiently for more information.

### What's Your Mental Model for MATLAB Memory Management?

Say that three times fast!

Does the description here and/or in the documentation change your model?

Let me know.

Published with MATLAB® 7.2

Ben replied on : 1 of 68

My mental model is ambiguous, because it’s all folklore from other MATLAB users that has accumulated across many MATLAB releases. I’ve seen someone say on the newsgroup that MATLAB does “copy on write” which is what this post seems to show (and I am glad to hear), but that individual’s claim didn’t come with a @mathworks.com address in the From: field, so I classified it as folklore at best, despite how often he seems to post.

It’s good to get the facts from the source at least; to see how Matlab memory management behaves in the wild. It would be nice to see any other optimizations or behaviors as in the case of function fred1.

Markus replied on : 2 of 68

Hi!

I often have the situation that a variable is passed to a function, changed inside that function and then used in the calling function instead of the old version, which is no longer used. As an example, I could call function myfun like this:

x = myfun(x); % overwrite x with the new value

Function myfun has the following form with x as in- and output argument:

function x = myfun(x)
% modify x here, for example
x = x.^2;

Does Matlab recognize that in this case there is no need for copying the input array x, even if it is changed inside the function? x could be a large array or a structure with many fields, so this could save quite some effort.

When implementing myfun as a mex-file, it is possible to change variable values without copying the data. However, I have always been following the advice somewhere in the Matlab documentation *not* to do this because this could lead to “unexpected results”. Is this advice still of relevance? I guess there would hardly be unexpected results when changing an element of a matrix, but what if expanding a matrix or adding a field to a structure?

Loren replied on : 3 of 68

Markus and Ben-

To answer the last question first, yes, we still recommend that you not overwrite input data in MEX-files. That’s because, since MATLAB semantics is value-based, other users (or you sometime later perhaps) are not generally expecting calling function workspace variables used as inputs to change. And if an error occurs somewhere within the chain of calculations, you may end up with some “corrupted” variables that would not occur otherwise.

For the first question, in cases like this,

  function x = myfun(x)
% modify x here, for example
x = x.^2;


MATLAB R2006a and earlier does not take advantage of reusing the memory for the input variable x, nor equivalent behavior for a field of a structure. We are constantly looking for and making performance improvements in MATLAB. The one you mention is on our list.

Allen Myers replied on : 4 of 68

Great writeup!

I haven’t been clear about structures not necessarily needing contiguous memory space. The documentation states that. Does that mean each field does not need to be in the same contiguous space as the rest of the structure?

For functions like this, where x is very large, I have been using nested functions. Actually I tend to have several large variables and this works out well for me. I’d prefer matlab be smart about the whole thing since nested functions can complicate things a little.

function x = myfun(x)
% modify x here, for example
x = x.^2;

Loren replied on : 5 of 68

Allen-

Each field of a struct is its own MATLAB array and each one needs its own contiguous memory, but the union of the struct fields/elements does not need to be contiguous.

–l

Hartmut Seel replied on : 6 of 68

Hi,
In which future MATLAB release will the improved variable handling be implemented?
( function x = myfun(x)
% modify x here, for example
x = x.^2;) -> don't make a local copy of x;

In this context I also miss the possibility to hand a variable over by reference, so that no local copy of it is made.
In C++ this is done through a ‘&’ before the variable.
function x = myfun(&x)

Thanks a lot for answering and
Best regards
Hartmut

Loren replied on : 7 of 68

Hartmut-

Thanks for your interest. In general, we don’t pre-announce features for given releases. Sorry.

We also have the request for references. That is on our idea list for the future.

–Loren

Lars Barring replied on : 8 of 68

Hi!
As was stated previously, a most useful writeup.
Just a question of clarification:
Arrays are not copied until changed when sent as arguments to functions. Is this also true if you create a (huge) array in a function and return it as output to the caller?

/Lars

Loren replied on : 9 of 68

Hi Lars-

Yes, this is true. If you do something inside a function to create an output, an extra copy is not made to assign it to the output variable in the caller. The exception is if there’s an unusual output like an indexed variable:

A(1:end,10) = fcnMakingLargeOutput();


In this case, the large array is copied into the memory in the variable A. Otherwise, there is no extra copy made.

Daniel replied on : 10 of 68

COW is great for the common case: simple semantics, good performance, transparently. How do I deal with the less common case, where I actually want to know what’s going on (what copies are being made)?

I have about 200MB of data. While processing it, I suspect lots of memory is being allocated, because Matlab seems to take >600MB of system memory. This might be unavoidable, but I want to know what lines in my program allocate the bulk of the memory, so I can consider changing the implementation.

Is a memory profiler on the to-do list? In addition to being a useful tool, it would also probably improve all of our models of the memory management system… In the meantime, any practical advice would be welcome.

Luca Citi IMT replied on : 11 of 68

Dear Loren,
I am writing to you concerning the problem
x = myfun(x) .
I understand your recommendation that mex files should not overwrite input data because other users (or the same user sometime later) are not generally expecting inputs to change. Anyway, I implemented a mex file called “inplace” (named so as to warn me about its behaviour) that does it. The last argument tells the function what to do and the first ones are the actual arguments of the function.
To date I have implemented a few functions I need, i.e. some C-like operators and circshift:
inplace(x, y, '+=');
inplace(x, y, '.*=');
inplace(x, ns, 'circshift');

They do not allocate memory and, compared to the matlab equivalent
x = x + y;
x = x .* y;
are (when working with arrays above 300-400 MB) much faster (tenths of a second instead of a few minutes) because the pc does not start swapping to the disk.
For the C-like operators, a matlab-way workaround to limit allocation could be to split the operation into suboperations like:
for i = n:n:N
    ii = (i-n+1):i;
    x(ii) = x(ii) + y(ii);
end;
x((i+1):end) = x((i+1):end) + y((i+1):end);
but it is not as fast and not very elegant.
For this reason, and to perform an in-place circshift (much harder to split into suboperations), I decided to go for the mex file.
But a problem arises because of the copy-on-write policy (a really smart and efficient solution in most situations):
w = …something…;
x = w;
inplace(x, y, '+=');
The unexpected result is that w changes together with x after inplace is called.
Obviously the workaround is to change x before passing it to inplace:
w = …something…;
x = w;
x(1) = -x(1);
x(1) = -x(1);
inplace(x, y, '+=');
and it works. But what if sometime later I forget about this trick? Searching the web for a solution I found someone using two apparently undocumented functions: mxIsSharedArray and mxUnshareArray. Then performing
if(mxIsSharedArray(prhs[0]))
    mxUnshareArray(prhs[0]);
before using prhs[0] solves the problem. A more conservative approach could be:
if(mxIsSharedArray(prhs[0]))
    mexErrMsgTxt("The array is shared. Cannot proceed.");
Now my questions…
Why is such a clean and fast solution as “x += y” not supported in matlab?
Concerning my solution, can you suggest better ones that avoid breaking the rule not to alter inputs?
If not, is what I am doing safe?
Concerning the two functions, is there a risk that they will be dropped in a future implementation? Why are they undocumented? Is the use of mxUnshareArray safe or should I switch to the more conservative approach?
Regards,
Luca

Stuart replied on : 12 of 68

Simple arithmetic operations like the ones you mention:
x = x + y;
x = x .* y;

…are currently performed in-place in M-code, but not at the command line (watch the task manager with a big array). This is done by the JIT/Accelerator, which does not operate at the command line. As Loren said, we are looking at making this work for M-functions, built-in (C) functions and perhaps Mex files.

Loren replied on : 13 of 68

Sorry I’ve been off-line for a while. Here are some very quick responses.

Daniel- A memory profiler is on the futures list.
Luca- I am not a mex expert and am hoping someone who is can make a helpful comment.

As for +=, there are a lot of things to consider. One is that the left-hand side might have duplicate indices: x([3 4 3]) += 17. Do we add 17 to the 3rd entry, or 34? We can legitimately choose either and explain it, but we want it to be clean and useful. Plus, it will be much cleaner for us to implement after we do some heavy internal refactoring/cleaning up in that part of the code base. Finally, we need to consider how users of the class system can overload it. So it’s not just a no-brainer for us.

Luca Citi IMT replied on : 14 of 68

Thank you both for the answers.

Stuart…
I only tried from the command line and noticed the big increase in memory usage. But good to know that with M-files it is performed in a smarter way.

Loren…
I agree with you concerning the x([3 4 3]) += 17 case. To avoid confusion, I would make it behave like x([3 4 3]) = x([3 4 3]) + 17, i.e. add 17 only once. Actually I think the assignment is performed twice but the last one overrides the first one, because in
a([1 3 1 1]) = a([1 3 1 1]) + [1 2 3 4]
a(1) is increased by 4
and
tic; a(ones(1000000,1)) = a(ones(1000000,1)) + 1; toc
takes much more time than a(1) = a(1) + 1.
Anyway, if, as Stuart said, the operation is optimized in M-files, += (and the others) are not a priority.
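A small sketch of the two candidate meanings discussed here, using made-up values. The first part shows ordinary indexed assignment with a repeated index, where the last value written is the one that remains; the second uses accumarray to add once per occurrence, which is the other reading a hypothetical += could take.

```matlab
idx = [1 3 1 1];
inc = [1 2 3 4];

% Reading 1: plain indexed assignment. The right-hand side is
% [11 32 12 14]; for the repeated index 1, the last value is what stays.
a = [10 20 30];
a(idx) = a(idx) + inc;

% Reading 2: accumulate every occurrence. accumarray sums the increments
% that share an index: element 1 gets 1+3+4, element 3 gets 2.
a2 = [10 20 30];
a2 = a2 + accumarray(idx(:), inc(:), [3 1]).';

disp(a)
disp(a2)
```

The two results differ only at the repeated index, which is exactly the ambiguity a language-level += would have to settle.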

Regards,
Luca

Paul Marks replied on : 15 of 68

Here’s a technique I came up with the other day to prevent copy-on-write occurring in some cases:

function dosomething(x, deleteme)
assignin('caller', deleteme, []);
x(1) = 123;

end

dosomething(big_matrix, 'big_matrix');

As long as big_matrix was the only variable using that chunk of data, then copy-on-write does not occur when changing x inside the dosomething function.

Andras Ferencz replied on : 16 of 68

A question about partial arrays: in the following lines, will b be copied or will it contain just a pointer into A?

A = rand(big_number,5);
b = A(:,3); % is there a copy performed here?

I presume b=A(3,:) must be copied as this is against the grain, but b=A(:,3) is a contiguous chunk.

Andras

Loren replied on : 17 of 68

Hi Andras-

Only complete arrays are shared. So b, in your example, has a copy of the 3rd column of A. b is a contiguous chunk but wouldn’t be if you selected all the columns in a specific row, e.g., A(3,:).
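The contiguity point follows from MATLAB storing arrays column by column. A minimal sketch with a small example matrix (the sizes are arbitrary) that lists the linear, memory-order positions of one column and one row:

```matlab
A = reshape(1:20, 4, 5);             % 4-by-5 example matrix
[nRows, nCols] = size(A);

% Linear (memory-order) indices of column 3 and of row 3.
colIdx = sub2ind(size(A), 1:nRows, repmat(3, 1, nRows));
rowIdx = sub2ind(size(A), repmat(3, 1, nCols), 1:nCols);

disp(colIdx)    % consecutive positions
disp(rowIdx)    % positions spaced nRows apart
```

The elements of a column sit next to each other in memory, while the elements of a row are nRows positions apart.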

–Loren

Andras Ferencz replied on : 18 of 68

Thanks Loren. That is too bad. Does the JIT/Accelerator eliminate this copy when I use A(3,:) in an assignment:
c = c + A(3,:);
or
c=c+A(:,3); % contiguous case

thus making
b=A(3,:);
c=c+b;
less efficient than the above?

Thanks again,
Andras

Loren replied on : 19 of 68

Andras-

Currently, the indexing case that’s not contiguous does cause an allocation in the JIT/Accelerator; the contiguous case does not.

–Loren

Georg replied on : 20 of 68

Hi,
maybe this is the place to ask: I frequently run into ‘out of memory messages’ when batch processing large numbers of medical files (images of ~40MB). I pack, I clear, I run calculations only on the necessary parts of the images. By this I can move the ‘out of memory…’ message back from say the 4th image to the 10th. MATLAB memory usage grows continuously, no matter what is visible in the workspace.

Is there any way to handle this? Help would be very much appreciated! I have looked around, but haven’t found a solution.

Thanks!
Georg

Loren replied on : 21 of 68

Georg-

I recommend you take a look at this collection of slides and examples that Stuart put together on the File Exchange.

–Loren

Dadi replied on : 22 of 68

Hi,
After looking at various memory management info I still have the following question. Hope you can help.

I need to know how the preallocation works when I don’t know what the final size of my matrix will be. I will know the maximum size, but I don’t want my final matrix to be of this maximum size. And I dont really want to operate on the matrix with that maximum size. I have no problem preallocating a matrix, but suppose I really want to grow it with a statement such as A = [A;B].

Example: If I preallocate A = ones(1,1000), does that mean that A has that memory regardless of changes made to the variable, for example, if I now set A = B (where B is 1×10). Has A now lost its 1000×1000 space? Suppose I later set A = [A;B] making A 1×20, is it now growing in non-continuous memory or does it still have access to the 1×1000 area created when A was first created? Another way to phrase my question: does A only lose its 1×1000 preallocated memory when I use the clear function?

Thanks

Loren replied on : 23 of 68

If you preallocate A to 1000×1000 and then say B = 1:10, A = B, A will have only space for 10 elements. If you want to keep it with 1000×1000, you must assign by indexing, e.g.,

A(1:length(B)) = B;

A = [A B]

will find contiguous space large enough for all the data from A to be together and then delete the space for the 2 smaller pieces. The original space is long gone. Again, to preserve it, you must assign using indexing on the left-hand side.
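A minimal sketch of the pattern for the case where only the maximum size is known: preallocate once, fill by indexing while tracking how much is used, and trim at the end. The variable names, batch sizes, and stand-in data are assumptions for illustration.

```matlab
maxLen = 1000;                 % known upper bound on the final size
A = zeros(1, maxLen);          % preallocate once
count = 0;                     % number of elements filled so far

for k = 1:3                    % three batches arriving one at a time
    B = (1:10) * k;            % stand-in for newly produced data
    A(count + (1:numel(B))) = B;   % indexed assignment keeps A's size
    count = count + numel(B);
end

A = A(1:count);                % keep only the filled part
```

The final indexing step produces the smaller array, so the cost of resizing is paid once rather than on every append.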

–Loren

per isakson replied on : 24 of 68

Loren

I’m looking for a way to decide which variables to clear to free memory. That is, in code in functions where the same data may be referred to by variables in different scopes. The problem is that little memory is freed if I clear a variable whose data is referenced by another variable.

/ per

Loren replied on : 25 of 68

Per-

There isn’t a good way. If variables are linked, then deleting one of them will not free the memory associated with the data, as you’ve noted. You’d have to know somehow yourself if you created certain arrays and they could be cleared.

Would you want this for debugging? If so, you might explore format debug. It shows the pointer to the data and if 2 arrays point to the same data, they will show that. Programmatically there is not a good way however.

–Loren

per isakson replied on : 26 of 68

Loren-

Thanks Loren. No, my problem is not debugging. I try to make a cache.

I’m developing an interactive tool for looking at measured data. The total amount of data is orders of magnitude larger than would fit in RAM. I want to give the user the feeling that all data are available all the time. To that end, I store the measured data in special binary files, use memmapfile, and have a cache.

This works surprisingly well until it’s time to clear data from the cache. The measured data is never changed, but in my code it is referred to from closures (function handles of nested functions), which I have problems keeping track of. Now I clear the oldest data in the cache, which causes major problems. The following happens: 1) time series A is in use, 2) I delete A from the cache, 3) user asks for A for a new plot, 4) a new copy of A is created in the cache, 5) et cetera. Workaround: close the tool and start with fresh memory.

I guess there are books on caches, none of which I have opened.

/ per

anton sirota replied on : 27 of 68

Loren,
I am trying to improve the performance of loading large (2-3 GB) binary multiplexed (int16) data from a file. The problem is that I need to load only some channels, making non-contiguous reads. I used fread before with various buffer sizes, keeping only a subset of rows. Now I tried the same data load with memmapfile, thinking it would give better performance, and it doesn’t. I gain due to caching on a second read from the same file, but on the first run from uncached files I get about the same speed. At the end of the day, doesn’t it depend on the page size MATLAB internally uses for memory mapping versus my fread buffer size? Is there a way to get the optimal performance with either solution? E.g. what page sizes do fread and memmapfile use? Also, I am running 64-bit Linux and 64-bit MATLAB 7.1 SP3, and the max size memmapfile allows me to map is 2^31-1. Why is that?

Loren replied on : 28 of 68

Anton-

MATLAB doesn’t allow the user to control the buffer size now so unfortunately there is nothing specific I can tell you. As for the 64-bit MATLAB, even there for now, the maximum data size is controlled by a 32-bit pointer.

–Loren

Pranas replied on : 29 of 68

I am thinking about storing data in compressed form in memory and feeding a cache with decompressed data. It would solve a number of problems, but I am missing standard cache support in MATLAB. Any ideas? Existing custom implementations?

William replied on : 30 of 68

I am trying to read large binary data files (over 2GB).

At present, I am using memory-mapped file IO via a C++ MEX file.

Does the creation of the memory-map view via the win32 “MapViewOfFile” function violate the “allocate memory with mxCalloc & mxMalloc only” rule?

Loren replied on : 31 of 68

William-

The short answer is “yes,” the use you suggest does violate the rule you mention. You are really best off sticking exactly with our documented APIs and methods for now.

–Loren

Ljubomir Josifovski replied on : 32 of 68

Does COW maybe work for local variables (not only for functions args/automatic vars)? Say after v=zeros(100);a=v;b=v;c=v; will all 4 variables share the same matrix? If they do, what happens after v(1,1)=1? Do they all get their own copy, or do maybe a, b, and c still share the same copy?
Thanks,
Ljubomir

Loren replied on : 33 of 68

Ljubomir-

Copy on write works for variables in general. v,a,b, and c will all point to the same memory at first. Once v has an element changed, a,b, and c will all share memory and v will have its own.
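A minimal sketch of what this means in terms of values, using the variables from the question. Only the value semantics can be checked from code like this; the sharing itself is not visible without a tool such as the format debug display mentioned earlier in the comments.

```matlab
v = zeros(100);
a = v; b = v; c = v;      % four names, no data changed yet

v(1,1) = 1;               % first write to v

% v sees its own change; a, b and c still hold the original values.
assert(v(1,1) == 1);
assert(a(1,1) == 0 && b(1,1) == 0 && c(1,1) == 0);
```

From the programmer's point of view each variable behaves as an independent copy; copy on write only changes when the copying work is done.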

–Loren

Ljubomir Josifovski replied on : 34 of 68

Excellent, thanks Loren.
Ljubomir

Ged Ridgway replied on : 35 of 68

Hi Loren,

Sorry if this should be obvious from the above, but is it better to use a subset of a large array within a function than to pass only that subset. E.g. is the first option below any better than the second, in terms of memory? If so, is it better only in the contiguous case? Many thanks,
Ged

%% option 1, “pass” (copy-on-write) then subref
x = randn(1e4);
process(x, 1:10)
function [y z] = process(x, sub)
y = mean(x(:, sub)); % contiguous chunk
z = mean(x(sub, :)); % non-contiguous

%% option 2, pass in subset
x = randn(1e4);
process(x(:, 1:10), x(1:10, :))
function [y z] = process(x1, x2)
y = mean(x1);
z = mean(x2);

Loren replied on : 36 of 68

Ged-

There is not a simple answer. Depends if you are running into memory issues or not. And how large a subset you typically work on. And your computer’s memory and cache. In either case for you, you are creating that new matrix either explicitly or as a temporary variable when you process so I don’t think I see any major difference memory-wise for you. I’d worry about maintainability and readability as well.

–Loren

Jonathan replied on : 37 of 68

I’ve been battling the same problem Markus had (x = myfun(x)), and was inspired by Paul’s deleteme idea. Here’s my solution to our problem:

function myfun(x, deleteme)
assignin('caller', deleteme, []);
x = x.^2;
assignin('caller', deleteme, x);
end

Call it with:

myfun(x, 'x');

As Paul noted, as long as x is the only variable using that memory, this trick works properly. Otherwise, when the caller’s variable is cleared, Matlab makes a copy for the callee, causing the function to bog down again. Loren, do you know if this is going to get me in trouble somehow, or does this look to you like a good way to achieve pseudo-pass-by-reference?

-Jonathan

Loren replied on : 38 of 68

Jonathan and Markus-

Have you read this blog entry? It explains the in-place paradigm in MATLAB. You might want to be sure you are putting that to the best use before directly manipulating workspaces from the caller.

–Loren

Etienne Non replied on : 39 of 68

I’m looking for a way to free memory in a while loop. At the beginning I was not able to run the first iteration of the loop; there was always an “Out of Memory” error. I thought I would be fine if I could run just the first iteration of the loop, planning to clear all variables in the loop before starting the second iteration. After reducing the size of the arrays and matrices, I got through the first iteration. I then used clear var1 var2 …, which is fine on my laptop (Windows) but not on the lab computer (Unix). It is as if clear doesn’t free memory after clearing the variables on a Unix system. Does anyone know how to free the memory allocated for cleared variables on Unix?

Loren replied on : 40 of 68

Etienne-

The variables are very likely cleared but may not leave enough contiguous space in memory for new arrays to get created the next time through. You might want to look at technical note 1106 for help with memory issues or contact technical support for issues specific to your code.

–Loren

Jo replied on : 41 of 68

Hi,
I have saved a large variable
>>save('myfile','data');
and later want to load it in another program.
That’s fine if I keep it called ‘data’,
but the second program doesn’t know its name, so I have
>>varstruct=load('myfile');
>>f=fieldnames(varstruct);
>>var=getfield(varstruct, f{1});

Unfortunately, this creates the variable varstruct as well as var, more than doubling the memory requirement. How can I say “load whatever you find in ‘myfile’ and put it straight into ‘var'”?

Thanks, Jo

Loren replied on : 42 of 68

Jo-

You can’t put it directly in var. But MATLAB is smart about the memory and doesn’t make a copy until it needs to. Your best bet is to either work with it as a struct or do something like this:

varstruct = load('myfile');
...
var1 = ...
clear varstruct


That way, as soon as you tinker with var1 (I renamed it since var is a function name in MATLAB), it won’t make a separate memory copy from varstruct since we just cleared varstruct.

–Loren

Steve L replied on : 43 of 68

Jo,

Another option is to use the WHOS function with the -FILE flag to determine which variables you want to load, then load those specific variables from your MAT-file.

% using the durer.mat file from toolbox/matlab/demos
% for purposes of this example.  You can substitute
% whatever MAT-file name you want.
matfile = fullfile(matlabroot, 'toolbox', 'matlab', ...
'demos', 'durer.mat');
variablenames = whos('-file', matfile);
name1 = variablenames(1).name;


The variablenames struct will contain more information about your variables, so you can determine (based on the size field, perhaps) which variables you want to load. Then you can refer to the variable using a dynamic field name:

mydata = var1.(name1) + 1;
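Putting the pieces together, here is a self-contained sketch of that approach. It writes a throwaway MAT-file first so it can run anywhere; the temporary file, the variable `data`, and the name `loaded` are assumptions for illustration and stand in for your own file and variables.

```matlab
% Create a throwaway MAT-file so the sketch is self-contained.
matfile = [tempname '.mat'];
data = magic(4);
save(matfile, 'data');
clear data

% Inspect the file without loading it, then load only the chosen variable.
variablenames = whos('-file', matfile);
name1 = variablenames(1).name;        % name of the first stored variable
loaded = load(matfile, name1);        % struct with a single field
mydata = loaded.(name1) + 1;          % dynamic field name

delete(matfile);                      % clean up the temporary file
```

With several variables in the file, the size and bytes fields of variablenames can guide which name to pass to load.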

A.S. replied on : 44 of 68

You know what would be great? If I could call pack from a function and not just from the command line. That would be absolutely superb. That way I could check to see if I am having a memory issue and “pack.”

Loren replied on : 45 of 68

A.S.-

If you are using Windows, I recommend the memory function to help you see how memory is being used. You might get some insight there.

–Loren

Matt replied on : 46 of 68

Good blog post. I find a lot of people are not fully on top of MATLAB’s copy-on-write behavior and the MATLAB documentation on this is pretty diffuse.

It would be nice to see the discussion expanded to include handle and value classes, in addition to ordinary structures. Handle classes, in particular, break many of the rules mentioned above. The MATLAB OOP manual discusses it, but somewhat obliquely, IMHO.

Loren replied on : 47 of 68

Matt-

Most of this post applies immediately to objects, both value and handle. Objects that are handles, of course, see changes if they refer to the same object where a change was made (not copy on write in this case). Note, however, that values inside handle properties may still be sharing data with other variables and do copy-on-write to preserve the semantic copies of those values, for example if a property came from a value or regular MATLAB array and then the regular array changed.

The examples as listed above are valid for value objects. For handle objects, the code would change to something like this:

s.A = rand(3);
s.B = magic(3);
sNew = s;
aNew = s.A;
sNew.A(3) = 14;




Now sNew.A and s.A have the same value.
sNew.A and s.A now have a different value from aNew. Before the assignment on the last line, sNew.A and s.A pointed to the same array header and same array with no semantic copy, but that array shared data with aNew. After the assignment, s.A and sNew.A still have the same array header and array, but it now has its own copy of the data distinct from the data in aNew.

–Loren

Denzel replied on : 48 of 68

Thank you for the posts. My question is, does matrix transpose cause a new copy? For example:

A= some large matrix;
myFunction(A');


Does myFunction keep a separate copy of A’, or does it use A directly, assuming myFunction does not modify the transposed matrix?

Loren replied on : 49 of 68

Denzel-

Yes, matrix transpose does make a copy as it needs to rearrange the values in memory.

–Loren

Eric replied on : 50 of 68

But I believe that vector transpose does not require any memory copy, correct?

Loren replied on : 51 of 68

Eric-

That is correct. For vector transpose, no data needs to be rearranged, and therefore MATLAB recognizes that it doesn’t need to make a copy.
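A small sketch of why the two cases differ, comparing the storage order of the elements before and after transposing. The example values are arbitrary, and the code checks only element order, not whether MATLAB actually copies.

```matlab
v = 1:5;                 % row vector
A = reshape(1:6, 2, 3);  % 2-by-3 matrix

vt = v.';                % 5-by-1
At = A.';                % 3-by-2

% (:) lists elements in storage (column-major) order.
sameOrderVector = isequal(v(:), vt(:));   % order is unchanged for a vector
sameOrderMatrix = isequal(A(:), At(:));   % order changes for this matrix

disp([sameOrderVector sameOrderMatrix])
```

For the vector only the recorded dimensions differ, whereas the matrix elements have to be laid out in a new order.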

–Loren

Matt replied on : 52 of 68

This is a helpful blog, but I think a good way to disseminate the material in it would be to amend the MATLAB documentation.

Apparently, a decision was made in the early writings of various MATLAB docs to shield end-users from the complexities of “copy on write” ideas and just explain MATLAB memory management as “passing by value”.

One sees this, for example, in the Object Oriented Programming doc where the distinction between “handle objects” and “value objects” is described. Value objects are described, essentially, as things which always have their data copied. But this is not true, since copy-on-write behavior applies to them. Yet the doc doesn’t explain this subtlety.

While it’s understandable to want to shield users from complex subtleties, I think it’s done more harm than good, because copy-on-write is an important thing to know how to exploit in MATLAB programming (as the existence of this blog clearly shows).

Loren replied on : 53 of 68

Matt-

Please put this in as an enhancement request. Coming from users, the request carries more weight than me alone. Thanks. (link on the right to support…)

–Loren

Doug replied on : 54 of 68

Please clarify the issue of contiguous memory and structs for sub-structs. Does the comment above apply recursively to structs that are elements of structs?

From the above, it appears that A.x(1:10), A.y(1:20) and A.z can be in three separate chunks of memory.

Now suppose I have A.B(1:5).a(1:10) and A.B(1:5).b(1:26). Does A.B(1:5) need to be a contiguous block of memory, like A.x and A.y?

Loren replied on : 55 of 68

Doug-

Each field in a struct array is its own MATLAB array and those arrays are the ones that require contiguous memory. So A.B doesn’t need to be contiguous, just its constituents.

–Loren

Jesse replied on : 56 of 68

Question about lazy copies and plots: We have an application in which we want to plot multiple data sets, each very large. Most plots are done vs time, where the time-vector is always the same. However, using format debug, I notice that each plot has a different copy of the time-vector. The following few lines will illustrate this:

>> t = 1:10

t =

1     2     3     4     5     6     7     8     9    10

>> y = sin(t);
>> z = cos(t);
>> h1 = plot(t,y);
>> figure;
>> h2 = plot(t,z);
>> format debug
>> get(h1,'xdata')

ans =

m = 1
n = 10
pr = 2a2b3d60
pi = 0
1     2     3     4     5     6     7     8     9    10

>> get(h2,'xdata')

ans =

m = 1
n = 10
pr = 2a8e3900
pi = 0
1     2     3     4     5     6     7     8     9    10

>> t

t =

m = 1
n = 10
pr = 2a8e3eb0
pi = 0
1     2     3     4     5     6     7     8     9    10


Is there really a separate copy of “t” in handles “h1” and “h2”? Is there a way to get them to use the same copy? This would really help some memory issues we are having when plotting multiple data sets.

Loren replied on : 57 of 68

Jesse-

Yes, there really is a separate copy. I am not sure if this will get you one copy, but if you plot t vs. a matrix of values, there may only be one copy of t. That would suggest a different strategy to your plotting, I realize. The graphics backend needs these extra copies, even if MATLAB doesn’t at the front end. I know there is work going on to share even more of the data where possible. I am not sure what the resolution is likely to be for this.

–Loren
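A minimal sketch of the "plot t vs. a matrix of values" strategy suggested above. The variable names are illustrative, and whether the line objects actually end up sharing one copy of the x data is not guaranteed; it may depend on the MATLAB release, so it is worth checking with format debug as in the question.

```matlab
% Plot several data sets against one time vector in a single plot call.
t = 1:10;
Y = [sin(t); cos(t)].';      % 10-by-2 matrix, one column per data set
h = plot(t, Y);              % returns one line object per column of Y

% Each line still reports its own XData; inspect the values under
% "format debug" (as in the question) to see whether the pr addresses match.
x1 = get(h(1), 'XData');
x2 = get(h(2), 'XData');
```

This puts both curves in one axes rather than in separate figures, so it only helps when that layout is acceptable.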

Michele Williams replied on : 58 of 68

I’ve been struggling with memory issues in both Windows XP and Linux. I tried saving my data structures and then clearing the workspace, but to no avail. I have 15 data segments that I’m looping through and each segment took more time than the previous segment (the last segment was 10 times slower than the first). THEN I tried saving figures and closing them right after they are generated rather than leaving them open and saving them at the end. Now the run times for the segments remain constant and the total run time is MUCH, MUCH shorter. I used the print command to save files with the segment number in the file name, e.g. , where filenum is something like ‘-f1’ and filename is something like ‘segment1’.

Michele Williams replied on : 59 of 68

Hmmmm, my code fragment didn’t show up. This is the print statement: print(filenum,'-dtiff',filename); where filenum is a string like '-f1' (1 indicates figure 1) and filename is a string like 'segment1'.
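The save-then-close pattern described here might look roughly like the sketch below. The segment count, the placeholder data, and the output folder are assumptions for illustration only; this version passes the figure handle to print instead of a '-f1' style string.

```matlab
% Save each figure to disk and close it right away, so that open figures
% do not accumulate across segments.
outDir = tempdir;                 % assumed output location
nSegments = 3;                    % the comment above used 15 segments
for k = 1:nSegments
    fig = figure('Visible', 'off');
    plot(rand(1, 100));           % placeholder for the real segment data
    filename = fullfile(outDir, sprintf('segment%d', k));
    print(fig, filename, '-dtiff');   % print appends the file extension
    close(fig);                   % release the figure and its graphics data
end
```

Closing each figure inside the loop keeps the number of live graphics objects constant, which is consistent with the steady per-segment run times reported above.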

Loren replied on : 60 of 68

Michele-

Not sure what you are asking. Sounds like you’ve figured out how to get better performance.

–Loren

Anis replied on : 61 of 68

Hello

I am running a MEX-file and getting an out-of-memory error.
When I use the “memory” command from the MATLAB prompt, I can see that MATLAB is using only around 10% of the total physical memory available. Also, when I add up all the arrays that I am declaring inside the MEX-file, they come to only a few percent of the total memory available.

Now my question is – can we ask MATLAB to use more of the available RAM, or is there any other alternative short of buying a new computer, etc.?

Thanks
Anis

Arvind Raghavan replied on : 63 of 68

Hi,

I have a specific question:
Does MATLAB implement copy-on-write even for arrays passed in as arguments when constructing an object? From profiling output I’ve observed, I am led to suspect that there may be some kind of copy-on-access behavior for large arrays passed in to the constructor of a class.

Sorry for the cross-post, but I didn’t get any responses when I posted a more detailed version of this question on the newsgroup at:

(I have a code snippet in that post along with profiling output).

Arvind

Loren replied on : 64 of 68

Arvind-

Yes, object properties also take advantage of the copy-on-write behavior. If you are on Windows, you can watch the Task Manager memory plot as you assign to an object’s property and you should see no memory spike.

–Loren
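A rough way to see this behavior without Task Manager is sketched below. It uses a struct field as a stand-in for an object property (an assumption made so the example fits in one script; a class would need its own classdef file), and the timings are only indicative.

```matlab
% Assigning a large array shares the data; the first write forces the copy.
big = rand(2000);                 % about 32 MB of doubles
s.data = big;                     % struct field standing in for a property

tic; s2 = s;          tShare = toc;   % nothing modified yet, data is shared
tic; s2.data(1) = 0;  tCopy  = toc;   % first write makes a private copy

fprintf('assign: %.6f s, first write: %.6f s\n', tShare, tCopy);
```

The original array is unaffected by the write, which is the value semantics the thread describes: `s.data(1)` still equals `big(1)` afterwards.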

umibozuuu replied on : 65 of 68

Hello,
I am having a problem where it seems “assignment forces allocation”- inside a function called by its handle. I’d like to know if it’s possible to avoid the repeated memory allocations that I show below.
There is this function I call by its handle repeatedly, which does a little computation for which it is useful to store a large temp (always the same size of temp every time the function is called). I thought it good to allocate space for the temp once and for all inside the function that produces the function_handle. However when I profile with memory on it seems memory is being allocated at each call

function fh = get_fctr(X)
n = size(X,1);
tmp = zeros(n,1);
fh = @function_implem;
    function [res1,res2] = function_implem(i)
        tmp = known_function0(X,i); % here! this is where allocation is happening and I don't understand why!
        res1 = known_function1(tmp,i);
        res2 = known_function2(tmp,i);
    end
end
% now in a 'main' function I am calling this functor several times.
afunc = get_fctr(anX);
for i = 1:largenumber
    store_results(:,i) = afunc(i);
end


and for some reason I don’t know why it needs to reallocate the tmp inside the function every time instead of just once when I do the “afunc=get_fctr(anX)” call.
Hope this is clear enough? Thanks!

Loren replied on : 66 of 68

umibozuuu,

That’s simple. You are asking for the allocation by saying “tmp=”

Memory will always be allocated for tmp under that sort of situation. You need to index into the tmp variable you preallocated for it to reuse the memory, even though tmp is shared from the parent to the nested function.

–Loren

umibozuuu replied on : 67 of 68

Hello Loren,
sorry for being slow here, but
I’m not sure I understand what you mean by “index into the tmp variable”? Is there a way to assign a value to tmp without reallocating?

Loren replied on : 68 of 68

umibozuuu-

I mean indexing into tmp for assignment like:

tmp(ind1,ind2) = whatever

–Loren
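A small sketch of that advice applied to a nested function. The names make_scaler and scale_column and the computation itself are hypothetical, chosen only to make the example self-contained; they are not the functions from the question.

```matlab
function fh = make_scaler(X)
% make_scaler  Hypothetical example: return a handle to a nested function
% that reuses one preallocated buffer on every call.
n = size(X, 1);
tmp = zeros(n, 1);          % allocated once, shared with the nested function
fh = @scale_column;

    function res = scale_column(i)
        % Indexed assignment writes into the existing tmp array.
        % Writing "tmp = 2 * X(:, i);" would instead bind tmp to a new array.
        tmp(:) = 2 * X(:, i);
        res = sum(tmp);
    end
end
```

For example, `f = make_scaler(magic(3)); f(1)` returns 30. Note that evaluating the right-hand side may still create a temporary array before the values are written into tmp; the indexed form only ensures that tmp itself is not replaced on each call.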

These postings are the author's and don't necessarily represent the opinions of MathWorks.