Memory Management for Functions and Variables
Posted by Loren Shure
People have different ideas about what is expensive, in terms of memory, in MATLAB, and sometimes they don't know the details. Today I am going to talk about when MATLAB makes copies of data, both when calling functions and when storing data in variables. MATLAB copies an array it passes only when the data it refers to is changed (this is called copy on write, or lazy copying).
Passing Arrays to Functions
The question is: when does MATLAB copy memory when passing arrays to functions? Some users think that, because MATLAB behaves as if data are passed by value (as opposed to by reference), MATLAB always makes copies of the inputs when calling a function. This is not necessarily true. Take a look at this function.
type fred1

function y = foo(x,a,b)
a(1) = a(1) + 12;
y = a * x + b;
In fred1, the first and third inputs, x and b, are not altered inside. MATLAB recognizes this and passes both these variables in without making any extra copies. This can be a big memory savings, for example, if x is a large dataset. However, in fred1, we can see that the second input, a, gets modified inside. MATLAB recognizes when this is happening to a variable and makes a copy of it to work with so that the original variable in the calling workspace is not modified.
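The visible side of this behavior can be checked with a minimal sketch, assuming it is saved as cow_demo.m. The names cow_demo and bump_first are hypothetical and not part of the original post. The sketch only confirms that the caller's array is unchanged; whether and when a physical copy is made is internal to MATLAB.

```matlab
function [a, y] = cow_demo()
% COW_DEMO  Hypothetical demo: modifying an input inside a function
% leaves the caller's variable untouched.
a = [1 2 3];          % array in the calling workspace
y = bump_first(a);    % the callee writes to its input
% At this point a is still [1 2 3] and y is [13 2 3].
end

function out = bump_first(in)
% Hypothetical helper. The write to IN is what makes MATLAB give this
% function its own copy instead of sharing the caller's data.
in(1) = in(1) + 12;
out = in;
end
```

Calling `[a, y] = cow_demo()` returns the untouched input alongside the modified result, which matches the value semantics described above.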
Structures and Memory
Each structure member is treated as a separate array in MATLAB. This means that if you modify one member of a structure, the other members, which are unchanged, are not copied. It's time for an illustration here.
Create some RGB image data with 3 planes: red, green, and blue.

im1.r = rand(300,300);
im1.g = rand(300,300);
im1.b = rand(300,300);
Alternatively, rearrange the same data so that we have an array of structs, with each element containing an [r g b] triplet.
im2(300,300).rgb = [0 0 0]; % preallocate the array
for r = 1:300
    for c = 1:300
        im2(r,c).rgb = [im1.r(r,c) im1.g(r,c) im1.b(r,c)];
    end
end
Let’s compare im1 and im2.
clear c r % tidy up the workspace
whos

  Name        Size         Bytes  Class

  im1         1x1        2160372  struct array
  im2       300x300      7560064  struct array
  s           1x1            392  struct array
  sNew        1x1            392  struct array

Grand total is 630043 elements using 9721220 bytes
im1 is a scalar structure with members that hold m x n arrays.
- im1.r = imageRedPlane --- size m x n
- im1.g = imageGreenPlane --- size m x n
- im1.b = imageBluePlane --- size m x n
im1 is size 1 x 1; total # of arrays inside im1: 3
im2 is an m x n structure array with fields containing 3-element vectors.
- im2(i,j).rgb = imageOneRGBPixel --- size 1 x 3
im2 is size m x n; total # of arrays inside im2: m x n
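The byte counts reported by whos above can be broken down with a little arithmetic. The following is a rough sketch, assuming 8 bytes per double; the overhead figures it derives are specific to the MATLAB version and platform that produced that listing and will differ elsewhere.

```matlab
m = 300; n = 300;
bytesPerDouble = 8;

% The raw numeric payload is the same in both layouts: 3 doubles per pixel.
payload = 3 * m * n * bytesPerDouble;          % 2160000 bytes

% Byte counts copied from the whos listing in this post.
im1Bytes = 2160372;
im2Bytes = 7560064;

im1Overhead = im1Bytes - payload;              % total overhead for 3 arrays
im2Overhead = im2Bytes - payload;              % total overhead for m*n arrays
im2OverheadPerPixel = im2Overhead / (m * n);   % overhead per 1x3 array

fprintf('im1 overhead: %d bytes\n', im1Overhead);
fprintf('im2 overhead: %d bytes (about %.0f per element)\n', ...
    im2Overhead, im2OverheadPerPixel);
```

In that listing the overhead for im2 is larger than its numeric payload, because each 1 x 3 vector pays a fixed cost of its own.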
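Notes: Every MATLAB array allocates a header that holds information about the array, such as its class and dimensions. Because im2 contains m x n small arrays, each with its own header, while im1 contains only 3 large ones, im1 is more memory-efficient than im2 (more generally, a scalar struct containing arrays is more memory-efficient than a struct array).

When one field in a structure is changed, and possibly copied, the other fields are left intact.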
s.A = rand(3);
s.B = magic(3);
sNew = s;
sNew.A(3) = 14;
Since s and sNew have unaltered copies of B, the B fields share memory, but the A fields do not. See the documentation section titled Using Memory Efficiently for more information.
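The semantics of this example can be confirmed with a few assertions. This is a small sketch, with origA as a hypothetical helper variable that is not in the original code. It checks only the values; the memory sharing itself is not something these documented functions expose.

```matlab
s.A = rand(3);
s.B = magic(3);
origA = s.A;                     % reference value for comparison (hypothetical name)

sNew = s;                        % no write has happened yet
sNew.A(3) = 14;                  % first write to sNew.A

assert(isequal(s.A, origA));     % s.A is untouched by the write to sNew.A
assert(sNew.A(3) == 14);         % the change is visible in sNew only
assert(isequal(s.B, sNew.B));    % B was never written to in either struct
```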
What's Your Mental Model for MATLAB Memory Management?
Say that three times fast!
Does the description here and/or in the documentation change your model?
Let me know.
Published with MATLAB® 7.2
Category: Memory, Structures
68 Comments
Thanks for your comments.
To answer the last question first, yes, we still recommend that you not overwrite input data in MEX-files. That's because, since MATLAB semantics is value-based, other users (or you sometime later perhaps) are not generally expecting calling function workspace variables used as inputs to change. And if an error occurs somewhere within the chain of calculations, you may end up with some "corrupted" variables that would not occur otherwise.
For the first question, in cases like this,
function x = myfun(x)
% modify x here, for example
x = x.^2;

MATLAB R2006a and earlier does not reuse the memory of the input variable x for the output, nor does it do the equivalent for a field of a structure. We are constantly looking for and making performance improvements in MATLAB. The one you mention is on our list.
A(1:end,10) = fcnMakingLargeOutput();

In this case, the large array is copied into the memory in the variable A. Otherwise, there is no extra copy made.
varstruct = load('myfile');
...
var1 = ...
clear varstruct

That way, as soon as you tinker with var1 (I renamed it since var is a function name in MATLAB), it won't make a separate memory copy from varstruct since we just cleared varstruct. --Loren
% using the durer.mat file from toolbox/matlab/demos
% for purposes of this example. You can substitute
% whatever MAT-file name you want.
matfile = fullfile(matlabroot, 'toolbox', 'matlab', ...
    'demos', 'durer.mat');
variablenames = whos('-file', matfile);
name1 = variablenames(1).name;
var1 = load(matfile, name1);

The variablenames struct will contain more information about your variables, so you can determine (based on the size field, perhaps) which variables you want to load. Then you can refer to the variable using a dynamic field name:
mydata = var1.(name1) + 1;
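The pieces above (inspecting a MAT-file, loading one variable, using a dynamic field name, and clearing the struct) can be put together in a self-contained sketch. The variable names bigVar and smallVar, the temporary file, and the "pick the largest variable" rule are all assumptions made for illustration, and the `~` output placeholder assumes a reasonably recent MATLAB release.

```matlab
% Hypothetical data written to a temporary MAT-file for this sketch.
bigVar = magic(4);
smallVar = 7;
matfile = [tempname '.mat'];
save(matfile, 'bigVar', 'smallVar');
clear bigVar smallVar

info = whos('-file', matfile);     % metadata only; no variable data is loaded
[~, idx] = max([info.bytes]);      % choose the largest variable in the file
name1 = info(idx).name;

loaded = load(matfile, name1);     % struct with one field named by name1
mydata = loaded.(name1);           % dynamic field name
clear loaded                       % mydata is now the only variable holding the data
delete(matfile);                   % remove the temporary file
```

Clearing the struct right after extracting the field follows the advice given earlier in the thread: later changes to mydata then have no second variable to stay distinct from.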
s.A = rand(3);
s.B = magic(3);
sNew = s;
aNew = s.A;
sNew.A(3) = 14;
Now sNew.A and s.A have the same value. sNew.A and s.A now have a different value from aNew. Before the assignment on the last line, sNew.A and s.A pointed to the same array header and same array with no semantic copy, but that array shared data with aNew. After the assignment, s.A and sNew.A still have the same array header and array, but it now has its own copy of the data distinct from the data in aNew. --Loren
% A is some large matrix
myFunction(A');

Does myFunction keep a separate copy of A', or does it use A directly, assuming myFunction does not modify the transposed input?
>> t = 1:10
t =
     1     2     3     4     5     6     7     8     9    10
>> y = sin(t);
>> z = cos(t);
>> h1 = plot(t,y);
>> figure;
>> h2 = plot(t,z);
>> format debug
>> get(h1,'xdata')
ans =
Structure address = bac1900
m = 1
n = 10
pr = 2a2b3d60
pi = 0
     1     2     3     4     5     6     7     8     9    10
>> get(h2,'xdata')
ans =
Structure address = bac1900
m = 1
n = 10
pr = 2a8e3900
pi = 0
     1     2     3     4     5     6     7     8     9    10
>> t
t =
Structure address = 2cb29920
m = 1
n = 10
pr = 2a8e3eb0
pi = 0
     1     2     3     4     5     6     7     8     9    10

Is there really a separate copy of "t" in handles "h1" and "h2"? Is there a way to get them to use the same copy? This would really help some memory issues we are having when plotting multiple data sets.
function fh = get_fctr(X)
n = size(X,1);
tmp = zeros(n,1);
fh = @function_implem;   % handle name must match the nested function below
    function [res1,res2] = function_implem(i)
        tmp = known_function0(X,i); % here! this is where allocation is
                                    % happening and I don't understand why!
        res1 = known_function1(tmp,i);
        res2 = known_function2(tmp,i);
    end
end

% now in a 'main' function I am calling this functor several
% times.
afunc = get_fctr(anX);
for i = 1:largenumber
    store_results(:,i) = afunc(i);
end

For some reason it needs to reallocate tmp inside the function every time, instead of just once when I make the "afunc=get_fctr(anX)" call, and I don't know why. Hope this is clear enough? Thanks!