Nested Functions and Variable Scope

Posted by Loren Shure,

I get a parade of questions about which variables are available to nested functions and which variables, used in nested functions, are part of the nesting function workspace. So today I thought I'd address this topic. For more information, you can read this documentation.

Where is data?

Let's take a look at the following code.
function blahblah
% BLAHBLAH help line
data = 1;
data2 = yoyo1();
yoyo2(3);
    function data = yoyo1()
        % Note that data is an output here
        data = 2
    end
    function yoyo2(in)
        % Here, data is shared with the variable 'data'
        % in blahblah's workspace
        data = in;
    end
end
You can see that there is a variable named data in each function. The question is, however, do they all refer to the same entity? And the answer is 'no.' Surprised? Let me explain. yoyo1 essentially declares data to be a local variable and not shared because of data's presence on the output argument list. The same would be true if data showed up on the input argument list instead, or if it showed up on both. However, yoyo2, which has no mention of data in its argument lists, shares the variable data that is in blahblah's workspace. So calling yoyo2 changes the value of data in blahblah but calling yoyo1 does not.
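
To check this for yourself, here is a minimal sketch of the same code with the two variables returned so they can be inspected. The name scopeDemo is made up for this illustration; everything else follows the listing above.

```matlab
function [data, data2] = scopeDemo
% SCOPEDEMO Hypothetical variant of blahblah that returns its variables.
data = 1;
data2 = yoyo1();   % yoyo1's data is local, so data here is still 1
yoyo2(3);          % yoyo2 shares data, so data here becomes 3

    function data = yoyo1()
        % data is an output argument, hence local to yoyo1
        data = 2;
    end

    function yoyo2(in)
        % data is not an argument, hence shared with scopeDemo
        data = in;
    end
end
```

Calling [data, data2] = scopeDemo returns data equal to 3 (set by yoyo2) and data2 equal to 2 (returned by yoyo1 without touching the shared data).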

Truisms about Variable Scope with Nested Functions

• When you call an M-file containing a nested function, sufficient analysis is done on the M-file to determine which variables are shared and which functions share them.
• If a variable is in a nested function and appears in at least one of the input or output argument lists for that function, then, even if its name matches a name in the nesting function's workspace, the two variables refer to different entities. This is true even if other nested functions share the variable in the nesting function.
• For a variable in a nested function to be shared with the nesting function, the variable must appear in the code of (i.e., be referred to or used in) the nesting function. This is true even if you don't need to use the variable in the nesting function but want multiple nested functions to share the variable.
• You can't poof variables into a nested function's workspace. If you need to do some debugging and want to create new variables, you can declare them global and then assign to them. For more detail, read the documentation about Restrictions on Assigning to Variables.
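
The third point can be sketched as follows. This is only an illustration under the rules listed above; sharingDemo and every name inside it are hypothetical. The variable linked is mentioned in the nesting function, so the nested functions share it, while loose is only ever mentioned inside nested functions, so each of them gets its own.

```matlab
function [viaParent, looseVisible] = sharingDemo
% SHARINGDEMO Hypothetical example; all names here are made up.
linked = [];                 % mentioned in the parent, so it is shared
setLinked(10);
viaParent = getLinked();     % both nested functions see the same linked

setLoose(10);
looseVisible = canSeeLoose();

    function setLinked(v)
        linked = v;
    end

    function v = getLinked()
        v = linked;
    end

    function setLoose(v)
        % The parent never mentions loose, so this is a local variable
        loose = v; %#ok<NASGU>
    end

    function tf = canSeeLoose()
        % Reading loose here fails if setLoose's loose was local
        try
            tmp = loose; %#ok<NASGU>
            tf = true;
        catch
            tf = false;
        end
    end
end
```

With these rules, viaParent should come back as 10 and looseVisible as false, because the value assigned in setLoose never reaches canSeeLoose. Adding a line such as loose = []; to the body of sharingDemo would be enough to make the two nested functions share it.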

In the code I showed, I made sure to highlight the variables that are shared by printing them in italic. Especially for those of you who do not care for the implementation we chose for nested functions, would some visual affordance such as this make the situation more tenable for you? Let me know here.


Published with MATLAB® 7.5


Scott Hirsch replied on : 1 of 35

I think that’s a great idea, Loren. I’ve been recommending that people try to document their code carefully to indicate when a function call from the main function to a nested function might change the data, but that is highly error prone and hard to maintain. A simple editor affordance could go a really long way.

Richard Brown replied on : 2 of 35

Yes, absolutely some visual feedback would be great. I really like using nested functions, but it’s so easy to accidentally use a variable of the same name in the nested function, like ‘i’ for a loop counter, messing up the value in the nesting function.

Personally I’d rather that shared variables were declared, with something akin to ‘global’ in the nested function, but failing that some visual feedback in the editor would be pretty helpful. Some cases might also be able to be spotted by mlint.

Jessee replied on : 3 of 35

I rarely ever find myself in a situation where it seems like a nested function is the best solution to my problem. Maybe this is because I’m not really aware of any advantages of a nested function over regular or sub functions. Would anyone care to make a case for nested functions?

Jason Merrill replied on : 4 of 35

It’s easier to avoid problems like this if you don’t have so many unnecessary local variables running around in the first place. You can avoid explicit loop counters in some cases by vectorizing your code, and in many others by using functions like arrayfun() and cellfun(). It would be nice if these were pushed more in the documentation, and if they used the “‘UniformOutput’, false” behavior by default.

Another change that would allow for fewer local variables would be allowing indexing into functions that return matrices and structures. For instance “cdata = aviread(‘movie.avi’).cdata” instead of “temp = aviread(‘movie.avi’); cdata = temp.cdata”. The extra variables in cases like the second often get named something like “temp” which is likely to be used again elsewhere.

Jason Merrill replied on : 5 of 35

Also, isn’t more or less the only reason to use a nested function instead of a subfunction that nested functions share scope with their parent? Like Jessee, I’d like to hear a discussion about the relative merits of these two options.

Robert Bemis replied on : 6 of 35

Thanks, Loren! This is a great learning example: simple and effective.

If anyone is not grokking the behavior from the explanation, stepping through the example line by line using MATLAB’s debugger can be extremely helpful. Be sure to “step in” (not over) the nested functions, and pay attention to the workspace drop-downs in the Editor and Workspace Browser, switching back and forth as needed to really “get” what MATLAB is doing.

Loren replied on : 7 of 35

Jessee and Jason-

I recommend that you start by reading my other blog entries relating to nested functions. One way to find them is looking in the category labeled “Function Handles.” Within those articles are both motivation and links to other places that provide motivation for nested functions.

–Loren

Matt Whitaker replied on : 8 of 35

One of the most useful places to use nested functions is in gui programming to make ‘pseudo-OO’ objects where the shared properties are used as object ‘properties’ and the nested functions are ‘methods’. The public method functions are tied together in a structure of function handles. If you have the IPT toolbox look at the code for imline and imdistline (which uses some simple inheritance from imline). I tend to actually return the API instead of the hggroup handle (make the handle an accessor method instead).

So my GUI routines tend to look like:

function guiAPI = someGUI(varargin)
globalVar = initializeGlobalVar(varargin);
guiAPI = setAPI;

function api = setAPI
api.fcn1 = @fcn1;
api.fcn2 = @fcn2;

end %setAPI

% get and set various global variable functions here
% if they are to be accessed externally

%’method’ functions
function fcn1(varargin)
end %fcn1

function fcn2(varargin)
end %fcn2
end %guiAPI

function globalVar = initializeGlobalVar(varargin)

end %initializeGlobalVar

%various ‘utility’ functions here

I’ve found this programming pattern highly productive for GUI intensive apps.
Matt

Dan K replied on : 9 of 35

Loren (and everybody else),
One thing that I have long wondered about is relative speed of nested functions relative to subfunctions. Is the use of shared variables faster than passing by reference? My (admittedly limited) understanding of the way in which MatLab passes by reference would suggest that they should be equally fast. I’ve also struggled trying to identify if there is a difference between the risks of using global variables (like race conditions, accidental overwrites, etc.) and what happens with nested functions. I do see some convenience in the use of nested functions for building a handle library, but in general, I’ve avoided using them, due to the time it would take to convert some large routines, and the inability to (as Loren so eloquently put it) “poof” new variables in, which is fairly critical to my method of developing and debugging code.

Just my $0.02.

Dan

Markus replied on : 10 of 35

As I mentioned in several comments before, I am not a friend of nested functions. In subfunctions you can start looking at the “function” keyword and see which variables are there and which not. In the case of nested functions, you always have to start at the “function” keyword of the nesting function. This can get really nasty when the whole function gets larger and larger. And for GUI programming, there are other good solutions besides nested functions.

Markus

Oliver A. Chapman, P.E. replied on : 11 of 35

All the .m code that I write will be maintained by someone else, likely someone less familiar with MatLab than me. So, I’ve avoided nested functions just because it is so hard to follow the data flow. I further restrict myself to passing all variables thru the function calls. Thus, I never use global variables because it impairs visibility of the data flow.

Since I use this approach, nested functions don’t seem very attractive.

Further, the documentation for nested functions is not adequate given the complexity of the topic. If there is great power in using nested functions as a programming technique, the documentation doesn’t illustrate it.

For example, the first two examples in the documentation show only the syntax required, first for one nested function and then for more than one. But there aren’t further, more advanced examples that demonstrate a benefit of using even one nested function, e.g., “this is how using a nested function adds clarity or ….” Subsequent examples just illustrate, for example, that nested functions can do in 7 lines what you could do in 1 line without nested functions.

Also, even though the MatLab documentation has separate sections for nested functions & subfunctions, important details about subfunction variable scope are discussed in the section on nested functions and these details are not discussed in the section on subfunctions.

Finally, the bulk of the documentation on nested functions is devoted to their use with function handles and this is another topic that MatLab poorly documents. I’ve previously criticized this.

My concern is that even though using nested functions may be a cute trick that can allow an operation to be coded with fewer lines of code, the resulting code won’t be easy to understand or very clear. And, even though fewer lines of code is generally good, as Dijkstra said, “. . . we now take the position that it is not only the programmer’s task to produce a correct program but also to demonstrate its correctness in a convincing manner, . . .” Thus, clarity is much more important than fewer lines of code.

My concern is amplified by the main focus of your column where you are illustrating how easy it is to confuse the data flow. Although an “affordance” in the editor would help, it seems like an inferior approach to compensate for a more fundamental error.

Thus Loren, what you need to do is give us an example of a batch of X lines of code that does something. Then, write an alternate batch of code that is either easier to understand or uses significantly fewer lines of code because you used nested functions. Further, your example must show how the use of either functions in general or subfunctions would not have given the same benefit as nested functions.

By the way, about half a dozen of us around here have gleefully wasted far more time than we should have with your new-to-us word of “affordance.”

Markus replied on : 12 of 35

Thanks Oliver, very good statement! Exactly my opinion.

Markus

Tim Davis replied on : 13 of 35

I suppose a definition would help: affordance (n) (1) the ability of one to pay the costs for attending a prom, (2) what you do if you turn the wheels too sharply in your Mustang while driving on ice. See also http://en.wiktionary.org/wiki/affordance ;-)

wobbly replied on : 14 of 35

I find the datasharing of nested functions a major invitation to data corruption except in limited cases. I avoid it like the plague if I think anyone else will go near the code.

All I want is to have module-scope globals. ie the ability for non-nested functions to access explicitly listed variables in the main function.

I have always found the lack of this a bit incomprehensible. To get around it, I tend to pass structures back and forth to the functions.

There is another problem with nested functions: during debugging, you can’t create new variables from the console, thus nobbling interactive development.

Loren replied on : 15 of 35

Wikipedia has a discussion of affordance that might help or confound more.

I think nested functions can be very useful but the writer MUST be mindful of exactly when to take advantage of the data sharing feature.

Many of you have posted articulately your dislikes of nested functions. Here are what I believe are some of the benefits:

– I can have functions with a natural calling sequence in a mathematical sense. I can separate out parameters from variables, for example, in an optimization problem. For example, if I want to evaluate the height along a line, I care about the distance x for that line and the corresponding y. slope and intercept are parameters and don’t have to look “equal” to the independent variable x.

– I can truly share data when I want multiple functions to be able to access and manipulate the data — and without having to make extra copies, something that can be painful with large datasets.

–Loren
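
The line-height case described in the comment above can be sketched roughly like this. It is an illustration only, not code from the post: makeLine and height are hypothetical names, and the parameters slope and intercept are shared with the nested function rather than passed on every call.

```matlab
function heightAt = makeLine(slope, intercept)
% MAKELINE Hypothetical helper: returns a handle to a nested function
% whose only argument is the independent variable x.
heightAt = @height;

    function y = height(x)
        % slope and intercept are shared with makeLine's workspace
        y = slope * x + intercept;
    end
end
```

For example, h = makeLine(3, 2) followed by h(7) gives 23. The handle carries the parameters along, so the call site only mentions x.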

Ryan Gray replied on : 16 of 35

Everyone should understand that nested functions are not a replacement for subfunctions. There are times to use each type of Matlab function. I see nested functions like the nested functions I used to use in Pascal. What was nice about them was that I could quickly move a section of code in a function into a nested function without all the hassle of creating input and output parameters – just give it a name. This would then clean up the nesting function to make it clearer. Since it was so easy to do, I was more likely to divide the routine up into parts rather than leave it as a big monolithic routine. The main function was then largely just a list of the nested function calls, which had nice names, so that the main function was easy to follow.

function y = foo(x)

step1;
step2;
step3;

function step1

end
function step2

end
function step3

end
end

You can just read the main function, and it is like a summary. There’s no big parameter lists cluttering up things, so the routine names can tell the story. The details follow it in the subfunctions.

One of Loren’s examples was to use them to make clear what a function is really working with. I’ll try an example:

function foo

function y = bar(x)
y = m * x + b;
end

m = 3;
b = 2;

y1 = bar(7);
y2 = bar(9);

end

The function arguments are only the parameter that is changing – Use the shared scope for the others as you would a global. Also, the output is the only thing changing, so no side-effects. Of course, Matlab doesn’t restrict nested functions from changing the shared variables, but a discipline can keep things easy to understand.

There are times when we might want to modify lots of variables, so in that case, make the nested function return nothing, which cues you that it must be modifying shared variables. It could still take parameters to call attention to key values driving what it is doing.

Brad Phelan replied on : 17 of 35

The big problem with nested functions is that they are a good idea not taken to their logical conclusion. Most of the time nested functions are defined and then used immediately just under their definition by passing the function handle to another function.

This is analogous to Matlab removing the for and while loop and replacing them with the functions for and while. These functions would require that you pass a function handle to them implementing the body of the for or while loop. In general that would be ugly and users would rebel.

The trick is to generalize the concept of control flow and make it look similar to the traditional for, while paradigm. At an implementation level the body of a for loop is just an unnamed nested function.

For example

print all numbers from 1 to 10

for i = 1:10
    i
end

using a different notation and making ‘for’ a function instead of a keyword we could write

for(1:10) begin(i)
    i
end

Here the begin(i) is the parameter list to the block and really is just a multiline anonymous function.

or perhaps write a function that generates Fibonacci numbers

fib(1000) begin(f)
    if f > 50
        break
    end
end

You might say the above can be implemented as a normal for loop, i.e.

for f = fib(1000)
    if f > 50
        break
    end
end

However in the second example the fib function must generate all 1000 first Fibonacci numbers before starting a traditional for loop over the result, whereas in the first example Fibonacci numbers are generated on demand for each iteration of the loop.

The same paradigm can be used for specifying a callback to a gui button

button.on_press begin(event)
    'button pressed with event '
    event
end

Normally this is written in MATLAB as

function handle_event(event)
    'button pressed with event '
    event
end
button.on_press(@handle_event)

Loren replied on : 18 of 35

While I agree that MATLAB could have additional flow-control capabilities, I don’t know where you get the data for your second statement “Most of the time nested functions are defined and then used immediately just under their definition by passing the function handle to another function.” That is not my experience. Instead, more typically, the handle, including workspace, is sent out for others to use and that’s where some of their real power comes from.

–Loren

Brad Phelan replied on : 19 of 35

“Instead, more typically, the handle, including workspace, is sent out for others to use and that’s where some of their real power comes from.”

The scenario you describe is a different one. Passing functions back to the caller is a nice way to implement lightweight objects. It can be scaled up into a semi object oriented framework and I have no argument with it.

However I still dispute that it is necessary to name the functions. With a decent notation the code can become smaller and perhaps clearer.

The canonical example

function f = counter(init)

    function c = count
        init = init + 1;
        c = init;
    end
    f = @count;   % return a handle to the nested function

end

could be written more concisely

function f = counter(init)
    f = begin()
        init = init + 1
        c = init
    end
end

The most convincing argument in favor of blocks is when they are used for transaction or resource management. Here the canonical example is opening a file, reading or writing it and closing it. Closing a file is often left up to the user and many users forget to do it resulting in other problems later on. This problem goes away if you imagine something like this.

fid = fopen('fgetl.m');
while 1
    tline = fgetl(fid);
    if ~ischar(tline)
        break
    end
    disp(tline)
end
fclose(fid);

could become

fopen('fgetl.m') begin(fid)
    while 1
        tline = fgetl(fid);
        if ~ischar(tline)
            break
        end
        disp(tline)
    end
end

or even better.

fopen('fgetl.m') begin(fid)
    each_line(fid) begin(line)
        disp(line)
    end
end

Pseudo-implementations:

function fopen(name)
    fid = builtin('fopen', name)
    yield fid
    fclose(fid)
end

function each_line(fid)
    while 1
        tline = fgetl(fid);
        if ~ischar(tline)
            break
        end
        yield tline
    end
end

An extra keyword yield sends a value to an attached block. The resulting code

fopen('fgetl.m') begin(fid)
    each_line(fid) begin(line)
        disp(line)
    end
end

is much cleaner and safer than the original because there is no risk of file handles left open or accidentally reading past the end of the file.

Regards

Gautam Vallabha replied on : 20 of 35

This link does a fairly good job of unpacking what “affordance” means in the context of a GUI:
http://www.joelonsoftware.com/uibook/chapters/fog0000000060.html

Loren — I read Brad’s comment (“Most of the time nested functions are defined and then used immediately … “) to mean that USUALLY the name of the nested function is unimportant. If I have something like this:

function out = blahblah(a)
count = 0;
out = @yoyo;

function yoyo(x)
do_something(x,a);
count = count + 1;
end
end

Here, the name ‘yoyo’ is just a throwaway. Whatever the name is, I invoke it just once to get the handle and return that handle. For this usage, it is better to have anonymous nested functions, like this:

function out = blahblah(a)
count = 0;

out = @(x)
do_something(x,a);
count = count + 1;
end
end

Also, Brad wants MATLAB to be Ruby.

Brad Phelan replied on : 21 of 35

— Also, Brad wants MATLAB to be Ruby.

Not quite right. Ruby has a nice abstraction model over control flow. However many languages have other ideas to borrow. Ruby got most of its concepts from Smalltalk. Python has generators which allow you to implement looping constructs where the iteration variable is generated on demand using the yield keyword. Perl has been doing the anonymous block thing in a way similar to Ruby and Smalltalk. Java has anonymous inner classes, which are an ugly substitute for anonymous blocks, as well as the Iterator concept. Even the C++ STL has iterators, which are verbose to use but powerful.

Looking at other languages for inspiration is not a bad idea but I know it is hard to please everyone :)

Regards

Andrzej Miekina replied on : 22 of 35

Hi Loren,
Maybe this is not the best place to write about the problem I met, but I didn’t find a better one. It is really strange. I defined a simple function in a local directory:

function f=u(x)
global tau
f=1-exp(-x/tau);

and after changing directory to my local one and running a script, when I call the num2str function this error appeared:

??? Error: File: num2str.m Line: 150 Column: 13
“u” previously appeared to be used as a function or command, conflicting
with its use here as the name of a variable.
A possible cause of this error is that you forgot to initialize the
variable, or you have initialized it implicitly using load or eval.

I don’t know how to explain it.

Could you help me with it?
PozdrawiAM –

Tristan replied on : 24 of 35

“One thing that I have long wondered about is relative speed of nested functions relative to subfunctions. Is the use of shared variables faster than passing by reference?”

I tried the following :

%Nested function :

function Untitled1
x=ones(10,10);
tic
for i=1:1000000
tsqr;
end
toc
function tsqr
a=(x+1)./x;
end
end

Result : “Elapsed time is 2.792550 seconds.”

%separate function :

function Untitled
x=ones(10,10);
tic
for i=1:1000000
tsqr(x);
end
toc

end

function tsqr(x)
a=(x+1)./x;
end

Result : Elapsed time is 1.977658 seconds.

So it seems that nested functions are slower. Maybe checking if nested function variable names exist in the caller workspace is slower than a good old “pass-by-reference”.

Tristan replied on : 25 of 35

Wow!
I just tried with a global variable and it’s 5 times slower than with an argument!

function Untitled1

global x
x=ones(10,10);
tic
for i=1:1000000
tsqr;
end
toc
end

function tsqr
global x
a=(x+1)./x;
end

Result : “Elapsed time is 11.449752 seconds.”

Loren replied on : 26 of 35

Tristan-

Nested functions can be slower in some cases currently. We know we have some opportunities to optimize more. However, be very careful as it depends on how many variables, how much data there is, how much data changes, etc. I don’t think one experiment characterizes the larger picture fully.

–Loren

Tristan replied on : 27 of 35

Yeah, you are right, one example doesn’t prove anything.

It just shows that, in some cases, there might be some important speed differences between a function and the same one nested, or using global variables.

Alessandro replied on : 28 of 35

Tristan, try with this experiment:

function test1
x=ones(10000,10000);
tic
for i=1:100
x = tsqr(x);
end
toc

end

function a = tsqr(x)
a=(x+1)./x;
end


vs.

function test2
x=ones(10000,10000);
tic
for i=1:100
tsqr;
end
toc

function  tsqr
x =(x+1)./x;
end

end


The first is slower (try with 1000) and gives an out of memory error (with 10000)!

OysterEngineer replied on : 29 of 35

I’ve inherited some complex .m code, written by a very competent co-worker. The code has ~40 nested functions & I’m struggling with the very issue of which variables each nested function can see or has revised.

I just wish there was some simple way to list all the variables a nested function could see. That would make de-bugging much easier.

Loren replied on : 30 of 35

OysterEngineer-

See if this helps. Put a break point in one of the nested functions, perhaps at the last executable statement. Type “who” at the MATLAB prompt. The display should show you local variables and up-level ones. If this helps, you can then do that for each of the nested functions. Tedious, yes, but I think it will get you the info you are looking for.

–Loren

OysterEngineer replied on : 31 of 35

Yes, that is kind of what I did.

However, I was hoping to find something like the codemetrics that would summarize the nesting relationship. I couldn’t find anything on the file exchange, so I ended up doing it by hand & making a table in an Excel spreadsheet.

a concerned programmer replied on : 32 of 35

Matlab really should recognize the ‘local’ keyword in order to force a variable in a nested function to be local. Otherwise, there is no way to guarantee that a variable has local scope unless all functions are declared in separate files, which is not always the best solution, and actually kind of annoying to work with in some cases.

Just a suggestion.

Loren replied on : 33 of 35

A concerned programmer-

Thanks for your concerns. FYI, you can guarantee, in a nested function, that variables are local if you list them in the input or output argument list.

–Loren
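
A small sketch of that suggestion, applied to the loop-counter worry raised earlier in the thread. The names counterDemo and sumTo are hypothetical. Listing i as an output of the nested function keeps it local, so the nesting function's own i is left alone.

```matlab
function [i, total] = counterDemo
% COUNTERDEMO Hypothetical example of keeping a nested function's loop
% counter out of the nesting function's workspace.
i = 100;                 % the parent's own variable named i
total = sumTo(4);        % 1 + 2 + 3 + 4

    function [s, i] = sumTo(n)
        % i is on the output list, so it is local to sumTo even though
        % counterDemo also has a variable named i
        s = 0;
        for i = 1:n
            s = s + i;
        end
    end
end
```

Calling [i, total] = counterDemo returns i equal to 100 and total equal to 10. Without i on sumTo's argument list, the loop would overwrite the parent's i instead.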

Jung replied on : 34 of 35

Sorry Loren if the question is very basic. When I run the following, I get data=1 when it should be data=3. Am I missing something basic here?


function test
data = 1;
yoyo2(3);
data
end

function yoyo2(in)
data = in;
end



Thanks,
Jung.

Loren replied on : 35 of 35

Jung-

You did not write a nested function but a subfunction. yoyo2 is not nested inside test.

–Loren
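
For comparison, a nested version of the code in the question might look like the sketch below. The name nestedTest is made up here, and the function returns data so the result can be checked. The only structural change is that yoyo2 is defined before the end that closes the outer function.

```matlab
function data = nestedTest
% NESTEDTEST Hypothetical nested rewrite of the test function above.
data = 1;
yoyo2(3);   % changes data in nestedTest's workspace

    function yoyo2(in)
        % yoyo2 sits inside nestedTest, so data is shared
        data = in;
    end
end
```

With this layout, nestedTest returns 3 rather than 1.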