Loren on the Art of MATLAB

MATLAB Release 2009b – Best New Feature or ~?

Posted by Loren Shure,

MATLAB R2009b was recently released. My favorite new language feature is the introduction of the ~ notation to denote missing inputs in function declarations, and missing outputs in calls to functions. Let me show you how this works.

Unused Outputs

I have occasionally found that I would like the indices from sorting a vector, but I don't need the sorted values. In the past, I wrote one of these code variants:

          [dummy, ind] = sort(X)
          [ind, ind] = sort(X)

In the first case, I end up with a variable dummy in my workspace that I don't need. If my data to sort, X, has a large number of elements, I will have an unneeded large array hanging around afterwards. In the second case, I am banking on MATLAB assigning outputs in order, left to right, and I create somewhat less legible code, but I don't have an extra array hanging around afterwards.

Now you can write this instead:

          [~, ind] = sort(X)

and I hope you find your code readable, with the clear intention to not use the first output variable.
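To make this concrete, here is a minimal sketch with a made-up three-element vector (the data is an assumption for illustration, not from the original example). It keeps only the sort order and then applies it to the original vector.

```matlab
% Illustrative data; these values are an assumption, not from the post.
X = [30 10 20];

% Discard the sorted values and keep only the sort order.
[~, ind] = sort(X);      % ind is [2 3 1]

% Applying the indices reproduces the sorted vector.
sortedX = X(ind);        % sortedX is [10 20 30]
disp(ind)
disp(sortedX)
```

No placeholder variable is left behind in the workspace after the call.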

Unused Inputs

You can similarly designate unused inputs with ~ in function declarations. Here's how you'd define the interface where the second input is ignored.

          function out = mySpecialFunction(X,~,dim)

You might ask why that is useful. If I don't use the second input, why put it in at all? The answer is that your function might be called by some other function that expects to send three inputs. This happens for many GUI callbacks, and particularly those you generate using guide. So your function needs to take three inputs. But if it is never going to use the second input, you can denote the second one with ~.
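Here is a small sketch of that callback situation. It uses an anonymous function rather than a function file, on the assumption that ~ is accepted in an anonymous function's parameter list in your release. The names handler, src, and extra are hypothetical, and the struct stands in for the event data a GUI would pass.

```matlab
% Hypothetical callback-style handler: the caller always passes three
% arguments (source, event data, extra data), but only the first and
% third are used here. The second is accepted and ignored via ~.
handler = @(src, ~, extra) sprintf('%s:%d', src, extra);

% A stand-in for GUI code that insists on passing all three inputs.
result = handler('button1', struct('unused', true), 42);
disp(result)   % button1:42
```

The caller's three-argument contract is satisfied, and the signature itself documents that the middle argument is never read.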

Can M-Lint Help?

Yes! Consider this function mySpecialFunction shown here.

type mySpecialFunction
function ind = mySpecialFunction(X,second,dim)
% mySpecialFunction Function to illustrate ~ for inputs and outputs.

[dummy,ind] = sort(X,dim);

Running mlint on this code produces two messages.

msgs = mlint('mySpecialFunction');
disp(msgs(1).message(1:50))
disp(msgs(1).message(51:end))
disp(' ')
disp(msgs(2).message(1:49))
disp(msgs(2).message(50:end))
Input argument 'second' might be unused, although 
a later one is used.  Consider replacing it by ~.
 
The value assigned here to 'dummy' appears to be 
unused.  Consider replacing it by ~.

Since M-Lint is running continuously in the editor, you would see these messages as you edit the file. Here's a cleaned-up version of the file.

type mySpecialFunction1
function ind = mySpecialFunction1(X,~,dim)
% mySpecialFunction Function to illustrate ~ for inputs and outputs.

[~,ind] = sort(X,dim);

And let's see what M-Lint finds.

mlint mySpecialFunction1

It finds nothing at all.

What's Your Favorite New Feature?

Have you looked through the new features for R2009b? What's your favorite? Let me know here.


Published with MATLAB® 7.9

53 Comments

Another use I can think of for this tilde operator is in a constraint function of an optimization problem where the constraint does not “directly” depend on the optimization variable, x. (The constraint may, for instance, be a nested function with dependence on a sizable quantity calculated in a nested objective function.)

I don’t have R2009b yet but I suppose this should stop mlint from giving the warning for

function [c,ce] = nonLinearCon(~)
% ...
end

unless tilde cannot be used to ignore the first input argument—I suppose there shouldn’t be such a limitation.

Nice addition. Thanks Loren for bringing this to our attention.

-Omid

Omid-

~ can be used to ignore any inputs from the input list. If the function might be called with more than one input, you would still need input variables following the ~.

–Loren

OK, I have to say this: I hate it when MATLAB reuses symbols. (For instance ‘end’, i.e. the ‘1:end’ and ‘if something; end’ reuse that makes file parsing annoying.)

~ is the Logical NOT operator. I think it is a really bad idea for ANY of the logical operators to be re-used.

Tom-

Can you say more about why you think that’s bad? Even in this case, the new meaning is still “not”. What about dot (.) for fields in structs as well as decimals in numbers?

–Loren

One reason I think this is a bad idea is that we now can’t invert an individual output parameter directly in a function call, i.e.

[this, ~that, the_other]=MyFunction();

Loren,

The new meaning is only ‘NOT’ in a linguistic sort of fashion, not a mathematical fashion. There’s nothing for the operator to operate on. It only seems the same because in English, NOT and ‘not there’ sound the same. It’s not the same mathematically. An empty set would have been better IMO mathematically.

Loren.

Anecdotally, yes :) There’s this one function a colleague wrote that I always want the inverse on one of his outputs and every time I use it I wish for that when I write the following line ‘that=~that;’. Also, there are times when you don’t care about the actual value from a function, you just want to know if the output is zero (i.e. ~that).

However, I understand why it’s not implemented as it’s a pretty niche application.

(Edit on the last comment)

“you just want to know if the output is zero (i.e. ~that)”

[in a function with multiple outputs, so you can’t just do that=~myfunction()]

(Sorry for the comment flood)

Actually, that leads to an interesting question. Will
[~, that]=~MyFunction();
work now or does it still respond with “Error using ==> not
Too many output arguments.”?

Tom-

Thanks for the comments.

[~, that]=~MyFunction();

still errors with the same message. The ~ operator applies to a single array but there are multiple outputs from the function. It’s an interesting idea to extend ~ in the way you mention.

–Loren

Does the new ~ feature signal in any way to the function that the first output is not needed? Does this make some multiple output functions more efficient because MATLAB doesn’t have to allocate the memory for the first missing output argument?

For example, I don’t know how sort is implemented, but in [y,ind]=sort(x), I imagine it actually computes the ind first and as a final step computes y = x(ind). If I use [~,ind]=sort(x), can the function know to skip the final step?

The style seems inconsistent:
~ in inputs is for function calling, but ~ in outputs is for function declaration…
I think the reloading scheme for function inputs/outputs should be carefully designed. Some workaround may turn out to be harmful…

I really like the new “~” thing. Both for inputs and outputs.

What happens to nargin and nargout? And is there a way to detect inside a function that the caller is not interested in the first output? That could be useful in some cases where there are significant memory/computational savings from not calculating the first output.

Like Tom, I also sometimes want to do something like

[~, that]=~MyFunction();

or more often maybe something like:

[~, that]=-2*MyFunction();
or 
[~, that]=sort(MyFunction());

or even something where I do something to one output and something different to another. Sometimes I also wish that I could do this:

number10=sort(x)(10);
or perhaps like this to be more explicit:
number10=(sort(x))(10);
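Until indexing directly into a function result is possible, two common workarounds can be sketched as follows. The vector and the helper name pick are assumptions made up for illustration.

```matlab
% Illustrative data (an assumption): the integers 0..11 in shuffled order.
x = [5 3 9 1 7 2 8 4 10 6 11 0];

% Workaround 1: go through a temporary variable.
sorted = sort(x);
number10 = sorted(10);          % 10th smallest value, here 9

% Workaround 2: a small helper (hypothetical name) that indexes its
% first argument, so the call can be written on one line.
pick = @(v, k) v(k);
number10b = pick(sort(x), 10);  % same result without a named temporary
```

The helper reads close to the wished-for syntax, at the cost of an extra function call.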

Jessee and Aslak-

There is currently no way to determine inside a function that it got called with ~ as some outputs so there is no intermediate memory savings. All the outputs get calculated, but those aligned with ~ outputs get freed and don’t enter the caller’s workspace. nargin and nargout do not change.
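The nargout behavior can be checked with a short sketch. It assumes a script file in R2016b or later, where local functions may be defined at the end of a script. The function name reportOutputs is hypothetical.

```matlab
% The caller discards the first output, yet the function still sees
% two requested outputs.
[~, second] = reportOutputs();   % prints: nargout = 2

function [a, b] = reportOutputs()
% reportOutputs  Hypothetical helper that reports how many outputs
% the caller requested.
fprintf('nargout = %d\n', nargout);
a = 1;   % still computed, even though the caller discards it
b = 2;
end
```

In older releases, the function would need to live in its own file, reportOutputs.m, with the call made from the command line.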

Peng-

I think you have the pattern backwards. ~ is for inputs in function declarations, and is in outputs when calling functions. But maybe I am misunderstanding you. Perhaps you can clarify what is inconsistent and needs to be worked around.

–Loren

Loren – I love the idea of the ~ in calling sequences, on either end. The downside is that so many users are still living in older MATLAB releases. For example, I’m only now starting to incorporate bsxfun into new code. So it will be several years before a feature like this will typically appear in anything I write.

I’m not saying that neat improvements like this are not very much appreciated. I love progress! So here is a question for my own benefit. Should I just go ahead and use the more recent toys as they appear in my code as I post something? Label the code as requiring the most recent release, but use those features?

Or where possible, should I put a test that will indicate if a feature exists in my code? For example, in one function, I essentially test to see if bsxfun exists before using it. Otherwise, I use one of the alternative schemes that we needed to use before bsxfun ever appeared. The problem is, that test itself consumes some cpu time.
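A feature test of this kind might look like the sketch below. The data and the variable names are assumptions for illustration, and the fallback uses repmat as one of the pre-bsxfun schemes.

```matlab
A = magic(4);
mu = mean(A, 1);                  % 1-by-4 row of column means

% Check once whether bsxfun is available, as a built-in or as a file.
hasBsxfun = exist('bsxfun', 'builtin') ~= 0 || exist('bsxfun', 'file') ~= 0;

if hasBsxfun
    centered = bsxfun(@minus, A, mu);
else
    % Fallback for releases without bsxfun: expand mu explicitly.
    centered = A - repmat(mu, size(A, 1), 1);
end
```

To keep the cost of the test down, the result of the check could be stored in a persistent variable inside the function so that exist runs only on the first call.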

Or should we be writing multiple code versions to post, so that those users who are still stuck in the Dark Ages of MATLAB can utilize efficient code? Obviously I hate this last idea, because I still find people occasionally running releases that are many years old. So I’d need to write many different versions of a piece of code.

In the case of something like bsxfun, it is possible to post a bsxfun substitute itself on the FEX so that those stuck in an old release can use your code. But a syntactical change like ~ cannot be gotten around by that expedient.

While I’d love to think that my work is so popular that it alone might convince someone that it is worth their money to spend for an upgrade, I know that to be silly. So what should my approach be here?

I like this new addition. It may be inconsistent “mathematically”, but it is highly readable, intuitive and maintainable. Those, like me, who often use the “dummy” vars would appreciate the simplicity.

I will personally continue using the dummy vars (and the accompanying “%#ok”) for backward compatibility with older Matlab versions, but in cases when this is a non-issue I will definitely use the new notation.

MathWorks should be applauded for not being afraid to modify/expand fundamental syntax notations where these are called for.

As a suggested improvement, I would remove the documented limitation of forcing a comma after the tilde sign: The parsing engine should be able to parse [a ~ c]=f() just like [a b c]=f() and [a,~,c]=f(). Similarly, the parser could recognize the ~b notation that Tom Elmer mentioned above (i.e., ~ immediately followed by a varname should use the NOT operator, otherwise use the new “disregard” feature). This should improve consistency and backward compatibility.

I think ~ is a good idea. I invoke SVD in four ways [u,s,v]=svd(A), s=svd(A), [u,s]=svd(A), [v,s]=svd(A’) and the matrices are large, so it matters that I do not save v, for instance.

As to people seeming to want to be able to do weird things like [~,out2]=~MyFunc(…), it just seems ambiguous to me in the multiple output case, since ‘=’ is kind of being overloaded here. I don’t see how such a notation would solve more problems than it would create.

But what I think people may be thinking of here, is the ability to do certain actions on function outputs in-place without copying, which was another post on this blog: http://blogs.mathworks.com/loren/2007/03/22/in-place-operations-on-data/

Grunde-

Yes, the only language feature. Lots of nice features in the desktop and other areas. See the release notes (linked above) for more details.

–Loren

Hi Loren, this is a great new feature.

For unused outputs, some authors have suggested e.g.

[x x] = fun;

(such as in the reply to Jason’s question on the newsgroup) although I must admit I’ve never been comfortable with the idea. Are there differences in stability/efficiency etc between this and the new syntax?

For unused inputs, if it’s true that only pointers are passed until the variable is actually used, does the new syntax make much difference in the efficiency of function calls?

Lastly, are there any plans for future versions to allow direct referencing of function outputs? Something like

(A^2)(1,1)

(since the documentation for subsref advises us not to use it!)

thanks,
Ben

Ben-

The construct

[x x] = fun;

relies on MATLAB always assigning left to right. It does that currently, but seems like a slightly precarious assumption to rely on. Hence I prefer the new syntax which is guaranteed to do what you want. So I guess that makes it more stable in your terms perhaps. Efficiency is the same as the function being called still sees nargout=2 and creates both outputs. The first output just gets dumped on completion.

[~ x] = fun;

I’ve mentioned in other posts that the direct referencing of function outputs is on our future wishlist. I don’t have a date for its appearance.

–Loren

I know it’s too late, but TMW could have implemented a new reserved keyword “dummy”. In that way, the same code, e.g., “[dummy, V1] = myfunc(..)” could have been used in both old and new ML releases.

Jos-

You are right. We could have. And though code would have run in all versions, behavior would differ. What if the code then indexed into dummy after the assignment, for example? It seemed like an arbitrary rule that would be easy to trip over, especially for non-native English speakers.

–loren

Loren,

I am luke-warm on the ~ operator used in this way. It seems to me the [x,x]=f() is just as easily readable, though it is true that this is based on MATLAB assigning the first variable first. However, I think TMW just as easily could have made this the official, documented and recommended behavior (especially since there are no current exceptions so backward compatibility would be kept) as making a new use for an old operator, no?

I don’t hate the idea, part of me even likes it. I just like writing code that will work on older versions too, as a lot of my colleagues still use versions going back to V6.
If something already works, why complicate things?

Just a thought.

Loren,

This new use of

~

does indeed seem nifty and I’m quite sure I’ll make use of the feature in some of my experimental code. On the other hand I will echo Mr. DErrico’s concern from the 12th of September. Maintaining backwards compatibility does rule out employing such features, at least in the near future. For instance, one of the larger packages I’m currently working on will currently run “out of the box” only in releases >= R2007a due to frequent use of BSXFUN. It’s a conscientious choice that I stand by because BSXFUN is such a wonderful tool, but it is nevertheless a concern when distributing the code to other users.

To finish on a more `upbeat’ note I’ll just say that, for me personally, the thread support for BSXFUN and SORT rates higher than

~

. It just seems more immediately useful and doesn’t preclude backwards compatibility.

Best regards,
Bård Skaflestad
SINTEF ICT, Applied Mathematics

Folks-

I just want to be clear that I hear your concerns about compatibility. And I understand that new functions cause fewer issues than operators, because you can potentially write a replacement function for use in previous versions.

Thanks for all of the thoughtful comments.

–Loren

I like the use of the tilde for skipping an output. My question is regarding the use of the tilde in the input list. But before I get to that: Is the new use of “~” documented online anywhere? I couldn’t find it.

I usually specify a missing input by passing empty, e.g., myFun(a,[],c). Inside the code I would define a default for the input argument that is used if the argument is not passed or is empty.

What does myFun(a,~,c) do? Is it just shorthand for [], that is the second input argument will exist inside the function but be empty? Or will the corresponding input argument be undefined inside the function (as if it were not passed)? If it is the latter, what would NARGIN report?

I don’t have 2009b in order to test these questions.

An additional side note: to me myFun(a,[],c) is more readable than myFun(a,~,c).

Thanks.

-g

g-

The link to the documentation is in the opening sentence. The input variable represented by ~ is not present at all in the program. You can’t refer to it or set it to default. Having no name, it is undefined. It means the code will accept but never use that input value. As mentioned above, nargin and nargout still count the number of inputs/outputs, and ~ counts as one for those purposes as it is a placeholder in a particular position. There is no programmatic way to interrogate for more information. In your example, nargin would still be 3 since the function gets called with 3 inputs.

–Loren

Hi Loren,
I think it is worth pointing out that many commands, such as “sort” which you used in your example, now run much faster due to better multithreading support. A couple of results which I tested just now:

Sort the columns of a random 3000×3000 matrix (averaged over 10 trials):
R = rand(3000); tic; sort(R); toc;
In 2009a -> 0.6849 seconds
In 2009b -> 0.3740 seconds

Sort a 10000000×1 vector (averaged over 10 trials):
R = rand(1e7,1); tic; sort(R); toc;
In 2009a -> 0.9439 seconds
In 2009b -> 1.4147 seconds

Many other functions have been sped up as well. Such performance enhancements are without a doubt the most welcome features in my opinion (you can never do things too fast).

Teja

I really like the new feature. I tend to do stuff like.

[i,ignore] = find(A) ;
clear ignore

which translates directly into what happens under the hood for

[i,~] = find(A) ;

Now we really need a way to determine if an output argument is “~” or not, both in an m-file and in a mexFunction. That way, the function could avoid computing it.

How does “~” behave as an input/output argument to a mexFunction? Is there a new set of mxGetWhatever functions to query the input/output arguments?

Can you call a function with “~” as an input, or is that syntax just there for the function header line in the m-file?

Folks-

Thanks for the comments about other new features that you like!

Tim-
There is currently no way to determine if a function is called with ~ on the LHS but I do understand why you’d like that. I would too. It’s on the enhancement list. As for a mexFunction, using ~ when calling it should be no different than calling a MATLAB program. I don’t know more details about the mexFunction interface, but I assume it’s like the MATLAB one. You can’t reference an input if you are ignoring it as an input. In a mex-file, you can simply not process the second input and ignore, for example. But there is no code checker in MATLAB that will give you a warning about that, unlike M-Lint with a MATLAB program.

–Loren

I was waiting for this for a long time!!!!!!!!!!!

I am also waiting for:

-C++ like macros (#define)
-do while statement

I guess they are not too difficult… Does anybody know why they are not included in MATLAB?

Luigi-

Please place your 2 requests using the support link on the right side of the page. Please include how/why you would like the features so people have a solid use case to evaluate. Thanks.

–Loren

Loren,

Thanks for clearing up my questions. I missed the link to the documentation and further, I missed that the tilde when used for input is for function declarations not function calls. I naturally assumed that since ~ skips outputs in the function call that its use in the input list would also be for function calls–but obviously this is wrong.

Can ~ be used to skip outputs in a function declaration? Probably not since the doc doesn’t mention it.

Skipping function outputs when a function is called, and skipping input arguments when a function is defined are not analogous, and I dislike that one symbol is used for both cases. I think this is what Peng (comment 17) was getting at. I don’t know how it is harmful, though, other than being confusing.

-g

Thanks g-

I bet you are right about Peng’s comment. You can’t use ~ to skip outputs in function declarations. If the user supplies a variable (forget ~ for now), s/he should get an output if the function can create one. The function can’t claim to return 3 outputs but not include, for example, the second one. What if the user called the function with that output? Didn’t seem like an idea that could be depended on by users.

–Loren

Hi Loren,

Yes, this is a good idea. But much like John DErrico, I won’t be using it for a while. Eventually we’ll start including new syntax into our MATLAB code, but right now I’m still avoiding || and && … :)
What is a good moment to drop support for older versions?

Tom Elmer (post 5) complained about the reuse of symbols. I think that the various meanings of ~ are clear enough. The same for the dot example you gave. A lot more difficult to distinguish are the two uses of the quote character. g’ and ‘g’ are very different things, and without syntax highlighting in the editor, they’re easily confused! I would have preferred if MATLAB used the double quote for strings, it’s the only character not yet being used, I think?

Loren,

Thanks for the feedback.

What I would like is something like this. If MATLAB calls my mexFunction with [x,y]=f(~,b) then inside the mexFunction f I should be able to query the first and 2nd argument to see if they are really there.

So why not pass me a null pointer for the 1st input argument to my mexFunction? that’s easy to do, and easy for me to check. Or pass an empty matrix, and then create a new query function that answers “do you exist?” for any input. The latter option would be safer, for backward compatibility.

And then for the output arguments, there could be another “do you exist?” query function that I could call.

It would make things like QR a lot easier, which has a boatload of “do this if you have 2 inputs and 1 output, something else if you have 1 input and 1 output, and …”, and likewise for many other mexFunctions.

These same functions, “do you exist?” applied to inputs and outputs, would be useful for m-file functions too. Something like this:

function [x,y]=funky(a,b)
% isargpresent is a proposed query function; it does not exist in MATLAB.
if (~isargpresent(a))
    fprintf ('Hey, you didn''t pass me anything for a!')
end
if (~isargpresent(x))
    fprintf ('Whatsa matter?! How come you don''t want x?') ;
end

Finally! Thanks for implementing this feature. Regarding Cris’s comment on support for older versions, using keywords such as ‘unused’ instead of ‘~’ might be a better move. Who in their right mind would use ‘unused’ to name a variable that they plan to use?

Loren: any ideas why it’s better to use ‘~’?

Hoi-

It could be better to use ~ so you don’t clutter your workspace with stuff you don’t need.

Tim-

In your first example, you are looking to query inputs. Using ~ in your case is really not much different than using []. In any case, MATLAB doesn’t allow you to optionally use some input if the signature in MATLAB (not C) says it won’t use it. I think we are designing the C-Mex to mirror MATLAB usage. So either the function requires input #2 or it doesn’t.

I understand you also want to query whether some outputs are not wanted. Not currently possible in MATLAB or mex. It is on the wishlist for the future.

–Loren

Tim,

I’m not sure you’re using ~ correctly in post 44. If by “If MATLAB calls my mexFunction with [x,y]=f(~,b)” you mean that you want to use ~ in the _input_ argument list when you _call_ the function, you can’t do that. Picture what would happen if you called, for example, PLUS like:

y = plus(x, ~);

What would you expect MATLAB to do in this case? You gave the PLUS function two inputs, as it expected, but you somehow expect it to ignore the second one?

You can use ~ in the _output_ argument list when you call the function (if you don’t care what value the function assigned to that output when you called it) and/or you can use it in the input argument list when you _define_ the function (if you’re going to ignore whatever the caller passed into the function.)

In a MEX-file, you can’t use ~ in the declaration, but you don’t need to. You can simply not reference that element of the array containing the right-hand sides, prhs. In MATLAB, doing that would earn you a warning from M-Lint:

function y = myfun(x, t) % t is not used
y = x.^2;

but M-Lint, as the name implies, doesn’t apply to MEX-file source code. Using ~ in the output argument list of a call to a function works the same way regardless of how the function is implemented — MEX, M-file, or built-in.

Hoi,

If we’d chosen to create a new keyword for this, it would have the potential to break older code that used the keyword as a variable or function name, as attempting to create a variable whose name is a keyword is an error:

for = 4;

and you can’t call a function whose name is a keyword — the keyword will take precedence.
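Whether a given name is already reserved can be checked with iskeyword and isvarname. The sketch below uses ‘unused’ only because it came up earlier in this thread.

```matlab
% 'for' is a reserved keyword, so it cannot be used as a variable name.
forIsKeyword   = iskeyword('for');      % true
forIsVarName   = isvarname('for');      % false

% 'unused' is an ordinary identifier today, so existing code may
% already use it as a variable or function name.
unusedIsKeyword = iskeyword('unused');  % false
unusedIsVarName = isvarname('unused');  % true
```

Promoting an ordinary identifier such as ‘unused’ to a keyword would flip both answers for it, which is the compatibility risk described above.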

Hi Loren,

I’m talking about turning ‘unused’ into a keyword that has the same behavior as ‘~’. That way, in R2009b there won’t be a variable called ‘unused’ to clutter the workspace, yet the code won’t break in older versions, where ‘unused’ simply continues to exist as a junk variable.

Even better, this scheme makes it possible to turn on/off this feature by enabling/disabling the keyword ‘unused’ without breaking any code in any version.

Cheers,
Hoi

Hoi-

Sorry I misunderstood you. See Steve Lord’s answer regarding keywords. Introducing new ones also breaks compatibility.

–Loren

I like this feature, but I don’t get why underscore wasn’t chosen instead. It’s used for the same purpose in other languages and it doesn’t mean anything else already(?) like ~ does (it’s not a valid variable name either)

Ole-

Choosing a symbol is partly a matter of taste, partly a matter of conforming to the rest of the language. We could have chosen _ as you suggest, but we preferred the look of ~ and the fact that it meant “not” because we thought it would help people realize there was no value available.

–Loren

What does the function see for the value of an unused input parameter? [], or undefined, or something else? If there are following defined parameters, what does nargin show?

i.e. if my function is defined as

myfunc(argOne, argTwo, argThree)

and it’s called like this:

myfunc('arg', ~, 'arg')

Is nargin 2, or 3?
Is argTwo undefined or empty or something else?

Thanks

Marc-

That’s not how to use the ~ feature. Instead, the function would be something like this:

function myfunc(arg1, ~, arg3)
nargin

and nargin returns 3. You can't, in the function body, refer to the second input. It's there because some other use requires the second input, which this function will totally ignore.

The other use of ~ is to ignore defined outputs. See the example in the post or in the documentation.

–Loren

These postings are the author's and don't necessarily represent the opinions of MathWorks.