Steve on Image Processing and MATLAB

Concepts, algorithms & MATLAB

About the unused-argument syntax in R2009b

Last fall Loren wrote a blog post about a new syntax in R2009b for ignoring function inputs or function outputs. For example, suppose you call sort and only want the second output. In R2009b you can do that using this syntax:

 [~, idx] = sort(A);

The tilde character here is used as a placeholder to indicate that you don't need the first output argument of sort.

For another example, suppose you write a function that has to take three input arguments (because another function is always going to pass it three arguments), but your function doesn't need the second argument. Then you can write the first line of your function this way:

 function out = myfun(A, ~, C)

This new syntax has drawn a startling amount of discussion, both in the comments on Loren's post as well as in the comp.soft-sys.matlab newsgroup. (See the thread "Getting indexes of rows of matrix with more than n repetitions," for example.)

Responses to this new syntax have fallen into roughly four categories:

  • "Finally, I've been waiting for this!"
  • Clarification questions about how it works and how to use it.
  • Complaints about the specific syntax chosen and suggestions for alternatives.
  • "The intro of the tilde op was nothing but a big, useless blunder."

The passionate arguments in comp.soft-sys.matlab caught my eye and have prompted me to add my two cents here. Although I have no particular expectation of changing anyone's mind, I thought it might be interesting to address one particular question that was expressed well by Matt Fig: "If something already works, why complicate things?"

Matt means that we already have at least a couple of ways to ignore an output variable. The first is to use a variable name that (hopefully) makes the programmer's intent clear:

  [unused, idx] = sort(A);

The second technique takes advantage of the way MATLAB assigns function outputs from left to right:

  [idx, idx] = sort(A);

So why go to the trouble of introducing a new syntax?

Well, you'll probably get a somewhat different answer from every MATLAB developer you ask. Some might even agree. (I'll note, however, that this was one of the least-controversial MATLAB syntax proposals ever considered by the MATLAB language team.)

The proposed syntax was originally considered years ago, sometime around 2001. The proposal was approved internally at that time but then didn't get implemented right away because of competing priorities.

To understand why we eventually did decide to go ahead and implement it, you have to understand how this fairly minor syntactic issue is connected to our long-term efforts to do something much more important: provide automatic (and helpful!) advice to MATLAB users about their code. If you've used the MATLAB Editor at all over the last few years, you've probably observed how it marks code sections and makes suggestions to you. These suggestions fall into several categories, such as errors, possible errors, performance, new features, etc.

One of the things we try to flag is potential programming errors. A coding pattern that very often indicates a programming error is when you save a computed value by assigning it to a variable, but then you never use that saved value. At the very least, that pattern may indicate "dead code," or code that was doing something useful at one time but is now just cruft.

Both of the conventions mentioned above ([unused,idx] = sort(A) and [idx,idx] = sort(A)) exhibit this pattern of computing and saving a value and then never using it.

That might not be obvious for the [idx,idx] = sort(A) case, but it's true. The first output of sort is assigned to the variable idx. Then the second output is assigned to idx, causing the first output value to be discarded without ever being used.

These cases aren't programming errors, though, because we've given you no other way to ignore outputs. We can't automatically and reliably distinguish between the intentional [idx,idx] = sort(A) and the similar-looking [y,y] = foobar(x) that is the result of a typo.

I think it's important for the programmer to communicate his or her intent very clearly (which is why I tend to prefer the [unused,idx] = sort(A) convention). It is useful to have a way to communicate intent syntactically instead of by convention. And the new syntax helps us, in a small way, progress toward our long-term goal of helping MATLAB users write better MATLAB code.

OK, fire away!




Published with MATLAB® 7.9

|

Comments

To leave a comment, please click here to sign in to your MathWorks Account or create a new one.