I've recently been offered the opportunity to review a new book, The Elements of MATLAB Style, by Richard K. Johnson, a contributor to the File Exchange (FEX). It's a great opportunity for me to see what's important in the eyes of one particular prolific MATLAB user. And it's a book worth investigating for questions of style, especially if you work in a group or organization where lots of code is shared and lots of people read and use it.
The first thing I like about Richard's book is its intent: to be a reference in the venerable tradition of The Elements of Style, by Strunk and White. (If you write in English, I highly recommend Strunk and White's book.) Neither book is comprehensive; rather, each attempts to boil ideas down to the ones with the largest pay-off and the ones where mistakes are often made. So you get some essentials, some pitfalls, and some conventions (some, but not all, particular to MATLAB).
Taking a look at the table of contents, we see first some high-level principles, followed by a small number of main topics:
- Files and Organization
with a few helpful lists bringing up the rear (e.g., keywords).
Each of these chapters ends with a summary section that pulls together the chapter's main themes. These summaries serve as a helpful review when you want to go back to look for more information. If a topic is listed there, you can be sure the chapter has some items to guide you.
I don't happen to agree with every choice Richard has made in terms of conventions, e.g., for layout or formatting. Nor does MathWorks follow all of these (or all of any convention, in some cases). I do agree that he has identified relevant topics that any group embarking on a project would do well to discuss and standardize.
I'd now like to take a little time to mention a few of the many points from the book that resonate with me. These are only a sampling, so don't read anything into the ones that I have not listed here!
- # 7 Split Long Code Lines at Graceful Points - I find this one useful as it is a total pain having to trail far off to the right in any editor, even though it is possible.
- # 10 Do Not Use Hard Tabs - This helps keep sanity when working among a group with possibly different editing environments.
- # 43 Use Meaningful Names for Variables with a Large Scope - This makes code much easier to read, understand, and debug, if necessary.
- # 69 Name Functions for What They Do - Since functions perform an action, the name should include information about the action.
- # 86 Use Sortable Numbering in Data Filenames - If you have many similar files of data, having a rational numbering scheme can only help you out.
- # 97 Be Sure That Comments Agree with the Code - I will never forget the time that my thesis advisor called me because he was really irritated. I had left him a copy of a Fortran program that had copious comments, the final one being "Ignore all the comments above; they were for a previous version."
- # 135 Avoid Cryptic Code - I have found that, generally, writing cryptic code buys fewer benefits than I expect and causes more headaches than it warrants. On occasion, I have used cryptic code for performance in something time-critical. When I do, I try to comment it fully, including in the comments a straightforward implementation that I have tested. That way, when the performance trade-offs change, I understand what the code is supposed to do and have two starting options for doing a code update.
- # 150, 151 Minimize the Use of Global Variables and Minimize the Use of Global Constants - I would say this even more strongly myself. There are superior techniques for dealing with information you want to share, whether they be function handles, classes and their properties, or some other methods. These techniques are much safer to use for many reasons, e.g., side effects (should any be desired) are more easily controlled, and the code is potentially better suited to parallelism.
- # 172 Use Parentheses - Clarity of meaning is paramount, especially if others need to understand, modify, or translate the code.
- # 176 Avoid Use of eval When Possible - I'm sure it doesn't seem so to some MATLAB users, but eval is avoidable most of the time.
- # 185-188 The first of these is Avoid Complicated Conditional Expressions - These entries contain some useful thoughts on dealing with conditional constructs, the ordering of the cases, etc.
- # 271-275 The first of these is Write Small Tests - I love that Richard has made testing a central tenet of this style guide. I don't see how programmers function well without a robust test suite.
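To make a few of the items above concrete, here is a minimal sketch of # 86, # 176, and # 150. It is my own illustration, not an excerpt from the book; the file prefix `run`, the field names `chan1`, `chan2`, ..., and the variable `gain` are hypothetical names chosen only for the example.

```matlab
% Illustrative sketch only; every name below is a made-up example.

% # 86: zero-padded numbers keep file names in numeric order when sorted
% as text, while unpadded numbers do not.
nFiles   = 12;
padded   = cell(1, nFiles);
unpadded = cell(1, nFiles);
for k = 1:nFiles
    padded{k}   = sprintf('run%03d.dat', k);   % run001.dat, run002.dat, ...
    unpadded{k} = sprintf('run%d.dat', k);     % run1.dat, run2.dat, ...
end
sortedPadded   = sort(padded);     % same order as created
sortedUnpadded = sort(unpadded);   % run10.dat lands before run2.dat

% # 176: a dynamic field name does the job people often reach for eval to
% do, e.g. eval(sprintf('chan%d = k^2;', k)).
data = struct();
for k = 1:3
    fieldName = sprintf('chan%d', k);
    data.(fieldName) = k^2;
end

% # 150: an anonymous function captures the value it needs when the
% handle is created, so no global variable is required.
gain  = 2.5;
scale = @(x) gain * x;
y     = scale([1 2 3]);
```

The struct keeps the generated names inside one variable, so the workspace is not littered with variables whose names exist only at run time, and the captured `gain` cannot be changed behind the function's back the way a global could.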
Congratulations to Richard for writing "The Elements of MATLAB Style." It's a book that I recommend you read. I encourage you to adapt the guidelines in a way suitable for your programming environment.
Published with MATLAB® 7.11
7 Comments
Why are hard tabs discouraged when multiple people are working with different editing environments? I have never understood this.
With hard tabs, everyone can set the tab width to their preferred setting. Some like an indentation of 2 spaces and some like 8 spaces.
If spaces are used instead of tabs, everyone has to agree on a fixed width.
My experience with tabs is that you set them to whatever, but people still use spaces in addition to line things up. When that happens and you use tabs and they differ for different people, code that lines up for one person doesn’t line up for the other one.
About not writing cryptic code, I have to agree and object (maybeish), because sometimes I’ve had to use the symbolic toolbox to generate analytical solutions to problems, and used that to generate code. In that kind of situation I’ve gladly taken the output and run. The snippet below is one such example, and even if the two expressions are unmaintainable, there is a known way to reproduce them. Maybe others have even more complicated machine-generated code?
l(2) = [ 1/2/(ex^2+ey^2+ez^2)*(2*ez*z0-2*zl*ez-2*yl*ey+2*ex*x0+2*ey*y0-2*xl*ex+2*(2*ex^2*zl*z0+2*ez*z0*ex*x0-ex^2*y0^2-ex^2*z0^2-ex^2*yl^2+ex^2*R^2-ex^2*zl^2-ey^2*xl^2-ey^2*z0^2-ey^2*x0^2+ey^2*R^2-ey^2*zl^2-ez^2*xl^2-ez^2*y0^2-ez^2*x0^2-ez^2*yl^2+ez^2*R^2+2*ex^2*yl*y0+2*ey^2*zl*z0+2*ey^2*xl*x0+2*ez^2*xl*x0+2*ez^2*yl*y0-2*ez*z0*yl*ey+2*ez*z0*ey*y0-2*ez*z0*xl*ex+2*zl*ez*yl*ey-2*zl*ez*ex*x0-2*zl*ez*ey*y0+2*zl*ez*xl*ex-2*yl*ey*ex*x0+2*yl*ey*xl*ex+2*ex*x0*ey*y0-2*ey*y0*xl*ex)^(1/2))];
l(1) = [ 1/2/(ex^2+ey^2+ez^2)*(2*ez*z0-2*zl*ez-2*yl*ey+2*ex*x0+2*ey*y0-2*xl*ex-2*(2*ex^2*zl*z0+2*ez*z0*ex*x0-ex^2*y0^2-ex^2*z0^2-ex^2*yl^2+ex^2*R^2-ex^2*zl^2-ey^2*xl^2-ey^2*z0^2-ey^2*x0^2+ey^2*R^2-ey^2*zl^2-ez^2*xl^2-ez^2*y0^2-ez^2*x0^2-ez^2*yl^2+ez^2*R^2+2*ex^2*yl*y0+2*ey^2*zl*z0+2*ey^2*xl*x0+2*ez^2*xl*x0+2*ez^2*yl*y0-2*ez*z0*yl*ey+2*ez*z0*ey*y0-2*ez*z0*xl*ex+2*zl*ez*yl*ey-2*zl*ez*ex*x0-2*zl*ez*ey*y0+2*zl*ez*xl*ex-2*yl*ey*ex*x0+2*yl*ey*xl*ex+2*ex*x0*ey*y0-2*ey*y0*xl*ex)^(1/2))];
Sounds great. I have another ‘dream book’ that would be very useful: ‘Best Practices for Managing Complex Matlab Projects with Multiple Users’. It would cover how to keep code organized and current, jointly develop code with other engineers/programmers/scientists, and how to document projects beyond just commenting within code. In other words how to apply standard software engineering practices to Matlab projects. Like many Matlab users, that’s not the world I came from, but could sure benefit from its insights.
The book does have chapters on modern development and documentation techniques, with MATLAB specifics. Professional software engineering practices have changed quite a bit in the last 10 years, and most of them fit well with MATLAB. Of course working successfully with a development team may be as much a matter of sociology…
What is your take on Richard’s comments with respect to vectorization? I do not have the most recent version of Matlab; have for…end loops gotten that much better?
I believe vectorization is still quite important, but there are certainly times where there is less performance difference between that and equivalent for-loops. For loops have gotten dramatically faster in recent years under many circumstances.
Also, he’s quite right (as I have written in blog posts) that it is possible to vectorize to the point where you need more memory than is reasonable. In that case, I recommend seeing if you can vectorize but work in chunks. Still, there are times when the code is MUCH clearer if written as a loop, and if performance is acceptable, as it might be these days, that’s fine. I still think mathematical code is often easier to read when vectorized since it’s closer to the matrix equations.
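As a rough illustration of vectorizing in chunks, the sketch below finds, for each point in a column vector, the distance to the nearest of a few centers. A fully vectorized version would build one n-by-m matrix of distances; here only a chunk of rows is in memory at a time. The problem, the variable names, and the chunk size of 256 are all assumptions made for the example, and `bsxfun` is used so the code does not depend on newer language features.

```matlab
% Sketch under stated assumptions: data, names, and chunkSize are
% illustrative, not taken from the book or the post.
x = linspace(0, 1, 1000).';      % n-by-1 query points
c = [0.1 0.5 0.9];               % 1-by-m centers
n = numel(x);

chunkSize = 256;                 % assumed value; tune to available memory
nearest   = zeros(n, 1);         % preallocate the result

for first = 1:chunkSize:n
    last = min(first + chunkSize - 1, n);
    idx  = first:last;
    % Vectorized inside the chunk: at most a chunkSize-by-m matrix exists.
    d = abs(bsxfun(@minus, x(idx), c));
    nearest(idx) = min(d, [], 2);
end

% Fully vectorized reference, building the whole n-by-m matrix at once.
reference = min(abs(bsxfun(@minus, x, c)), [], 2);
```

The loop runs only ceil(n / chunkSize) times, so most of the work still happens in vectorized operations, while peak memory for the temporary matrix is bounded by the chunk size rather than by n.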