Loren on the Art of MATLAB

Turn ideas into MATLAB

Note

Loren on the Art of MATLAB has been archived and will not be updated.

Benefits of Refactoring Code


I have seen a lot of code in my life, written by many different people for many different purposes, and in many different "styles". These styles range from quick-and-dirty (I only need to do this once) to fully optimized, documented, and tested (I want this to last a long time while other people use it). I have found, more often than I expected, that quick-and-dirty code quickly morphs into something useful and heavily used, but without the thought and care needed to make sure the code is really up to the task.
Today I want to argue that as soon as you pick up your quick-and-dirty code to use it again, it is time to refactor and apply some good engineering techniques to whip it into shape.

Refactoring leads to smaller components

First, break the code into small logical units. Each unit is more concise and more focused; it does less, and it's clear what it does and does not do. Decide on edge cases and error conditions, and deal with them in a way that is straightforward and likely to cause users of the module the least trouble.

Benefits of smaller components

Each piece is easier to understand

As I already said, each part does less and so it's easier to understand what each piece does. In fact, if you can get the piece of code to do one thing well, that often pays off. One technique for doing this is to reduce the branching (if-elseif-else) and instead have various branches relegated to separate functions. Another technique is to use an arguments block to check the input arguments. This generally takes up fewer lines of code and you are able to be concise and precise using it.
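Here is a minimal sketch of both techniques together. The function toKelvin and its helpers are hypothetical names made up for illustration; the idea is only that an arguments block handles the input checking, and each branch of what would have been an if-elseif-else chain lives in its own tiny function.

```matlab
function kelvin = toKelvin(value, unit)
% toKelvin  Convert a temperature to kelvin (hypothetical example).
    arguments
        value (1,1) double {mustBeReal, mustBeFinite}
        unit  (1,1) string {mustBeMember(unit, ["K", "C", "F"])} = "K"
    end

    % One small function per unit replaces an if-elseif-else chain.
    converters = struct( ...
        'K', @fromKelvin, ...
        'C', @fromCelsius, ...
        'F', @fromFahrenheit);
    convert = converters.(unit);
    kelvin = convert(value);
end

function kelvin = fromKelvin(value)
    % Already in kelvin, nothing to do.
    kelvin = value;
end

function kelvin = fromCelsius(value)
    kelvin = value + 273.15;
end

function kelvin = fromFahrenheit(value)
    kelvin = (value - 32) * 5/9 + 273.15;
end
```

Calling toKelvin(25, "C") goes through the validation once and then runs exactly one short helper. An unsupported unit is rejected by the arguments block before any conversion code runs.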

Reduced complexity has less overhead (mental and otherwise)

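When you have smaller components, they are frequently easier to understand, and much easier to test, especially if there are only a few code paths. You can check the complexity of your code using either of two options to checkcode: "-cyc" reports the McCabe cyclomatic complexity of each function, and "-modcyc" reports the modified cyclomatic complexity, which counts a whole switch statement once instead of once per case.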
checkcode(filename, "-cyc")
checkcode(filename, "-modcyc")
Complexity is reduced when you refactor the code so that each function contains fewer nested branching statements such as if and switch.
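If you would rather work with the numbers than read printed messages, checkcode can also return its messages as a struct array when you add the "-struct" option. The sketch below uses a hypothetical helper name, complexityReport, and assumes the message wording has the form "The McCabe cyclomatic complexity of 'name' is N"; if the wording differs in your release, the pattern needs adjusting.

```matlab
function report = complexityReport(filename)
% complexityReport  Table of McCabe complexity per function (hypothetical helper).
    arguments
        filename (1,1) string {mustBeFile}
    end

    % "-struct" returns the messages instead of printing them.
    info = checkcode(filename, "-cyc", "-struct");

    % Keep only the complexity messages; capture function name and value.
    % Assumption: the message text matches this pattern.
    tokens = regexp({info.message}, ...
        "complexity of '(\w+)' is (\d+)", "tokens", "once");
    tokens = tokens(~cellfun(@isempty, tokens));
    tokens = vertcat(tokens{:});        % N-by-2 cell: {name, value}
    if isempty(tokens)
        tokens = cell(0, 2);            % no complexity messages found
    end

    names  = string(tokens(:, 1));
    values = str2double(tokens(:, 2));

    % Most complex functions first.
    report = table(names, values, 'VariableNames', ["Function", "Complexity"]);
    report = sortrows(report, "Complexity", "descend");
end
```

With this in place, something like report = complexityReport("myFunction.m") puts the worst offenders at the top of the table, which is a reasonable place to start refactoring.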

Each piece is easier to test

With smaller pieces, each part is easier to test and debug, and edge conditions are easier to check. You can be certain more readily that you are covering all the bases. Of course, using one of the testing frameworks helps here.
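As a sketch of what this can look like with the function-based unit testing framework, here is a small test file. The function under test, clamp, is a hypothetical helper invented for this example, and it sits in the test file only to keep the sketch self-contained; normally it would live in its own file.

```matlab
function tests = clampTest
% clampTest  Function-based unit tests for a small, hypothetical helper.
    tests = functiontests(localfunctions);
end

function testValueInsideRangeIsUnchanged(testCase)
    verifyEqual(testCase, clamp(5, 0, 10), 5);
end

function testValueBelowRangeIsRaised(testCase)
    verifyEqual(testCase, clamp(-3, 0, 10), 0);
end

function testValueAboveRangeIsLowered(testCase)
    verifyEqual(testCase, clamp(42, 0, 10), 10);
end

function testInvertedLimitsError(testCase)
    verifyError(testCase, @() clamp(1, 10, 0), "clamp:invalidLimits");
end

% Code under test (hypothetical). Its name neither starts nor ends with
% "test", so functiontests does not treat it as a test.
function y = clamp(x, lo, hi)
    if lo > hi
        error("clamp:invalidLimits", ...
            "Lower limit must not exceed upper limit.");
    end
    y = min(max(x, lo), hi);
end
```

Save this as clampTest.m and run it with results = runtests("clampTest"). Because clamp has only two code paths, four short tests cover the normal cases and the error condition.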

Each piece is easier to reuse

Since we're working with smaller units, it's generally easier to reuse the pieces because, I hope, we've split things up in such a way that the interfaces are simpler to use. With each piece small, there's usually only a limited number of inputs required.

Each piece is easier to read and understand

With smaller input argument lists, I hope the calling syntax is shorter. If you call functions that take optional arguments from within your function, then the more complex the code, the further it tends to get indented to the right. Either code ends up out of view in your window, or the argument list runs on for several lines, especially if you are using the new name=value syntax. In that case, each "input" is potentially long, possibly causing extra line continuations to fit inside the editor window width you have set.

Example

I have an example from the File Exchange, X Steam. If you look at the code, there are lots of switch/case statements with if-else statements inside. How can you easily debug something near a particular line? It's hard! And that's despite the code not being poorly written or structured, apart from this nesting.
checkcode Xsteam.m -cyc
L 159 (C 14-19): The McCabe cyclomatic complexity of 'XSteam' is 322.
L 1430 (C 4-6): The value assigned to variable 'err' might be unused.
L 1441 (C 18-22): The McCabe cyclomatic complexity of 'v1_pT' is 2.
L 1457 (C 18-22): The McCabe cyclomatic complexity of 'h1_pT' is 2.
L 1473 (C 18-22): The McCabe cyclomatic complexity of 'u1_pT' is 2.
L 1491 (C 18-22): The McCabe cyclomatic complexity of 's1_pT' is 2.
L 1509 (C 19-24): The McCabe cyclomatic complexity of 'Cp1_pT' is 2.
L 1525 (C 19-24): The McCabe cyclomatic complexity of 'Cv1_pT' is 2.
L 1547 (C 18-22): The McCabe cyclomatic complexity of 'w1_pT' is 2.
L 1569 (C 18-22): The McCabe cyclomatic complexity of 'T1_ph' is 2.
L 1584 (C 18-22): The McCabe cyclomatic complexity of 'T1_ps' is 2.
L 1599 (C 18-22): The McCabe cyclomatic complexity of 'p1_hs' is 2.
L 1614 (C 20-26): The McCabe cyclomatic complexity of 'T1_prho' is 3.
L 1634 (C 18-22): The McCabe cyclomatic complexity of 'v2_pT' is 2.
L 1638 (C 1-2): The value assigned to variable 'J0' might be unused.
L 1639 (C 1-2): The value assigned to variable 'n0' might be unused.
L 1653 (C 18-22): The McCabe cyclomatic complexity of 'h2_pT' is 3.
L 1675 (C 18-22): The McCabe cyclomatic complexity of 'u2_pT' is 3.
L 1701 (C 18-22): The McCabe cyclomatic complexity of 's2_pT' is 3.
L 1727 (C 19-24): The McCabe cyclomatic complexity of 'Cp2_pT' is 3.
L 1749 (C 19-24): The McCabe cyclomatic complexity of 'Cv2_pT' is 3.
L 1777 (C 18-22): The McCabe cyclomatic complexity of 'w2_pT' is 3.
L 1805 (C 18-22): The McCabe cyclomatic complexity of 'T2_ph' is 8.
L 1857 (C 18-22): The McCabe cyclomatic complexity of 'T2_ps' is 8.
L 1912 (C 18-22): The McCabe cyclomatic complexity of 'p2_hs' is 8.
L 1966 (C 18-24): The McCabe cyclomatic complexity of 'T2_prho' is 4.
L 1991 (C 20-26): The McCabe cyclomatic complexity of 'p3_rhoT' is 2.
L 2000 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2011 (C 20-26): The McCabe cyclomatic complexity of 'u3_rhoT' is 2.
L 2020 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2031 (C 20-26): The McCabe cyclomatic complexity of 'h3_rhoT' is 2.
L 2040 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2053 (C 20-26): The McCabe cyclomatic complexity of 's3_rhoT' is 2.
L 2062 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2075 (C 21-28): The McCabe cyclomatic complexity of 'Cp3_rhoT' is 2.
L 2084 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2102 (C 21-28): The McCabe cyclomatic complexity of 'Cv3_rhoT' is 2.
L 2111 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2121 (C 20-26): The McCabe cyclomatic complexity of 'w3_rhoT' is 2.
L 2130 (C 1-2): The value assigned to variable 'pc' might be unused.
L 2148 (C 18-22): The McCabe cyclomatic complexity of 'T3_ph' is 4.
L 2182 (C 18-22): The McCabe cyclomatic complexity of 'v3_ph' is 4.
L 2216 (C 18-22): The McCabe cyclomatic complexity of 'T3_ps' is 4.
L 2249 (C 18-22): The McCabe cyclomatic complexity of 'v3_ps' is 4.
L 2283 (C 18-22): The McCabe cyclomatic complexity of 'p3_hs' is 4.
L 2318 (C 18-22): The McCabe cyclomatic complexity of 'h3_pT' is 5.
L 2348 (C 20-26): The McCabe cyclomatic complexity of 'T3_prho' is 3.
L 2369 (C 17-20): The McCabe cyclomatic complexity of 'p4_T' is 1.
L 2380 (C 17-20): The McCabe cyclomatic complexity of 'T4_p' is 1.
L 2391 (C 17-20): The McCabe cyclomatic complexity of 'h4_s' is 9.
L 2449 (C 17-20): The McCabe cyclomatic complexity of 'p4_s' is 4.
L 2462 (C 18-22): The McCabe cyclomatic complexity of 'h4L_p' is 5.
L 2488 (C 18-22): The McCabe cyclomatic complexity of 'h4V_p' is 5.
L 2513 (C 18-22): The McCabe cyclomatic complexity of 'x4_ph' is 3.
L 2525 (C 18-22): The McCabe cyclomatic complexity of 'x4_ps' is 4.
L 2541 (C 18-22): The McCabe cyclomatic complexity of 'T4_hs' is 15.
L 2606 (C 18-22): The McCabe cyclomatic complexity of 'h5_pT' is 3.
L 2630 (C 18-22): The McCabe cyclomatic complexity of 'v5_pT' is 2.
L 2634 (C 1-3): The value assigned to variable 'Ji0' might be unused.
L 2635 (C 1-3): The value assigned to variable 'ni0' might be unused.
L 2650 (C 18-22): The McCabe cyclomatic complexity of 'u5_pT' is 3.
L 2675 (C 19-24): The McCabe cyclomatic complexity of 'Cp5_pT' is 3.
L 2698 (C 18-22): The McCabe cyclomatic complexity of 's5_pT' is 3.
L 2724 (C 19-24): The McCabe cyclomatic complexity of 'Cv5_pT' is 3.
L 2752 (C 18-22): The McCabe cyclomatic complexity of 'w5_pT' is 3.
L 2781 (C 18-22): The McCabe cyclomatic complexity of 'T5_ph' is 3.
L 2798 (C 18-22): The McCabe cyclomatic complexity of 'T5_ps' is 3.
L 2814 (C 18-24): The McCabe cyclomatic complexity of 'T5_prho' is 3.
L 2836 (C 22-30): The McCabe cyclomatic complexity of 'region_pT' is 15.
L 2867 (C 22-30): The McCabe cyclomatic complexity of 'region_ph' is 18.
L 2953 (C 22-30): The McCabe cyclomatic complexity of 'region_ps' is 16.
L 3008 (C 22-30): The McCabe cyclomatic complexity of 'region_hs' is 33.
L 3168 (C 24-34): The McCabe cyclomatic complexity of 'Region_prho' is 17.
L 3238 (C 19-24): The McCabe cyclomatic complexity of 'B23p_T' is 1.
L 3245 (C 19-24): The McCabe cyclomatic complexity of 'B23T_p' is 1.
L 3255 (C 20-26): The McCabe cyclomatic complexity of 'p3sat_h' is 2.
L 3271 (C 20-26): The McCabe cyclomatic complexity of 'p3sat_s' is 2.
L 3285 (C 19-24): The McCabe cyclomatic complexity of 'hB13_s' is 2.
L 3299 (C 20-26): The McCabe cyclomatic complexity of 'TB23_hs' is 2.
L 3319 (C 29-44): The McCabe cyclomatic complexity of 'my_AllRegions_pT' is 13.
L 3348 (C 1-2): The value assigned to variable 'ps' might be unused.
L 3365 (C 29-44): The McCabe cyclomatic complexity of 'my_AllRegions_ph' is 14.
L 3408 (C 1-2): The value assigned to variable 'ps' might be unused.
L 3426 (C 21-28): The McCabe cyclomatic complexity of 'tc_ptrho' is 8.
L 3444 (C 7): Consider using newline, semicolon, or comma before this statement for readability.
L 3444 (C 9-10): Terminate statement with semicolon to suppress output (in functions).
L 3467 (C 30-46): The McCabe cyclomatic complexity of 'Surface_Tension_T' is 3.
L 3485 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_p' is 1.
L 3488 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_p' is 1.
L 3491 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_T' is 1.
L 3494 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_T' is 1.
L 3497 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_h' is 1.
L 3499 (C 26-37): The McCabe cyclomatic complexity of 'fromSIunit_h' is 1.
L 3501 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_v' is 1.
L 3503 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_v' is 1.
L 3505 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_s' is 1.
L 3507 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_s' is 1.
L 3509 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_u' is 1.
L 3511 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_u' is 1.
L 3513 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_Cp' is 1.
L 3515 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_Cp' is 1.
L 3517 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_Cv' is 1.
L 3519 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_Cv' is 1.
L 3521 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_w' is 1.
L 3523 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_w' is 1.
L 3525 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_tc' is 1.
L 3527 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_tc' is 1.
L 3529 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_st' is 1.
L 3531 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_st' is 1.
L 3533 (C 23-32): The McCabe cyclomatic complexity of 'toSIunit_x' is 1.
L 3535 (C 25-36): The McCabe cyclomatic complexity of 'fromSIunit_x' is 1.
L 3537 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_vx' is 1.
L 3539 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_vx' is 1.
L 3541 (C 24-34): The McCabe cyclomatic complexity of 'toSIunit_my' is 1.
L 3543 (C 26-38): The McCabe cyclomatic complexity of 'fromSIunit_my' is 1.
L 3550 (C 16-20): The McCabe cyclomatic complexity of 'check' is 28.
L 3570 (C 14): Terminate statement with semicolon to suppress output (in functions).
L 3571 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3581 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3582 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3592 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3593 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3605 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3606 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3626 (C 14): Terminate statement with semicolon to suppress output (in functions).
L 3627 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3637 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3638 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3648 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3649 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3660 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3661 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3681 (C 14): Terminate statement with semicolon to suppress output (in functions).
L 3682 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3692 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3693 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3703 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3704 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3714 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3715 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3725 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3726 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3736 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3737 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3747 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3748 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3755 (C 1-2): The preallocated value assigned to variable 'R3' might be unused.
L 3757 (C 5-6): The variable 'R4' appears to change size on every loop iteration. Consider preallocating for speed.
L 3759 (C 11): Terminate statement with semicolon to suppress output (in functions).
L 3760 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3764 (C 1-2): The preallocated value assigned to variable 'R3' might be unused.
L 3768 (C 11): Terminate statement with semicolon to suppress output (in functions).
L 3769 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3773 (C 1-2): The value assigned to variable 'R3' might be unused.
L 3777 (C 11): Terminate statement with semicolon to suppress output (in functions).
L 3778 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3798 (C 14): Terminate statement with semicolon to suppress output (in functions).
L 3799 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3809 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3810 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3820 (C 12): Terminate statement with semicolon to suppress output (in functions).
L 3821 (C 4): Terminate statement with semicolon to suppress output (in functions).
L 3835 (C 17-23): If you are operating on scalar values, consider using STR2DOUBLE for faster performance.
L 3835 (C 45-51): If you are operating on scalar values, consider using STR2DOUBLE for faster performance.
L 3836 (C 5-9): The value assigned to variable 'Check' might be unused.
L 3836 (C 10): Terminate statement with semicolon to suppress output (in functions).
L 3838 (C 9-11): The value assigned to variable 'err' might be unused.
With a cyclomatic complexity of 322 for the main function, meaning that many independent paths through the code, what are the chances that there are no issues, even with so many helper functions already split out? If I were using this for work I wanted to publish, I would need to make sure that all the paths I used were correctly computing what I needed. Since that's a hassle, I'd likely refactor the code.

Refactor - how?

Apart from going into the code and copying and pasting pieces elsewhere (possibly in the same file), you can use the refactoring tools in the Editor toolstrip or in the right-click context menu once you've made your selection.

How do you deal with your piles of code?

Do you let the code rule you or do you rule your code? Please post any additional techniques or benefits I have not mentioned right here.
Copyright 2021 The MathWorks, Inc.
