The Diamond of Breadth

By Andy Campbell, June 8, 2015

Sometimes, Java® bugs me.

Don't get me wrong, in many ways it is a beautiful language full of elegant structure and all the tools you need to help drive robust architectures. However, the flip side is that this structure and rigid set of rules add layers of complexity and require mundane, uninteresting, passion-zapping boilerplate code to achieve the interfaces you want while also getting the code reuse that keeps you productive.

On the other hand, sometimes C++ scares me.

C++ is much closer to the metal, and allows you to shoot yourself in the foot if you can't recite every section of the C++ standard verbatim or know precisely how each compiler implements the standard. This is fantastic, allowing you full access and full control as to how the streaming bits of ones and zeros dance on the hardware, and simultaneously devastating when these same ones and zeros crash and burn instead.

One interesting case study comparing these approaches is the (in)famous diamond of death problem with multiple inheritance. C++ supports multiple inheritance, and because of the default C++ semantics the diamond is very dangerous when you encounter it. Take a simple diamond structure: a base class A, two classes B and C that each derive from A, and a fourth class that derives from both B and C.

By default, the underlying C++ objects that are created actually contain two instances of A, one for each path through the inheritance. This can cause a world of pain, especially when casting at runtime, where the object can be sliced in unforeseen and unexpected ways. When there are two truths for what A is, you can also see how you can get into an inconsistent state: if you change a property through a pointer to B and access that property through a pointer to C, you are actually setting and accessing two different copies of that property!
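To make the two copies concrete, here is a minimal C++ sketch of the diamond with default (non-virtual) inheritance. The name D for the bottom class, the `value` member, and the helper functions are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical classes named after the diamond in the text.
struct A { int value = 0; };
struct B : A {};      // non-virtual inheritance: B carries its own A
struct C : A {};      // ... and so does C
struct D : B, C {};   // D therefore contains two distinct A subobjects

// Writes through the B path, then reads through the C path.
// Returns what C sees, which the write through B never touched.
int write_via_b_read_via_c(D& d, int new_value) {
    B* as_b = &d;
    C* as_c = &d;
    as_b->value = new_value;   // modifies the A inside B
    return as_c->value;        // reads the A inside C, a different copy
}

// True when the two inheritance paths lead to different A subobjects.
bool has_two_a_subobjects(D& d) {
    A* via_b = static_cast<B*>(&d);
    A* via_c = static_cast<C*>(&d);
    return via_b != via_c;
}

// Small demonstration; note that writing d.value directly would not
// even compile here, because the compiler cannot tell which A is meant.
void demo_two_copies() {
    D d;
    assert(has_two_a_subobjects(d));
    assert(write_via_b_read_via_c(d, 42) == 0);  // C's copy is still 0
}
```

Calling `demo_two_copies()` from a `main` function should pass both assertions: the write through B leaves the copy reachable through C unchanged.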

Now, as we learn more C++ there are ways to avoid this ambiguity, but in general I think it's a fair claim that using multiple inheritance in C++ is an advanced maneuver that can be fraught with peril, and you had better know fully and completely what you are doing in order to avoid falling into some of these dangerous pitfalls. The resulting culture has been to avoid multiple inheritance, largely due to this behavior of the diamond problem.
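One such way, which the post does not name explicitly, is virtual inheritance. The sketch below assumes the same hypothetical class names (A, B, C, and a bottom class D) and a hypothetical `value` member; with `virtual` on both middle classes, the bottom class holds a single shared A.

```cpp
#include <cassert>

// Same hypothetical diamond, but B and C now inherit from A virtually.
struct A { int value = 0; };
struct B : virtual A {};
struct C : virtual A {};
struct D : B, C {};   // D contains exactly one A, shared by B and C

// Writes through the B path, then reads through the C path.
// Both paths now reach the same A, so the new value is returned.
int write_via_b_read_via_c(D& d, int new_value) {
    B* as_b = &d;
    C* as_c = &d;
    as_b->value = new_value;
    return as_c->value;
}

// True when both inheritance paths lead to the same A subobject.
bool shares_one_a_subobject(D& d) {
    A* via_b = static_cast<B*>(&d);
    A* via_c = static_cast<C*>(&d);
    return via_b == via_c;
}

// Small demonstration; d.value is unambiguous here because only
// one A exists inside D.
void demo_shared_base() {
    D d;
    assert(shares_one_a_subobject(d));
    assert(write_via_b_read_via_c(d, 42) == 42);
    assert(d.value == 42);
}
```

The catch is that the authors of B and C have to opt in with `virtual` ahead of time, which is part of why this remains an advanced maneuver.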

In fact, this problem has had a direct impact on the design of languages like Java and C#, which specifically prohibit multiple inheritance of properties and methods. You can, of course, implement many interfaces, but those interfaces cannot have implementations. (Yes, in Java 8 there are now default methods for interfaces, but interfaces can't contain any properties for those methods to operate on.) The result is a ton of boilerplate code where Java objects implement interfaces but need to hardcode the link from that interface to some implementation that is shared across many objects. This boilerplate code sometimes boils my blood! OK, I'm not that mad at it, but it sounded good. I do think that this inability to inherit content from two base classes has resulted in far more code than is truly needed, and to me code is a liability.

The issue comes down to a lack of language features. The disdain for the diamond of death inheritance structure is largely due to the tricky and dangerous language semantics of C++, but when we are operating in a language that handles it more safely, many of the reasons for avoiding it disappear. MATLAB, for example, handles multiple inheritance in a much more rational and understandable way. Encountering a diamond structure is not as severe as doing so in C++, and sometimes it may even be perfectly fine. For example, there only exists a single instance of the common base class (A above), and method ambiguities are caught and prevented.

Ultimately, I often find when having engaging discussions about this problem that many view multiple inheritance as a smell and avoid the diamond structure at all costs. However, when we start getting into why they do so it is hard to find an answer that is independent from specifics of a particular language like C++.

I am not advocating rushing out and adding a bunch of diamond structures to your code! Indeed it often does signal a possible abstraction problem and there are subtleties (e.g. there is only one instance of the shared base class, but MATLAB invokes its constructor twice). However, in MATLAB it is much more manageable, and ultimately getting comfortable with the diamond problem opens you up to the use of mixins. The use of mixins is a powerful design tactic that leverages multiple inheritance and is an alternative to composition that similarly helps keep hierarchies wide. Let's get into those next time.

What do you think? Am I missing anything here? Can you help describe fundamental problems with the diamond problem that transcend language semantics? Have you ever encountered the diamond in MATLAB, deliberately or accidentally, and if so what effect did it have?

Published with MATLAB® R2015a