One of the most basic concepts of object oriented programming is that
of inheritance, in which we think of the IS-A property
and say that thing A is a thing B, but different or more specific
in certain ways. For example, a toothbrush is a dental hygiene
instrument.
Inheritance leads us to speak of subclasses and
superclasses. (A class is a way of abstracting
the properties of a kind of object, while a class instance (or object) is
an actual item of that kind; e.g., you could have a class called
Toothbrush, of which there may be four instances hung in the rack by your
bathroom sink.) The terms are not absolute, but rather form a relationship
between two classes (class DentalHygieneInstrument being a superclass of
class Toothbrush). Every instance of the subclass can also be considered
(though not exclusively) as an instance of the superclass.
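The vocabulary above can be sketched in a few lines of Python. This is only an illustration: the class bodies are empty placeholders, and the names `rack`, `all_are_instruments`, `toothbrush_is_subclass`, and `instrument_is_subclass` are made up for the example.

```python
class DentalHygieneInstrument:
    """Superclass: what all dental hygiene instruments have in common."""


class Toothbrush(DentalHygieneInstrument):
    """Subclass: a Toothbrush IS-A DentalHygieneInstrument."""


# Four instances of the one class, like the four brushes in the rack.
rack = [Toothbrush() for _ in range(4)]

# Every Toothbrush instance can also be treated as a DentalHygieneInstrument.
all_are_instruments = all(isinstance(t, DentalHygieneInstrument) for t in rack)

# The relationship is between the two classes, and it runs in one direction only.
toothbrush_is_subclass = issubclass(Toothbrush, DentalHygieneInstrument)
instrument_is_subclass = issubclass(DentalHygieneInstrument, Toothbrush)
```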
This can be useful because some or all of the code that a programmer writes
to manipulate objects of the superclass also applies to objects of its
subclasses, even if they weren't implemented (or even being considered)
when the code was written. The Sterilize method of a
DentalHygieneInstrument works for a Toothbrush as well as for a Burnisher.
But this is not always the case.
After all, while a Toothbrush is a DentalHygieneInstrument, it is
not a Burnisher (which is also a DentalHygieneInstrument), so clearly all
DentalHygieneInstruments are alike in many ways, but different in some ways.
Perhaps the sterilization procedure for a Burnisher involves first roughing
up the surface with sandpaper[1] before performing the usual
(common) procedure. While the Toothbrush class will not define a Sterilize
method of its own (causing the language to search upward through the
inheritance hierarchy until it finds one to use), Burnisher must do so.
But there is no need to reproduce the code found in DentalHygieneInstrument's
Sterilize method in the Burnisher class; instead, Burnisher's Sterilize
can first sand itself, and then invoke the Sterilize method from its
superclass.
    class Burnisher(DentalHygieneInstrument):
        def Sterilize(self):
            self.SandTip()
            DentalHygieneInstrument.Sterilize(self)
This is an example of the near-universal rule that a subclass's method
which overrides a method of a superclass should call the overridden method,
either before or after it does its own thing.
Notice that that code cannot use self.Sterilize() because
that would call its own specialized method, rather than the shared one
from the superclass.
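To make the lookup and the explicit superclass call concrete, here is a small runnable sketch. The `log` attribute and the strings appended to it are assumptions added purely so the order of steps can be observed; the article does not say what Sterilize or SandTip actually do.

```python
class DentalHygieneInstrument:
    def __init__(self):
        self.log = []  # hypothetical: records the steps performed, in order

    def Sterilize(self):
        self.log.append("common sterilization")


class Toothbrush(DentalHygieneInstrument):
    pass  # no Sterilize of its own; lookup finds the inherited one


class Burnisher(DentalHygieneInstrument):
    def SandTip(self):
        self.log.append("sand tip")

    def Sterilize(self):
        self.SandTip()
        # Name the superclass explicitly. Writing self.Sterilize() here would
        # call this same method again and recurse without end.
        DentalHygieneInstrument.Sterilize(self)


brush = Toothbrush()
brush.Sterilize()

burnisher = Burnisher()
burnisher.Sterilize()
```

After this runs, `brush.log` holds only the common step, while `burnisher.log` holds the sanding step followed by the common one.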
Now some languages give you a shortcut for doing that. For example, in
Java, we can write super.Sterilize(). Java knows that the
superclass is DentalHygieneInstrument, so it knows to which method that
expression is referring.
That shortcut, however, is not all that useful. It saves a few keystrokes,
and it doesn't have to be changed if the code is copied to another class.
But it falls down in the face of multiple inheritance.
My examples will change now; I've run out of gas on the dental
instrument thing….
Multiple inheritance enters the picture when we have a Thing A that not only
is a Thing B, but also is a Thing C (which B is not).
Sometimes, B and C are quite unrelated concepts – in which case A is
often called a mixin class – and sometimes not. More on that
in a moment. Time for a diagram:
    class B         class C
          \         /
           \       /
            \     /
             \   /
              \ /
            class A
In light of the rule about calling your superclass's method from your own
overriding one, we can see that a method in A may have to call the same method
in both B and C. (In a mixin scenario, it is often the case that a particular
method is known to be inherited only from one or the other.) Clearly, the
super keyword will not do here. So, for a method M, an
overriding implementation in A will look like:
    class A(B, C):
        def M(self):
            B.M(self)
            C.M(self)
            # do A's M stuff here
In fact, in earlier versions[2] of
Python, if A wanted the M of
both B and C to be called, it had to implement its own M even if it had
nothing to add, because Python would automatically call only one of them.
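A quick sketch of that behavior, assuming B and C each define M and neither makes any super call. Ordinary attribute lookup stops at the first class that defines M, so only one implementation runs unless the subclass calls both itself. The `calls` list and the class `A2` are hypothetical names used only for this demonstration.

```python
calls = []  # hypothetical log of which M implementations ran


class B:
    def M(self):
        calls.append("B.M")


class C:
    def M(self):
        calls.append("C.M")


class A(B, C):
    pass  # no M of its own


A().M()  # lookup stops at the first class that defines M
inherited_only = list(calls)

del calls[:]  # reset the log for the second experiment


class A2(B, C):
    def M(self):
        B.M(self)
        C.M(self)
        calls.append("A2.M")  # stands in for "A's M stuff"


A2().M()
explicit = list(calls)
```

Here `inherited_only` contains just B's call, while `explicit` contains both superclass calls followed by the subclass's own work.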
That's a bit yucky, but wait! Another large wooden shoe is about to be
thrown into the mix. Let's get a bit more personal about A, B, and C:
    class Cumulus       class Nimbus
            \             /
             \           /
              \         /
               \       /
                \     /
                 \   /
            class CumuloNimbus
What does this suggest to you? What am I holding back? Remember I said
that a mixin class inherited from two unrelated classes. When the superclasses
are clearly related, it is not a mixin. Cumulus and Nimbus are obviously
related — because each is a subclass of … drum roll …
                class Cloud
                 /       \      Scud
                /         \     Save
               /           \
              /             \
    class Cumulus         class Nimbus
    Save      \             /   Rain
               \           /    Save
                \         /
                 \       /
            class CumuloNimbus
This diagram shows the genesis of the aptly-named diamond problem. Consider
a method being called with an instance of CumuloNimbus. If we're calling
cn.Scud(), that's not a problem: Scud is
a method of Cloud and is not overridden by any of these other classes, because
clouds all scud across the sky in the same way.
But in the face of reimplementations or extensions, problems occur. Consider a
call to cn.Rain(). This is not a problem, because neither
Cumulus nor Cloud defines Rain, so the method implemented by Nimbus will
be used. Calling cn.Save(), on the other hand, is a different
kettle of fish. (Let's assume Save is a method that saves the object's data
to an open file.) We need to add a Save method to CumuloNimbus, so that it
can call Cumulus.Save and Nimbus.Save, in accordance with our rule about
calling the overridden method. And it's obviously very important that they
both get called, because our cn object may contain data defined
by Cumulus, Nimbus, and even Cloud. So we add
    class CumuloNimbus(Cumulus, Nimbus):
        def Save(self):
            Cumulus.Save(self)
            Nimbus.Save(self)
and heave a sigh of relief. (In this example, it doesn't matter whether
CumuloNimbus adds more data that needs to be saved.)
Oh, no! Remember how diligent we all are when it comes to calling the
method in the superclass when we override it? Let's look at the sequence
of methods that get called when we do a cn.Save():
    CumuloNimbus.Save
        Cumulus.Save
            Cloud.Save
        Nimbus.Save
            Cloud.Save
The Save method in Cloud got called twice! Depending on the nature of the
method, this may or may not be a problem; in this case, it clearly is.
Note that if we call Save on an instance of Cumulus, there is no problem,
yet the situation is intolerable when called with cn, which
is a Cumulus!
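The double call is easy to reproduce. In this sketch each Save merely appends its name to a hypothetical `calls` list rather than writing to a file, and every override calls its superclass by name, as in the examples so far.

```python
calls = []  # hypothetical log standing in for real file output


class Cloud:
    def Save(self):
        calls.append("Cloud.Save")


class Cumulus(Cloud):
    def Save(self):
        calls.append("Cumulus.Save")
        Cloud.Save(self)


class Nimbus(Cloud):
    def Save(self):
        calls.append("Nimbus.Save")
        Cloud.Save(self)


class CumuloNimbus(Cumulus, Nimbus):
    def Save(self):
        calls.append("CumuloNimbus.Save")
        Cumulus.Save(self)
        Nimbus.Save(self)


cn = CumuloNimbus()
cn.Save()
cloud_save_count = calls.count("Cloud.Save")
```

With explicit superclass calls, `cloud_save_count` comes out as 2: Cloud's data would be written to the file twice.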
While he was creating version 2.2 of Python, Guido
noted that this problem could become much more prevalent than it had
been previously, because the addition of an ultimate superclass –
which earlier Pythons did not have – could cause diamonds to appear
where they had not been before, even in the absence of the application
creating them; namely, in the situation with A being a subclass of B and C,
B and C might now (and should) be subclasses of object.
He wouldn't put us in such a fix without hope, though. A new builtin
function called super will be our salvation. It can be
used by our various classes to call overridden methods in a safe and
complete way. How can it do this? Through its knowledge of the method
resolution order.
The method resolution order is a list associated with each class that
tells it where to look for an attribute when an instance does not include
it; this list is built when the class statement is executed. In previous
Pythons, the MRO for our diamond above would be
[CumuloNimbus, Cumulus, Cloud, Nimbus, Cloud]
The presence of Cloud in the list twice was never a problem, because the
search would never find the attribute in the second one if it hadn't found
it in the first one already. At version 2.2, these duplicates are removed,
and our MRO is now
[CumuloNimbus, Cumulus, Nimbus, Cloud]
This ordering is easily derived from the previous one, by retaining only
the last occurrence of any class that appears more than once. The
super function can use the MRO, together with the class of
an instance, to unambiguously determine which class should be called next.
It is important to remember here that, e.g., when the Save method in
Cumulus is executing, its self may refer to either a Cumulus
instance or a CumuloNimbus instance; its code doesn't need to care which
(if it did, the whole inheritance idea would be useless), but
super, under the covers, cares very much.
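The derivation described above can be checked against what the interpreter reports in `__mro__`. Two caveats: the helpers `depth_first` and `keep_last_occurrence` are hypothetical names written for this sketch, not anything Python provides, and Python has computed the MRO with the C3 linearization algorithm since version 2.3. For this particular diamond, C3 gives the same answer as the "keep the last occurrence" rule; that should not be assumed for every hierarchy.

```python
class Cloud:
    pass


class Cumulus(Cloud):
    pass


class Nimbus(Cloud):
    pass


class CumuloNimbus(Cumulus, Nimbus):
    pass


def depth_first(cls):
    """Old-style search order: the class, then each base depth-first, left to right."""
    order = [cls]
    for base in cls.__bases__:
        if base is not object:  # leave out the ultimate superclass for clarity
            order.extend(depth_first(base))
    return order


def keep_last_occurrence(classes):
    """Drop every duplicate except the last occurrence of each class."""
    result = []
    for i, cls in enumerate(classes):
        if cls not in classes[i + 1:]:
            result.append(cls)
    return result


old_order = [c.__name__ for c in depth_first(CumuloNimbus)]
deduplicated = [c.__name__ for c in keep_last_occurrence(depth_first(CumuloNimbus))]

# What the interpreter actually computed, minus the trailing `object`.
actual_mro = [c.__name__ for c in CumuloNimbus.__mro__ if c is not object]
```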
A class X always uses super in the same way: the two arguments
it passes are always itself (class X) and self. The function then
returns an object through which the appropriate implementation of the
sought attribute can be found. So, let's look at the implementation of
our various Save methods:
    class Cloud(object):
        def Save(self):
            # do actual saving, no overridden method to call
            pass

    class Cumulus(Cloud):
        def Save(self):
            super(Cumulus, self).Save()
            # now save Cumulus-specific data

    class Nimbus(Cloud):
        def Save(self):
            super(Nimbus, self).Save()
            # now save Nimbus-specific data

    class CumuloNimbus(Cumulus, Nimbus):
        def Save(self):
            super(CumuloNimbus, self).Save()
            # now save CumuloNimbus-specific data
Did you notice that, unlike our previous examples, there is only one
call to super in CumuloNimbus's Save? This is the case no matter how
many superclasses it has, because the super mechanism causes
every Save method in the hierarchy to be called, once and only once[3].
It does this by looking at the MRO of the class of the instance (not
the class whose code is executing), locating the executing class within
that list, and making the next entry the starting point for the search for the
attribute. Consider the super call in Cumulus.Save. If self is
an instance of Cumulus, the MRO is [Cumulus, Cloud] and super
determines that Cloud is the place to begin looking for the inherited
Save method. But if self is a CumuloNimbus, the
MRO is the one I listed earlier, and it finds Nimbus instead!
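That dependence on the instance's class can be observed directly. In the sketch below, each Save simply returns its own name (an assumption made so the result is visible), and the same expression `super(Cumulus, ...)` is evaluated against two different kinds of instance.

```python
class Cloud:
    def Save(self):
        return "Cloud.Save"


class Cumulus(Cloud):
    def Save(self):
        return "Cumulus.Save"


class Nimbus(Cloud):
    def Save(self):
        return "Nimbus.Save"


class CumuloNimbus(Cumulus, Nimbus):
    pass


# Same executing class (Cumulus), two different instance classes.
from_cumulus = super(Cumulus, Cumulus()).Save()
from_cumulonimbus = super(Cumulus, CumuloNimbus()).Save()
```

For a plain Cumulus the search starts at Cloud; for a CumuloNimbus it starts at Nimbus, the entry after Cumulus in that instance's MRO.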
So, given the use of super in our code now, when we execute
cn.Save(), these methods are called:
    CumuloNimbus.Save
        Cumulus.Save
            Nimbus.Save
                Cloud.Save
Problem solved!
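The call sequence can be confirmed with a runnable version of the four classes. As before, the `calls` list is a hypothetical stand-in for real saving, and each method logs its name on entry before delegating through super.

```python
calls = []  # hypothetical log standing in for real file output


class Cloud(object):
    def Save(self):
        calls.append("Cloud.Save")  # end of the chain, nothing to delegate to


class Cumulus(Cloud):
    def Save(self):
        calls.append("Cumulus.Save")
        super(Cumulus, self).Save()


class Nimbus(Cloud):
    def Save(self):
        calls.append("Nimbus.Save")
        super(Nimbus, self).Save()


class CumuloNimbus(Cumulus, Nimbus):
    def Save(self):
        calls.append("CumuloNimbus.Save")
        super(CumuloNimbus, self).Save()


CumuloNimbus().Save()
diamond_calls = list(calls)

del calls[:]  # reset, then check that a plain Cumulus still works
Cumulus().Save()
cumulus_calls = list(calls)
```

Cloud.Save now appears exactly once in `diamond_calls`, and a plain Cumulus instance still reaches Cloud.Save directly, even though the code in Cumulus is unchanged between the two cases.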
[1] I obviously know nothing about the care and handling of
dental instruments. Don't ask where that example came from.
[2] Prior to version 2.2, which began the long-awaited fixing
of the so-called type-class split, a major wart in Python.
[3] Real Magic™ happening here. Note that the calls to
Save invoked through the super-object do not pass self as an argument! And yet, the super function
(actually the constructor for a super object) cannot create
a bound method, because it doesn't know what method you're going to attempt to call.
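The binding described in note [3] can be poked at in current Python 3. This sketch assumes the same diamond as above, with Save bodies that just return a name; the variable names are illustrative. The method is chosen and bound only when the attribute is looked up on the super object, not when the super object is created.

```python
class Cloud:
    def Save(self):
        return "Cloud.Save"


class Cumulus(Cloud):
    def Save(self):
        return super(Cumulus, self).Save()


class Nimbus(Cloud):
    def Save(self):
        return "Nimbus.Save"


class CumuloNimbus(Cumulus, Nimbus):
    pass


cn = CumuloNimbus()
proxy = super(Cumulus, cn)  # no method has been chosen yet
method = proxy.Save         # the lookup, and the binding, happen here

bound_to_cn = method.__self__ is cn
found_in_nimbus = method.__func__ is Nimbus.__dict__["Save"]
result = method()           # self is supplied by the bound method
```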