1. Model Inheritance
...in Django
This presentation is about a feature of the Django Object Relational Mapping layer called
“Model Inheritance”.
2. Impedance Mismatch
First some background. The root of the problem that Object Relational Mapping (ORM)
systems attempt to solve is a fundamental “Impedance Mismatch” between the object-
oriented world inhabited by OO programming languages, such as Python, and the relational
database world as defined by RDBMSs such as PostgreSQL, Oracle and MySQL.
These two worlds have very different ways of looking at how data should be organized.
3. Object Oriented Values
One of the main tools of the OO world is inheritance. In OO inheritance, a descendant class
inherits the characteristics of its ancestor. This allows common functionality to be
programmed into the ancestor class, and then “specialized” sub-classes can be created,
extending the functionality of the ancestor, or superclass.
In this example, a User and a Customer are different kinds of specialized extensions of a
Person. Additionally, an E-Commerce User inherits from both User and Customer, and
therefore has all of the traits of its superclasses.
It is common to organize data in OO in various hierarchies of superclasses and subclasses.
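The hierarchy described above can be sketched in plain Python. This is only an illustration: the attribute names (name, username, account_number) are assumptions made for the example and do not come from the slides.

```python
class Person:
    """Common ancestor: holds what every kind of person shares."""

    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.name = name


class User(Person):
    """A Person who can log in (username is a hypothetical attribute)."""

    def __init__(self, username, **kwargs):
        super().__init__(**kwargs)
        self.username = username


class Customer(Person):
    """A Person who buys things (account_number is a hypothetical attribute)."""

    def __init__(self, account_number, **kwargs):
        super().__init__(**kwargs)
        self.account_number = account_number


class ECommerceUser(User, Customer):
    """Inherits from both User and Customer, and through them from Person."""


shopper = ECommerceUser(name="Fred", username="fred", account_number=42)

# The subclass carries the traits of every superclass.
traits = (shopper.name, shopper.username, shopper.account_number)
```

Each constructor passes the keyword arguments it does not use up the chain with super(), which is what lets the class with two parents initialize cleanly.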
4. Relational Values
In the world of relational databases, database tables are defined with “foreign keys” that
relate them to each other. In this example, we see that each row in the User table has a
person_id, which is a foreign key that relates to a record in the Person table.
Data organization is done by relating records to each other, and then composing queries
that pull the necessary data from the database, following the relationships defined by these
foreign keys.
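As a rough sketch of the relational view, here is the same Person/User relationship expressed with Python's built-in sqlite3 module. The column names other than person_id, and the sample rows, are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE user (
        id        INTEGER PRIMARY KEY,
        username  TEXT NOT NULL,
        -- the foreign key that relates each user row to a person row
        person_id INTEGER NOT NULL REFERENCES person (id)
    );
    INSERT INTO person (id, name) VALUES (1, 'Fred');
    INSERT INTO user (id, username, person_id) VALUES (1, 'fred', 1);
    """
)

# The query follows the foreign key to pull data from both tables.
row = conn.execute(
    "SELECT user.username, person.name "
    "FROM user JOIN person ON person.id = user.person_id"
).fetchone()
```

Nothing here says that a user "is a" person. The relationship only exists because the query joins on person_id.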
5. ORM Layers are both OO and Relational
Because they are designed to bridge the different approaches that the two worlds take to
managing data, ORMs are inherently both Object Oriented AND Relational. This is exactly
what their name advertises: Object Relational Mapping.
6. Building Inheritance into ORMs is difficult
Bringing OO-style inheritance to ORMs is difficult, because the relational world doesn’t
really support the concept. It raises a lot of questions about the best way to provide
inheritance functionality to the programmer in a way that isn’t a total hack on the relational side.
7. Django approach before now: Use Composition
Up until now, Django hasn’t really supported model inheritance. When a Django application
developer was presented with a situation that would best be solved with inheritance, they
were advised to use a technique called composition instead.
The most prominent example of this is the user profile. A programmer will commonly want to
extend the User class that comes in django.contrib.auth so that it can carry attributes
specific to the application. Django has historically solved this with a “user profile” - a
separate class that is identified in settings.py (through the AUTH_PROFILE_MODULE setting)
and can be retrieved by calling the get_profile() method on User objects.
8. Unique Foreign Key
To use composition, one defines a foreign key that is also unique on the composited class
(the class that would be the subclass if we were solving this problem using OO inheritance).
For example, if we hypothetically wanted to specialize a Person class and make a User, the
User would have a ForeignKey field that pointed back at the Person model (the Django ORM
equivalent of a relational foreign key). In addition, we would designate the field unique,
ensuring that there would always only be one User per Person.
Traits on the composited “superclass” would have to be accessed explicitly - they are not
truly inherited by the specialization class.
9. This Sucks
There are a number of reasons this is a suboptimal situation.
1. It makes queries more complicated: the work of defining the relationship, which should be
done once in data modeling, is instead spread all over the application code.
2. It fails to take advantage of Python’s OO features, and thus of their power.
3. It creates an object model that doesn’t accurately describe the real-world entities that it is
trying to model.
10. Malcolm Tredinnick
This man has come to the rescue, however. This is Malcolm Tredinnick, and he has for some
time now been working on a branch of the Django code called QuerySet Refactor.
11. QuerySet Refactor
The QuerySet Refactor branch brings a number of new features to the Django ORM. Most
notably, it has brought true Model Inheritance. Although this was difficult to program, the
QuerySet Refactor programmers, led by Malcolm, have managed to implement it in a couple
of great ways.
Furthermore, on April 26th this year, QuerySet Refactor was merged into the Django Trunk.
That means that all of its features are now part of the mainstream development of Django.
12. Two Approaches to Model Inheritance
With the merge complete, Django now offers two different approaches to Model Inheritance.
An application might choose to use both approaches, since they offer advantages in different
circumstances.
13. 1. Abstract Base Classes
The first approach is called Abstract Base Classes. This approach is best used when the
parent superclass is never meant to be instantiated on its own. It is merely a source of
common functionality that will be used by subclasses.
Abstract Base Classes are specified by adding “abstract = True” in the Meta inner class. When
syncdb is run, the Django ORM will not create a database table for these models.
Model classes that extend the Abstract Base Class will, however, automatically inherit the
fields defined in the superclass. The Django ORM will automatically generate corresponding
columns and relations for the superclass’s fields in the table of the subclass.
14. Abstract Base Classes are a coding convenience only.
In the case of Abstract Base Classes, the inheritance relationship is ignored at the relational
level. They are essentially a kind of advanced syntactic sugar - providing a type of “include”
in model definitions. Once the Django ORM has “compiled” the models into SQL, the
Abstract Base Class essentially ceases to exist.
15. ABC Gotcha: “related_name”
Abstract Base Classes can carry a couple of gotchas. Let’s look at one related to the use of
the “related_name” attribute in ForeignKey fields.
16. Specifying a ForeignKey in an ABC superclass
If we put a ForeignKey field in an ABC, we might wish to give it a related_name. The
related_name is the name by which this class (“Person”) is known to the target model of the
ForeignKey (“Company”). So a Company object has “people”.
The problem comes when we have more than one concrete subclass of the ABC. We have to
remember that there is no database table that corresponds to Person. Instead, the Django
ORM compiles the fields of the ABC into the table definitions of the subclasses.
This means that both User and Customer would try to add a reverse attribute named “people”
to Company, and the two names clash. Django will report an error when syncdb is run.
17. Specifying a ForeignKey in an ABC superclass
The suggested solution is to change the related name to something that is dynamically
interpolated. By using the “%(class)s_related” notation, we will create two attributes in the
Company model: “user_related” pointing to users, and “customer_related” pointing to
customers.
18. 2. Multiple Table Inheritance
The second approach to Model Inheritance that Django now provides is called Multiple Table
Inheritance. In this approach, Django generates a separate relational table for each model in
the inheritance hierarchy.
This approach is useful for circumstances where you may wish to instantiate models from
both the superclass and its subclasses. In this example, we may wish to define Persons that
are not Users, in addition to Users that inherit Person attributes.
MTI can be engaged simply by subclassing an existing model. In this case there
is no “abstract” attribute being set in an inner Meta class.
19. MTI makes inheritance a relationship under the hood.
As mentioned, when the Django ORM generates SQL from models using MTI, a separate table
is created for both the superclasses and subclasses. In addition, the inheritance between
models is converted into a relation.
The subclass table will have a special additional column called “<parent_model_name>_ptr_id”
(“person_ptr_id” for a User that extends Person), a unique foreign key that points at the
superclass table. When an instance of the subclass model is
pulled from the database, the ORM will pull the related row in the superclass table and build
a model instance with data from both tables. Essentially it’s composition under the hood, but
encapsulated inside the ORM model. The relational world sees it as a relation, and the OO
world sees it as inheritance.
20. Querying the MTI superclass
While it’s not possible to run a query directly against an ABC, it is possible to run one against
an MTI superclass. But what if the superclass record is tied to a record in the subclass table?
The Django ORM will return an object of the superclass model. The associated subclass
model is accessible as an attribute of the superclass model. So in this case “fred the user” is
accessible as the “user” attribute of “fred the person”.
This is a bit like composition again, but “fred the user” will inherit all the attributes of “fred
the person”, so it is better than where we were before.
If there is no corresponding user object for “fred the person”, accessing the user attribute will
raise a DoesNotExist exception.