Ming is a SQLAlchemy-inspired object-document mapper (ODM) for MongoDB. It was developed at SourceForge and is also used by the TurboGears2 web framework to provide MongoDB support.
After a short introduction to the basic Ming layer, we will cover the Ming Object Document Mapper layer and show how to use its Unit of Work to avoid half-applied changes and how to express relations between collections.
The last part of the talk will show how to use Ming to perform lazy migration of data when your schema changes and how to drop below the ODM layer to achieve maximum speed.
2. Who am I
● CTO @ Axant.it, mostly Python company
(with some iOS and Android)
● TurboGears development team member
● Contributions to Ming project ODM layer
● Really happy to be here at PyConUK!
○ I thought I would crash my car driving on
the wrong side!
3. MongoDB Models
● Schema free
○ It looks like you don’t have a schema, but your
code depends on properties that need to be there.
● SubDocuments
○ You know that a blog post contains a list of
comments, but what is a comment?
● Relations
○ You don’t have joins and foreign keys, but you still
need to express relationships
4. What’s Ming?
● MongoDB toolkit
○ Validation layer on pymongo
○ Manages schema migrations
○ In Memory MongoDB
○ ODM on top of all of those
● Born at sourceforge.net
● Supported by TurboGears
community
(Diagram: layered stack of MongoDB, PyMongo, Ming, Ming.ODM)
5. Getting Started with the ODM
● Ming.ODM looks like SQLAlchemy
● UnitOfWork
○ Avoid half-saved changes in case of crashes
○ Flush all your changes at once
● IdentityMap
○ Same DB objects are the same object in memory
● Supports Relations
● Supports events (after_insert, before_update, …)
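The identity map idea can be illustrated with a toy sketch in plain Python. This is not Ming's implementation: `ToyIdentityMap`, `Page`, and `fetch_from_db` are hypothetical names, and a dict stands in for the database.

```python
class ToyIdentityMap:
    """Keeps one in-memory object per (class, _id) pair."""

    def __init__(self):
        self._objects = {}

    def get_or_load(self, cls, _id, loader):
        key = (cls, _id)
        if key not in self._objects:
            # First access: build the object from the stored document.
            self._objects[key] = cls(**loader(_id))
        return self._objects[key]


class Page:
    def __init__(self, _id, title):
        self._id = _id
        self.title = title


# A dict stands in for the MongoDB collection (assumption for this sketch).
FAKE_DB = {1: {"_id": 1, "title": "FirstPage"}}
load_calls = []


def fetch_from_db(_id):
    load_calls.append(_id)
    return dict(FAKE_DB[_id])


imap = ToyIdentityMap()
first = imap.get_or_load(Page, 1, fetch_from_db)
second = imap.get_or_load(Page, 1, fetch_from_db)
```

Both lookups return the very same Python object, and the fake database is only read once.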
6. Declaring Schema with the ODM
from ming import schema
from ming.odm import (MappedClass, FieldProperty,
                      ForeignIdProperty, RelationProperty)

# DBSession is the application's ODM session, configured elsewhere

class WikiPage(MappedClass):
    # Metadata for the collection
    # like its name, indexes, session, ...
    class __mongometa__:
        session = DBSession
        name = 'wiki_page'
        unique_indexes = [('title',)]

    _id = FieldProperty(schema.ObjectId)
    title = FieldProperty(schema.String)
    text = FieldProperty(schema.String)
    # Ming automatically generates
    # the relationship query
    comments = RelationProperty('WikiComment')

class WikiComment(MappedClass):
    class __mongometa__:
        session = DBSession
        name = 'wiki_comment'

    _id = FieldProperty(schema.ObjectId)
    text = FieldProperty(schema.String, if_missing='')
    # Provides an actual relation point
    # between comments and pages
    page_id = ForeignIdProperty('WikiPage')
● Declarative interface for models
● Supports polymorphic models
7. Querying the ODM
wp = WikiPage.query.get(title='FirstPage')
# Identity map prevents duplicates
wp2 = WikiPage.query.get(title='FirstPage')
assert wp is wp2
# manually fetching related comments
comments = WikiComment.query.find(dict(page_id=wp._id)).all()
# or
comments = wp.comments
# gets the last 5 wiki pages, in reverse natural order
# (DESCENDING comes from pymongo)
wps = WikiPage.query.find().sort('$natural', DESCENDING).limit(5).all()
● Query language tries to be natural for both
SQLAlchemy and MongoDB users
8. The Unit Of Work
● Flush or Clear the pending changes
● Avoid mixing UOW and atomic operations
● UnitOfWork as a cache
wp = WikiPage(title='FirstPage', text='This is my first page')
DBSession.flush()
wp.title = "TITLE 2"
DBSession.update(WikiPage, {'_id':wp._id}, {'$set': {'title': "TITLE 3"}})
DBSession.flush() # wp.title will be TITLE 2, not TITLE 3
wp2 = DBSession.get(WikiPage, wp._id)
# wp2 lookup won’t query the database again
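Why the TITLE 2 / TITLE 3 example above ends with TITLE 2 can be seen in a toy unit of work written in plain Python. This is only a sketch, not Ming's implementation: `ToySession` and `atomic_set` are hypothetical names, and a dict stands in for the collection.

```python
class ToySession:
    """Tracks documents and writes them to the store only on flush()."""

    def __init__(self, store):
        self.store = store      # dict: _id -> stored document
        self._tracked = {}      # dict: _id -> in-memory document

    def add(self, doc):
        self._tracked[doc["_id"]] = doc

    def flush(self):
        # Every tracked document is written as it currently looks in memory.
        for _id, doc in self._tracked.items():
            self.store[_id] = dict(doc)

    def clear(self):
        # Forget pending changes without writing them.
        self._tracked.clear()

    def atomic_set(self, _id, **fields):
        # Bypasses the tracked documents, like a raw $set update would.
        self.store[_id].update(fields)


store = {}
session = ToySession(store)
wp = {"_id": 1, "title": "FirstPage"}
session.add(wp)
session.flush()

wp["title"] = "TITLE 2"                  # pending in-memory change
session.atomic_set(1, title="TITLE 3")   # direct write to the store
title_after_atomic = store[1]["title"]
session.flush()                          # the in-memory state wins
title_after_flush = store[1]["title"]
```

The atomic write never reaches the tracked object, so the next flush silently overwrites it. That is the reason for keeping the two styles apart.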
9. How Validation works
● Ming documents are validated at certain
points in their life cycle
○ When saving the document to the database
○ When loading it from the database.
○ Additionally, validation is performed when the
document is created through the ODM layer or
using the .make() method
■ Happens before they get saved for real
10. Cost of Validation
● MongoDB is famous for its speed, but
validation has a cost
○ MongoDB documents can contain many
subdocuments
○ Each subdocument must be validated by Ming
○ Can even contain lists of multiple subdocuments
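To get a feel for how the number of checks grows with nested documents, here is a minimal recursive validator in plain Python. It is an illustrative sketch only and does not reproduce Ming's schema classes: `validate` and its `counter` argument are hypothetical.

```python
def validate(schema, value, counter):
    """Check `value` against a tiny schema built from types, dicts and one-item lists.

    `counter` is a one-element list used to count how many checks were run.
    """
    counter[0] += 1
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            raise TypeError("expected a subdocument")
        return {key: validate(sub, value.get(key), counter)
                for key, sub in schema.items()}
    if isinstance(schema, list):
        if not isinstance(value, list):
            raise TypeError("expected a list")
        return [validate(schema[0], item, counter) for item in value]
    if value is not None and not isinstance(value, schema):
        raise TypeError("expected " + schema.__name__)
    return value


friend_schema = [dict(fbuser=str, photo=str, name=str)]
friends = [dict(fbuser="fb%d" % i, photo="p.png", name="Friend %d" % i)
           for i in range(100)]

checks = [0]
validated = validate(friend_schema, friends, checks)
```

A single document with 100 friends already needs 401 checks in this sketch: one for the list, one per subdocument, and one per field.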
11. Cost of Validation benchmark
# With validation
class User(MappedClass):
    # ...
    friends = FieldProperty([dict(fbuser=s.String,
                                  photo=s.String,
                                  name=s.String)], if_missing=[])

>>> timeit.timeit('User.query.find().all()', number=20000)
31.97218942642212

# Without validation
class User(MappedClass):
    # ...
    friends = FieldProperty(s.Anything, if_missing=[])

>>> timeit.timeit('User.query.find().all()', number=20000)
23.391359090805054

# Avoiding the field at query time
>>> timeit.timeit('User.query.find({}, fields=("_id","name")).all()', number=20000)
21.58667516708374
12. Only query what you need
● The previous benchmark shows why it is
good to query only for the fields you need to
process the current request
● Fields you don’t query for are still
present on the object, with a value of None
13. Evolving the Schema
● Migrations are performed lazily as the
objects are loaded from the database
● Simple schema evolutions:
○ New field: It will just be None for old entities.
○ Removed: Declare it as ming.schema.Deprecated
○ Changed Type: Declare it as ming.schema.Migrate
● Complex schema evolutions:
○ Add a migration function in __mongometa__
14. Complex migrations with Ming
class OldWikiPage(Document):
    _id = Field(schema.ObjectId)
    title = Field(str)
    text = Field(str, if_missing='')
    metadata = Field(dict(tags=[str], categories=[str]))

class WikiPage(Document):
    class __mongometa__:
        session = DBSession
        name = 'wiki_page'
        version_of = OldWikiPage
        def migrate(data):
            result = dict(data, version=1, tags=data['metadata']['tags'],
                          categories=data['metadata']['categories'])
            del result['metadata']
            return result

    version = Field(1, required=True)
    # … more fields ...
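The lazy part of the migration can be sketched without Ming at all. The snippet below reuses the logic of the migrate() function above, but `load`, `MIGRATIONS`, and `CURRENT_VERSION` are hypothetical names for this illustration, not Ming APIs.

```python
def migrate_v0_to_v1(data):
    """Flatten the old `metadata` subdocument into top-level fields."""
    result = dict(data, version=1,
                  tags=data["metadata"]["tags"],
                  categories=data["metadata"]["categories"])
    del result["metadata"]
    return result


CURRENT_VERSION = 1
# Maps a version to the function that upgrades it to the next one.
MIGRATIONS = {0: migrate_v0_to_v1}


def load(raw):
    """Upgrade a stored document step by step until it is current."""
    doc = dict(raw)
    while doc.get("version", 0) < CURRENT_VERSION:
        doc = MIGRATIONS[doc.get("version", 0)](doc)
    return doc


old_doc = {"_id": 1, "title": "FirstPage", "text": "",
           "metadata": {"tags": ["a"], "categories": ["b"]}}
new_doc = {"_id": 2, "title": "Second", "text": "", "version": 1,
           "tags": [], "categories": []}

migrated = load(old_doc)     # upgraded while loading
untouched = load(new_doc)    # already current, returned as-is
```

Old documents are upgraded only when they are read, so no up-front batch job over the whole collection is needed.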
15. Testing MongoDB
● Ming makes testing easy
○ Your models can be directly imported from tests
○ Just bind the session to a DataStore created in
your test suite
● Ming provides Mongo in Memory (MIM)
○ much like sqlite://:memory:
● Implements 90% of MongoDB, including
JavaScript execution with SpiderMonkey
16. Ming for Web Applications
● Ming can be integrated in any WSGI
framework through
ming.odm.middleware.MingMiddleware
○ Automatically disposes open sessions at the end
of requests
○ Automatically provides session flushing
○ Automatically clears the session in case of
exceptions
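The three behaviours listed above map naturally onto a small WSGI wrapper. The following is a simplified sketch of the pattern, not the code of MingMiddleware: `FlushingMiddleware` and `RecordingSession` are hypothetical, and the session only needs `flush()`, `clear()`, and `close()` methods.

```python
class FlushingMiddleware:
    """Toy WSGI middleware: flush on success, clear on error, always close."""

    def __init__(self, app, session):
        self.app = app
        self.session = session

    def __call__(self, environ, start_response):
        try:
            response = self.app(environ, start_response)
            self.session.flush()     # persist pending changes
            return response
        except Exception:
            self.session.clear()     # drop pending changes
            raise
        finally:
            self.session.close()     # dispose of the session


class RecordingSession:
    """Stand-in session that records which methods were called."""

    def __init__(self):
        self.calls = []

    def flush(self):
        self.calls.append("flush")

    def clear(self):
        self.calls.append("clear")

    def close(self):
        self.calls.append("close")


def ok_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def failing_app(environ, start_response):
    raise RuntimeError("boom")


ok_session = RecordingSession()
body = FlushingMiddleware(ok_app, ok_session)({}, lambda status, headers: None)

bad_session = RecordingSession()
try:
    FlushingMiddleware(failing_app, bad_session)({}, lambda status, headers: None)
except RuntimeError:
    pass
```

A successful request flushes and then closes the session. A failing one clears and closes it, so half-built changes are never written.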
17. Ming with TurboGears
● Provides built-in support for Ming
○ $ gearbox quickstart --ming projectname
● Ready made test suite with fixtures on MIM
● Facilities to debug and benchmark Ming
queries through the DebugBar
● TurboGears Admin automatically
generates CRUD from Ming models
18. Debugging MongoDB
● TurboGears debugbar has built-in support
for MongoDB
○ Logging of executed queries and their results
○ Query timing
○ Syntax prettifier and highlighting for Map-Reduce and
$where JavaScript code
○ Query tracking in logs for performance
reporting of web services
20. Ming without learning MongoDB
● The transition from SQL/relational solutions
to MongoDB can be scary the first time.
● You can use Sprox to lower the learning
cost for simple applications
○ Sprox is the library that powers TurboGears
Admin, automatically generating pages from
SQLA or Ming models
21. Sprox ORM abstractions
● ORMProvider provides an abstraction over
the ORM
● ORMProviderSelector automatically
detects the provider to use from a model.
● Mix those together and you have a
database-independent layer with automatic storage
backend detection.
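The selector idea can be sketched in a few lines of plain Python. This is a toy illustration, not Sprox code: all class names are hypothetical, and the detection rule (looking for `__mongometa__` or `__table__` on the model) is an assumption made for the sketch. Sprox's real detection logic may differ.

```python
class ToyMingProvider:
    backend = "ming"


class ToySQLAProvider:
    backend = "sqlalchemy"


class ToyProviderSelector:
    """Chooses a provider by looking at the model class."""

    def get_provider(self, entity):
        # Assumption for this sketch: Ming mapped classes carry a
        # `__mongometa__` attribute and SQLAlchemy declarative models
        # carry `__table__`.
        if hasattr(entity, "__mongometa__"):
            return ToyMingProvider()
        if hasattr(entity, "__table__"):
            return ToySQLAProvider()
        raise TypeError("no provider for %r" % (entity,))


class FakeMingModel:
    class __mongometa__:
        name = "money_transfer"


class FakeSQLAModel:
    __table__ = object()


selector = ToyProviderSelector()
ming_backend = selector.get_provider(FakeMingModel).backend
sqla_backend = selector.get_provider(FakeSQLAModel).backend
```

Calling code only talks to the provider returned by the selector, so it never needs to know which storage backend sits behind a model.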
22. Hands on Sprox
● Provider.query(self, entity, **kwargs) → get all objects
of a collection
● Provider.get_obj(self, entity, params) → get an object
● Provider.update(self, entity, params) → update an
object
● Provider.create(self, entity, params) → create a new
object
# Sprox (Ming or SQLAlchemy)
count, transactions = provider.query(MoneyTransfer)
transactions = DBSession.query(MoneyTransfer).all() # SQLAlchemy
transactions = MoneyTransfer.query.find().all() # Ming