This document provides an overview of using MongoDB with Python. It introduces pymongo, the official Python driver for MongoDB, and covers connecting to MongoDB, performing CRUD operations, aggregation, GridFS for large files, indexing, and ODM frameworks. The presenter is Norberto Leite, a MongoDB Technical Evangelist based in Madrid, Spain.
9. MongoDB Features
• JSON Document Model with Dynamic Schemas
• Auto-Sharding for Horizontal Scalability
• Text Search
• Aggregation Framework and MapReduce
• Full, Flexible Index Support and Rich Queries
• Built-In Replication for High Availability
• Advanced Security
• Large Media Storage with GridFS
11. THE LARGEST ECOSYSTEM
• 9,000,000+ MongoDB Downloads
• 250,000+ Online Education Registrants
• 35,000+ MongoDB User Group Members
• 35,000+ MongoDB Management Service (MMS) Users
• 750+ Technology and Services Partners
• 2,000+ Customers Across All Industries
13. pymongo
• Official MongoDB Python driver
• Rockstar developer team
• Jesse Jiryu Davis, Bernie Hackett
• One of the oldest and best-maintained drivers
• Python and MongoDB are a natural fit
• BSON is very similar to dictionaries
• (everyone likes dictionaries)
• http://api.mongodb.org/python/current/
• https://github.com/mongodb/mongo-python-driver
14. pymongo 3.0
• Server discovery spec
• Monitoring spec
• Faster client startup when connecting to Replica Set
• Faster failover
• More robust replica set connections
• API clean up
19. Connecting to Replica Set
#!/usr/bin/env python
from pymongo import MongoClient

# The replica set name is passed as a URI option
uri = 'mongodb://127.0.0.1/?replicaSet=MYREPLICA'
mc = MongoClient(uri)
20. Connecting to Replica Set
#!/usr/bin/env python
from pymongo import MongoClient

# Same connection, with the replica set name as a keyword argument
uri = 'mongodb://127.0.0.1'
mc = MongoClient(host=uri, replicaSet='MYREPLICA')
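A replica set URI usually lists more than one seed host, so the driver can still discover the set when one member is down. The following is a minimal sketch of assembling such a URI. The helper `replica_set_uri` and the second host/port are hypothetical; only the resulting string follows MongoDB's connection-string format.

```python
def replica_set_uri(hosts, replica_set):
    """Build a MongoDB connection URI from seed hosts and a replica set name.

    `hosts` is a list of "host:port" strings (illustrative values below).
    """
    if not hosts:
        raise ValueError("at least one seed host is required")
    # Seed hosts are comma-separated; options follow "/?" as key=value pairs.
    return "mongodb://{}/?replicaSet={}".format(",".join(hosts), replica_set)


uri = replica_set_uri(["127.0.0.1:27017", "127.0.0.1:27018"], "MYREPLICA")
print(uri)
```

The resulting string can be passed to MongoClient(uri) exactly as in the slides above.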
26. Insert
#!/usr/bin/env python
from pymongo import MongoClient
mc = MongoClient()

coll = mc['madrid_pug']['testcollection']

# insert_one() supersedes insert(), which is deprecated as of pymongo 3.0
coll.insert_one({'field_one': 'some value'})
27. Find
#!/usr/bin/env python
from pymongo import MongoClient
mc = MongoClient()

coll = mc['madrid_pug']['testcollection']

# find() returns a cursor; find_one() would return a single document
cur = coll.find({'field_one': 'some value'})

for d in cur:
    print(d)
34. GridFS
• MongoDB has a 16MB document size limit
• So how can we store data bigger than 16MB?
• Media files (images, PDFs, long binary files …)
• GridFS
• Convention more than a feature
• All drivers implement this convention
• pymongo is no different
• Very flexible approach
• Handy out-of-the-box solution
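To make the "convention" concrete: GridFS splits a file into fixed-size chunks and stores each chunk as its own document, plus one metadata document per file. The sketch below imitates that layout in plain Python, with no database involved. The helpers `split_into_chunks` and `reassemble` are hypothetical, and the documents are simplified (a real driver also records fields such as the upload date and stores the data as BSON binary).

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size (255 KB)


def split_into_chunks(file_id, data, chunk_size=DEFAULT_CHUNK_SIZE):
    """Return (files_doc, chunk_docs) in a simplified GridFS-style layout."""
    chunks = [
        # Each chunk points back to its file and carries a sequence number n.
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    return files_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in order of n."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


payload = b"x" * (600 * 1024)  # 600 KB of dummy data
files_doc, chunks = split_into_chunks("demo-file", payload)
print(len(chunks), files_doc["length"])
```

With pymongo itself none of this is written by hand: the bundled gridfs package exposes gridfs.GridFS(db), whose put() and get() methods do the chunking and reassembly.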
42. Indexes
• Single Field
• Compound
• Multikey
• Geospatial
• 2d
• 2dSphere - GeoJSON
• Full Text
• Hash Based
• TTL indexes
• Unique
• Sparse
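Several of the index types listed above are selected through keyword options or a special key type rather than a separate method. The sketch below shows how they might be passed to pymongo's create_index(). The function `create_example_indexes` and every field name are hypothetical, and the constants are written as literals (matching pymongo's values) so the block has no dependencies.

```python
ASCENDING = 1      # same value as pymongo.ASCENDING
HASHED = "hashed"  # same value as pymongo.HASHED
TEXT = "text"      # same value as pymongo.TEXT


def create_example_indexes(coll):
    """Create one index per option; `coll` is expected to be a pymongo Collection."""
    return [
        # Unique: reject documents that duplicate an existing value.
        coll.create_index([("email", ASCENDING)], unique=True),
        # Sparse: only index documents that contain the field.
        coll.create_index([("nickname", ASCENDING)], sparse=True),
        # TTL: expire documents one hour after the date stored in the field.
        coll.create_index([("created_at", ASCENDING)], expireAfterSeconds=3600),
        # Hash based: index a hash of the field value.
        coll.create_index([("user_id", HASHED)]),
        # Full text: enable text search on the field.
        coll.create_index([("body", TEXT)]),
    ]
```

Against a running server it would be called as create_example_indexes(MongoClient().madrid_pug.testcollection), reusing the collection from the earlier slides.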
43. Single Field Index
from pymongo import ASCENDING, MongoClient
mc = MongoClient()

coll = mc.madrid_pug.testcollection

# The key is an (indexed field, indexing order) pair.
# create_index() supersedes ensure_index(), deprecated as of pymongo 3.0.
coll.create_index([('some_single_field', ASCENDING)])
44. Compound Field Index
from pymongo import ASCENDING, DESCENDING, MongoClient
mc = MongoClient()

coll = mc.madrid_pug.testcollection

# Each pair is (indexed field, indexing order)
coll.create_index([('field_ascending', ASCENDING),
                   ('field_descending', DESCENDING)])
45. Multikey Field Index
from pymongo import MongoClient
mc = MongoClient()

coll = mc.madrid_pug.testcollection

coll.insert_one({'array_field': [1, 2, 54, 89]})

# Indexing a field that holds an array yields a multikey index
coll.create_index('array_field')
46. Geospatial Field Index
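from pymongo import GEOSPHERE, MongoClient
import geojson

coll = MongoClient().madrid_pug.testcollection

# GeoJSON coordinates are [longitude, latitude]
p = geojson.Point([-73.9858603477478, 40.75929362758241])

coll.insert_one({'point': p})

# GEOSPHERE is the index type (2dsphere)
coll.create_index([('point', GEOSPHERE)])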
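Once a 2dsphere index exists, the indexed field can be searched by proximity with the $near operator. The sketch below only builds the query document, so it runs without a server. The helper `near_query` and the 500 metre radius are hypothetical choices for illustration.

```python
def near_query(field, lon, lat, max_distance_m):
    """Build a $near filter for a GeoJSON point field with a 2dsphere index."""
    return {
        field: {
            "$near": {
                # GeoJSON order is longitude first, then latitude.
                "$geometry": {"type": "Point", "coordinates": [lon, lat]},
                # For GeoJSON points the distance is given in metres.
                "$maxDistance": max_distance_m,
            }
        }
    }


query = near_query("point", -73.9858603477478, 40.75929362758241, 500)
print(query)
```

Passing this document to coll.find(query) would return the matching documents ordered from nearest to farthest.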