Migrating to MongoDB
Why we moved from MySQL to Mongo
Getting to know Mongo
Demo app using Mongo with PHP
Reasons we looked for an alternative to our RDBMS setup
Issues with our RDBMS setup

Architecture was highly distributed; the number of
databases was becoming an issue
Storing similar objects with different structures
Options for scalability
Storing files
Many DBs
In a MySQL server (with MyISAM)...
  1 database = 1 directory
  1 table = more than 1 file in the DB directory
The filesystem limits the number of entries per directory,
and the limit is not that high
We had a mix of MySQL and SQLite databases spread
across a directory hierarchy
Many DBs
In a Mongo server ...
  No 1:1 relation between databases and files
  Stores data in a set of pre-allocated files of
  increasing size
  Number of files grows as needed
Using many collections within a single database
allowed us to move everything into the DB server
A “collection”?

 RDBMS model:
   Database has tables which hold records
   Records in a table all have the same structure
 Document-oriented storage
   Database has collections which hold documents
Obj. with differing structure

 For example, events where attributes vary based on
 type of event
   Event A: from, att1
   Event B: from, att1, att2
   Event C: from, att3, att4
 What’s your schema for this?
tbl_events_A
id   from   Att1
1    Jim    1237
2    Dave   362
3    Bob    9283

tbl_events_B
id   from   Att1    Att2
1    Bill   2938    23
2    Jim    632     9
3    Hugh   12832   14

tbl_events_C
id   from   Att3      Att4
1    Bob    hello     7249
2    Bill   goodbye   23091
3    Jim    testing   2334
tbl_events
id   type   from   Att1     Att2    Att3     Att4
1     A     Jim    1237    NULL     NULL     NULL
2     A     Dave   362     NULL     NULL     NULL
3     B     Bill   2938     23      NULL     NULL
4     C     Bob    NULL    NULL     hello    7249
5     A     Bob    9283    NULL     NULL     NULL
6     C     Bill   NULL    NULL    goodbye   23091
7     B     Jim    632       9      NULL     NULL
8     B     Hugh   12832    14      NULL     NULL
9     C     Jim    NULL    NULL    testing   2334
tbl_events
id   type   from                    Attributes
1     A     Jim                  “{‘att1’:1237}”
2     A     Dave                  “{‘att1’:362}”
3     B     Bill            “{‘att1’:2938, ‘att2’:23}”
4     C     Bob           “{‘att3’:‘hello’, ‘att4’:7249}”
5     A     Bob                  “{‘att1’:9283}”
6     C     Bill        “{‘att3’:‘goodbye’, ‘att4’:23091}”
7     B     Jim              “{‘att1’:632, ‘att2’:9}”
8     B     Hugh           “{‘att1’:12832, ‘att2’:14}”
9     C     Jim          “{‘att3’:‘testing’, ‘att4’:2334}”
tbl_events
id   type   from
1    A      Jim
2    A      Dave
3    B      Bill
4    C      Bob
5    A      Bob
6    C      Bill
7    B      Jim
8    B      Hugh
9    C      Jim

tbl_events_attributes
id   eventId   name   value
1    1         att1   1237
2    2         att1   362
3    3         att1   2938
4    3         att2   23
5    4         att3   hello
6    4         att4   7249
7    5         att1   9283
8    6         att3   goodbye
9    6         att4   23091
10   7         att1   632
11   7         att2   9
...
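To make the cost of the attribute-table layout concrete, here is a minimal plain-JavaScript sketch (no database involved) of what the application has to do to turn rows from tbl_events and tbl_events_attributes back into one object per event. The helper name `assembleEvents` and the in-memory row arrays are assumptions made for illustration; with SQL this would typically be a join followed by the same grouping step.

```javascript
// A few rows copied from the two tables above.
const events = [
  { id: 1, type: 'A', from: 'Jim' },
  { id: 3, type: 'B', from: 'Bill' },
  { id: 4, type: 'C', from: 'Bob' },
];
const attributes = [
  { id: 1, eventId: 1, name: 'att1', value: 1237 },
  { id: 3, eventId: 3, name: 'att1', value: 2938 },
  { id: 4, eventId: 3, name: 'att2', value: 23 },
  { id: 5, eventId: 4, name: 'att3', value: 'hello' },
  { id: 6, eventId: 4, name: 'att4', value: 7249 },
];

// Hypothetical helper: merge each event with its attribute rows.
function assembleEvents(eventRows, attributeRows) {
  const byId = new Map();
  for (const row of eventRows) {
    byId.set(row.id, { ...row });
  }
  for (const attr of attributeRows) {
    const event = byId.get(attr.eventId);
    if (event) event[attr.name] = attr.value; // ignore orphan attribute rows
  }
  return [...byId.values()];
}

const assembled = assembleEvents(events, attributes);
console.log(assembled[1]); // the type B event with att1 and att2 attached
```

Every read of an event pays for this regrouping, which is the overhead the single-collection approach avoids.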
Obj. with differing structure

 Document-oriented storage like Mongo is schema-less
   1 collection for all events
   Each document has the structure applicable for its
   type
   Can index common attributes for queries
events collection:

{id:1,   type:'A',   from:'Jim', att1:1237}
{id:2,   type:'A',   from:'Dave', att1:362}
{id:5,   type:'A',   from:'Bob', att1:9283}
{id:3,   type:'B',   from:'Bill', att1:2938, att2:23}
{id:7,   type:'B',   from:'Jim', att1:632, att2:9}
{id:8,   type:'B',   from:'Hugh', att1:12832, att2:14}
{id:4,   type:'C',   from:'Bob', att3:'hello', att4:7249}
{id:6,   type:'C',   from:'Bill', att3:'goodbye', att4:23091}
{id:9,   type:'C',   from:'Jim', att3:'testing', att4:2334}
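As a rough illustration of why one collection is enough, the sketch below keeps differently shaped event documents in a plain JavaScript array and filters on the attributes they share. It does not talk to MongoDB: the array stands in for the collection, `findEvents` is a hypothetical helper that only does equality matching, and the sample values follow the tbl_events table shown earlier.

```javascript
// In-memory stand-in for the events collection (not a MongoDB call).
const eventsCollection = [
  { id: 1, type: 'A', from: 'Jim', att1: 1237 },
  { id: 3, type: 'B', from: 'Bill', att1: 2938, att2: 23 },
  { id: 7, type: 'B', from: 'Jim', att1: 632, att2: 9 },
  { id: 9, type: 'C', from: 'Jim', att3: 'testing', att4: 2334 },
];

// Hypothetical helper: keep documents whose fields equal every criterion.
function findEvents(collection, criteria) {
  return collection.filter((doc) =>
    Object.keys(criteria).every((key) => doc[key] === criteria[key])
  );
}

// Common attributes work across all event types...
const fromJim = findEvents(eventsCollection, { from: 'Jim' });
// ...and type-specific attributes simply do not match other shapes.
const withAtt2 = findEvents(eventsCollection, { att2: 9 });

console.log(fromJim.map((e) => e.id));
console.log(withAtt2.map((e) => e.id));
```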
Options for scalability


 MySQL - Master-slave replication
 Mongo - Supports master-slave, replica pairs,
 master-master and ... auto-sharding
Storing files

 In MySQL, you can use a table with a BLOB field and
 other fields for file metadata
 Mongo has GridFS
   Built for storage of large objects
   Split into chunks, also stores metadata
> db.fs.files.findOne();
{
    "_id" : ObjectId("4b9525096b00bd59b95f791f"),
    "filename" : "user.png",
    "length" : 43717,
    "chunkSize" : 262144,
    "uploadDate" : "Mon Mar 08 2010 11:25:45 GMT-0500 (EST)",
    "md5" : "3f6fcd4c0a51655d392fe95a99c29140",
    "mimeType" : "image/png"
}
> db.fs.chunks.findOne();
{
    "_id" : ObjectId("4b952509c568bb9fc8e3cddb"),
    "files_id" : ObjectId("4b9525096b00bd59b95f791f"),
    "n" : 0,
    "data" : BinData type: 2 len: 43721
}
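The chunking idea can be sketched in a few lines of plain JavaScript. This is not the GridFS implementation or a driver API: `splitIntoChunks` is a hypothetical helper that only mirrors the fields visible in the shell output above (`files_id`, `n`, `data`) and uses the same `chunkSize` of 262144 bytes. The file ids and the larger file are made up.

```javascript
// Hypothetical helper: cut a Buffer into numbered chunks of at most chunkSize bytes.
function splitIntoChunks(fileId, content, chunkSize) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < content.length; offset += chunkSize, n += 1) {
    chunks.push({
      files_id: fileId,
      n,
      data: content.subarray(offset, offset + chunkSize),
    });
  }
  return chunks;
}

const chunkSize = 262144;            // value from the fs.files document above
const small = Buffer.alloc(43717);   // same length as user.png above
const large = Buffer.alloc(600000);  // made-up larger file

const smallChunks = splitIntoChunks('file-1', small, chunkSize);
const largeChunks = splitIntoChunks('file-2', large, chunkSize);

console.log(smallChunks.length); // a single chunk, since 43717 < 262144
console.log(largeChunks.map((c) => c.data.length));
```

A file smaller than `chunkSize` fits in one chunk with `n` equal to 0, which matches the single `fs.chunks` document shown for user.png.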
Getting to know MongoDB
Basic concepts
A database has collections which hold documents
Documents in a collection can have any structure
Documents are JSON objects, stored as BSON
Data types:
  all basic JSON types: string, integer, boolean,
  double, null, array, object
  Special types: date, object id, binary, regexp, code
Important differences

 Collections instead of tables
 ObjectID instead of primary keys
 References instead of foreign keys
 JavaScript code execution instead of stored
 procedures
 [NULL] instead of joins
Inserting data
> doc = { author: 'joe',
  created : new Date('03-28-2009'),
  title : 'Yet another blog post',
  text : 'Here is the text...',
  tags : [ 'example', 'joe' ],
  comments : [
    { author: 'jim', comment: 'I disagree' },
    { author: 'nancy', comment: 'Good post' }
  ]
}
> db.posts.insert(doc);
Querying data
>   db.posts.find();
>   db.posts.find({'author':'joe'});
>   db.posts.find({'comments.author':'nancy'});
>   db.posts.find({'comments.comment': /disagree/i });

> db.posts.findOne({'comments.author':'nancy'});
> db.posts.find({'comments.author':'nancy'}).limit(5);

> db.posts.find({},{'author':true, 'tags':true});

> db.posts.find({'author':'nancy'}).sort({'created':1});
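The dotted keys in the queries above reach into embedded documents, including documents held in an array. The following plain-JavaScript sketch models that behaviour for simple equality matches only. `matchesPath` is a hypothetical helper, not part of the mongo shell, and it ignores operators, regular expressions and numeric array indexes.

```javascript
// Hypothetical helper: does `value` appear at the dotted `path` inside `doc`?
// When a step lands on an array, every element is tried (equality only).
function matchesPath(doc, path, value) {
  const walk = (current, keys) => {
    if (Array.isArray(current)) {
      // An array matches if it contains the value itself
      // or if any element matches the remaining path.
      if (keys.length === 0 && current.includes(value)) return true;
      return current.some((item) => walk(item, keys));
    }
    if (keys.length === 0) return current === value;
    if (current === null || typeof current !== 'object') return false;
    return walk(current[keys[0]], keys.slice(1));
  };
  return walk(doc, path.split('.'));
}

const post = {
  author: 'joe',
  tags: ['example', 'joe'],
  comments: [
    { author: 'jim', comment: 'I disagree' },
    { author: 'nancy', comment: 'Good post' },
  ],
};

console.log(matchesPath(post, 'comments.author', 'nancy')); // true
console.log(matchesPath(post, 'comment.author', 'nancy'));  // false: a misspelled path matches nothing
console.log(matchesPath(post, 'tags', 'example'));          // true
```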
Querying - advanced features
  Support for OR conditions
  $ modifiers to introduce conditions
> db.posts.find({timestamp: {$gte:1268149684}});

  $where modifiers
> db.pictures.find({$where: function() { return
(this.creationTimestamp >= 1268149684) }})

  MapReduce
  Server-side code execution
> function getUniques() {
...   var uniques = [];
...   db.pictures.find({},{tags:true}).forEach(function(pic) {
...     pic.tags.forEach(function(tag) {
...       if (uniques.indexOf(tag) == -1) uniques.push(tag);
...     });
...   });
...   return uniques;
... }
> db.eval(getUniques);
[
    "firstTag",
    "thirdTag",
    "toto",
    "test",
    "comic",
    "secondTag"
]
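For comparison, the MapReduce bullet above can be illustrated with similar tag data. The sketch below runs entirely in plain JavaScript on an in-memory array, so it only shows the map/emit/reduce shape of the computation rather than MongoDB's own mapReduce command. The sample pictures and the function names are invented for the example.

```javascript
// Made-up sample documents standing in for the pictures collection.
const pictures = [
  { title: 'one', tags: ['firstTag', 'comic'] },
  { title: 'two', tags: ['comic', 'test'] },
  { title: 'three', tags: ['firstTag', 'comic'] },
];

// Map step: emit a (tag, 1) pair for every tag of a picture.
function mapPicture(picture, emit) {
  picture.tags.forEach((tag) => emit(tag, 1));
}

// Reduce step: combine all values emitted for one key.
function reduceCounts(key, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Tiny in-memory driver: group emitted values by key, then reduce each group.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  docs.forEach((doc) =>
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    })
  );
  const result = {};
  groups.forEach((values, key) => {
    result[key] = reduce(key, values);
  });
  return result;
}

const tagCounts = runMapReduce(pictures, mapPicture, reduceCounts);
console.log(tagCounts);
console.log(Object.keys(tagCounts)); // the unique tags
```

The keys of the result are the unique tags, so this covers the same ground as getUniques while also counting occurrences.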
Updating data
update( criteria, objNew, upsert, multi )
> db.myColl.update( { name: "Joe" }, { name: "Joe", age: 20 }, true, false );

save(object) - inserts the object, or updates it if its _id already exists
Update modifier operators

  $inc, $set, $unset, $push, $pushAll, $addToSet, $pop,
  $pull, $pullAll
> db.myColl.update({name:"Joe"}, { $set:{age:20}});

> db.posts.update({author:"Joe"},{$push:{tags:'hockey'}});

> db.posts.update({},{$addToSet:{tags:'hockey'}});
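The difference between `$push` and `$addToSet` is easiest to see side by side. The sketch below models the two operators on a plain JavaScript object. `applyPush` and `applyAddToSet` are hypothetical helpers written for this illustration, they handle only primitive values, and they are not how the server implements the operators.

```javascript
// Hypothetical model of $push: always append.
function applyPush(doc, field, value) {
  const current = doc[field] || [];
  return { ...doc, [field]: [...current, value] };
}

// Hypothetical model of $addToSet: append only if not already present.
function applyAddToSet(doc, field, value) {
  const current = doc[field] || [];
  return current.includes(value) ? doc : { ...doc, [field]: [...current, value] };
}

const original = { author: 'Joe', tags: ['example'] };

const pushedTwice = applyPush(applyPush(original, 'tags', 'hockey'), 'tags', 'hockey');
const addedTwice = applyAddToSet(applyAddToSet(original, 'tags', 'hockey'), 'tags', 'hockey');

console.log(pushedTwice.tags); // [ 'example', 'hockey', 'hockey' ]
console.log(addedTwice.tags);  // [ 'example', 'hockey' ]
```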
Removing data
> db.things.remove({});    // removes all
> db.things.remove({n:1}); // removes all where n == 1
> db.things.remove({_id: myobject._id});
References
>   p = db.postings.findOne();
{
    "_id" : ObjectId("4b866f08234ae01d21d89604"),
    "author" : "jim",
    "title" : "Brewing Methods"
}
>   // get more info on author
>   db.users.findOne( { _id : p.author } )
{   "_id" : "jim", "email" : "jim@gmail.com" }
>   x = { name : 'Biology' }
{   "name" : "Biology" }
>   db.courses.save(x)
>   x
{   "name" : "Biology", "_id" : ObjectId("4b0552b0f0da7d1eb6f126a1") }

> stu = { name : 'Joe', classes : [ new DBRef('courses', x._id) ] }
> db.students.save(stu)
> stu
{
        "name" : "Joe",
        "classes" : [
                 {
                        "$ref" : "courses",
                        "$id" : ObjectId("4b0552b0f0da7d1eb6f126a1")
                 }
        ],
        "_id" : ObjectId("4b0552e4f0da7d1eb6f126a2")
}
> stu.classes[0]
{ "$ref" : "courses", "$id" : ObjectId("4b0552b0f0da7d1eb6f126a1") }

> stu.classes[0].fetch()
{ "_id" : ObjectId("4b0552b0f0da7d1eb6f126a1"), "name" : "Biology" }
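What `fetch()` does can be pictured as a lookup by collection name and `_id`. The following plain-JavaScript sketch uses in-memory arrays in place of collections and string ids in place of ObjectIds. `resolveRef` is a hypothetical helper, not the shell's DBRef implementation, and the Chemistry course is made up.

```javascript
// In-memory stand-in for a database: collection name -> array of documents.
const database = {
  courses: [
    { _id: 'c1', name: 'Biology' },
    { _id: 'c2', name: 'Chemistry' },
  ],
  students: [
    { _id: 's1', name: 'Joe', classes: [{ $ref: 'courses', $id: 'c1' }] },
  ],
};

// Hypothetical helper: follow a { $ref, $id } reference, or return null.
function resolveRef(db, ref) {
  const collection = db[ref.$ref] || [];
  return collection.find((doc) => doc._id === ref.$id) || null;
}

const student = database.students[0];
const courses = student.classes.map((ref) => resolveRef(database, ref));

console.log(courses); // the Biology document
console.log(resolveRef(database, { $ref: 'courses', $id: 'missing' })); // null
```

Each reference costs one extra lookup, which is why references are listed earlier as the replacement for foreign keys rather than for joins.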
Limitations to keep in mind


 Namespace limit (24,000 collections and indexes)
 Database size capped at 2GB on 32-bit systems ... use
 a 64-bit production system!
Licensing

   MongoDB is GNU AGPL 3.0, supported drivers are
   Apache License v2.0
   From www.mongodb.org/display/DOCS/Licensing :
If you are using a vanilla MongoDB server from either source or binary packages you
have NO obligations. You can ignore the rest of this page.
Hands-on example
SQL schema

pictures
pictureId           int
title               varchar
creationTimestamp   int
content             blob

tags
pictureId           int
tag                 varchar

users
userId              int
name                varchar

comments
pictureId           int
userId              int
txt                 varchar
creationTimestamp   int
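One way the four tables above could map onto a single document per picture is sketched below. This is an assumption about a reasonable document layout, not the schema used in the talk's demo app: tags become an array, comments become embedded documents, and users stay separate and are referenced by id. All field values are made up, and the blob content is left out.

```javascript
// Hypothetical picture document combining the pictures, tags and comments tables.
const picture = {
  _id: 1,                        // was pictures.pictureId
  title: 'Sunset',
  creationTimestamp: 1268149684,
  tags: ['firstTag', 'comic'],   // was one row per tag in the tags table
  comments: [                    // was one row per comment in the comments table
    { userId: 7, txt: 'Nice shot', creationTimestamp: 1268150000 },
    { userId: 8, txt: 'Where is this?', creationTimestamp: 1268151000 },
  ],
};

// Users remain their own documents and are looked up by id.
const users = [
  { _id: 7, name: 'jim' },
  { _id: 8, name: 'nancy' },
];

// Reading a picture with commenter names needs no join on tags or comments,
// only a lookup of each referenced user.
const commentLines = picture.comments.map((c) => {
  const user = users.find((u) => u._id === c.userId);
  return `${user ? user.name : 'unknown'}: ${c.txt}`;
});

console.log(commentLines);
```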
let’s see some code ...

ConFoo - Migrating To Mongo Db
