Pierre-Louis Gottfrois
Bastien Murzeau
Apéro Ruby Bordeaux, November 8, 2011
• Brief introduction


• Case study


• Map / Reduce
What is mongoDB?


 mongoDB is a NoSQL database:
        schema-less and
        document-oriented
Schema-less

• Very useful for ‘agile’ development
  (iterations, quick changes,
  flexibility for developers)

• Supports features that, in relational
  databases, would be:
 • nearly impossible (storing an unbounded number of elements, e.g. tags)

 • too complex for what they are (migrations)
Document-oriented
• mongoDB stores documents, not rows

 • documents are stored as BSON
   (binary JSON)

• the query syntax is as rich as SQL's

• the ‘embedded’ documents mechanism
  solves many commonly encountered problems
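
To make the ‘embedded’ idea concrete, here is a minimal sketch in plain JavaScript. The field names (`author`, `body`) and the sample values are hypothetical and only illustrate the shape such a document could take; the slides do not show the actual message fields.

```javascript
// Hypothetical conversation document: the messages live inside the parent
// document instead of in a separate table joined through a foreign key.
const conversation = {
  _id: 1,
  public: false,
  messages: [
    { author: "alice", body: "Hello" },
    { author: "bob", body: "Hi there" },
  ],
};

// Reading the embedded messages needs no join, only a property access.
const authors = conversation.messages.map((message) => message.author);
console.log(authors);
```

The trade-off is that embedded messages are always loaded and saved with their parent, which suits data that is rarely queried on its own.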
Document-oriented

• Documents are stored in a
 collection; in RoR terms, a collection maps to a model


• part of the data is indexed
 to optimize performance


• a document is not a dumping ground!
Storing large volumes of data
• mongoDB (and other NoSQL databases) perform better
 at horizontal scaling
 • servers are added to increase storage
   capacity («sharding»)
 • which also provides better availability
 • optimized load balancing between nodes
 • the growth is transparent to the application
Case study
• The ORM becomes an ODM; the reference gem is mongoid
  • alternatives: mongoMapper, DataMapper
• Creating an application backed by MongoDB
  • rails new nosql
  • edit the Gemfile
    •   gem 'mongoid'

    •   gem 'bson_ext'

  • bundle install
  • rails generate mongoid:config
Case study
• edit config/application.rb
  • #require 'rails/all'
  • require "action_controller/railtie"
  • require "action_mailer/railtie"
  • require "active_resource/railtie"
  • require "rails/test_unit/railtie"
Case study
class Subject
  include Mongoid::Document
  include Mongoid::Timestamps

  has_many :scores,     :as => :scorable, :dependent => :delete, :autosave => true
  has_many :requests,   :dependent => :delete
  belongs_to :author,   :class_name => 'User'
end

class Conversation
  include Mongoid::Document
  include Mongoid::Timestamps

  field :public,            :type => Boolean, :default => false

  has_many :scores,         :as => :scorable, :dependent => :delete
  has_and_belongs_to_many   :subjects
  belongs_to :timeline
  embeds_many :messages
end
Map Reduce
Example


A “ticket” collection

{ "id" : 1, "day" : 20111017, "checkout" : 100 }
{ "id" : 2, "day" : 20111017, "checkout" : 42 }
{ "id" : 3, "day" : 20111017, "checkout" : 215 }
{ "id" : 4, "day" : 20111017, "checkout" : 73 }
Problem

• We want to
 • Compute the sum of the ‘checkout’ values of all objects in our
    tickets collection

 • Be able to distribute this operation over the network
 • Be fast!
• We don’t want to
 • Go over all objects again when an update is made
Map : emit(checkout)

    The ‘map’ function emits (selects) the checkout value
               of each object in our collection

{ "id" : 1, "day" : 20111017, "checkout" : 100 }   →  100
{ "id" : 2, "day" : 20111017, "checkout" : 42 }    →  42
{ "id" : 3, "day" : 20111017, "checkout" : 215 }   →  215
{ "id" : 4, "day" : 20111017, "checkout" : 73 }    →  73
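
The map step can be tried outside MongoDB with a small stand-in. This is only a sketch: the `emit` helper and the `emitted` array below are hypothetical substitutes for what the database provides, and `map.call(ticket)` imitates the way each document is bound to `this`.

```javascript
const tickets = [
  { id: 1, day: 20111017, checkout: 100 },
  { id: 2, day: 20111017, checkout: 42 },
  { id: 3, day: 20111017, checkout: 215 },
  { id: 4, day: 20111017, checkout: 73 },
];

// Hypothetical stand-in for emit(): records every (key, value) pair.
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

// Same shape as a map function: `this` is the current document.
function map() {
  emit(null, this.checkout);
}

tickets.forEach((ticket) => map.call(ticket));

const values = emitted.map((pair) => pair.value);
console.log(values); // [ 100, 42, 215, 73 ]
```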
Reduce : sum(checkout)

                    430
            142             288
        100     42      215     73

(the leaves are the ‘checkout’ values of tickets 1 to 4)
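
The tree of partial sums can be reproduced by hand. The sketch below assumes a summing `reduce` with the usual `(key, values)` signature and splits the four emitted values into two halves, as two workers might.

```javascript
// Summing reducer with the (key, values) signature used by map/reduce.
function reduce(key, values) {
  return values.reduce((sum, value) => sum + value, 0);
}

const leaves = [100, 42, 215, 73];

// Each half is reduced independently, then the partial sums are reduced again.
const left = reduce(null, leaves.slice(0, 2));
const right = reduce(null, leaves.slice(2));
const total = reduce(null, [left, right]);

console.log(left, right, total); // 142 288 430
```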
Reduce function

 The ‘reduce’ function applies the aggregation logic
 to the values received from the ‘map’ function for each key

This function has to be commutative and ‘idempotent’ to be called
      recursively or in a distributed system

reduce(k, [A, B]) == reduce(k, [B, A])
reduce(k, [A, B]) == reduce(k, [reduce(k, [A, B])])
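
Both properties can be checked directly on a candidate reducer. The sketch below tests a summing reducer, then a naive counting reducer (a hypothetical counter-example, not taken from the slides) that breaks as soon as its own output is reduced again.

```javascript
// Summing reducer: safe to apply to partial results.
function sum(key, values) {
  return values.reduce((total, value) => total + value, 0);
}

// Naive counting reducer: counts inputs, so re-reducing loses information.
function naiveCount(key, values) {
  return values.length;
}

const A = 100;
const B = 42;

// Order of the values must not matter.
const sumIsCommutative = sum("k", [A, B]) === sum("k", [B, A]);

// Reducing an already reduced value must give the same result.
const sumIsIdempotent = sum("k", [sum("k", [A, B])]) === sum("k", [A, B]);
const countIsIdempotent =
  naiveCount("k", [naiveCount("k", [A, B])]) === naiveCount("k", [A, B]);

console.log(sumIsCommutative, sumIsIdempotent, countIsIdempotent); // true true false
```

A counting job usually works around this by emitting 1 for each document and summing, so that the reduced value has the same meaning as the emitted ones.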
Inherently Distributed

                    430
            142             288
        100     42      215     73

(each partial sum can be computed by a different worker)
Distributed
Since the ‘map’ function emits the objects to be reduced
and the ‘reduce’ function processes the emitted
   objects independently, the work can be distributed
            across multiple workers.




         map                     reduce
Logarithmic Update

For the same reason, when an object is updated, we
    don’t have to reprocess every object.

   We can call the ‘map’ function on the updated
                     objects only.
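
Why the update cost is logarithmic can be sketched with a tree of partial sums kept in an array. This only illustrates the idea behind the diagrams; it is an assumption of this example, not a description of how MongoDB stores map/reduce results, and the helper names `buildTree` and `update` are hypothetical.

```javascript
// Sum tree in an array: node i has children 2i and 2i+1, leaves start at index n.
function buildTree(leaves) {
  const n = leaves.length; // assumed to be a power of two in this sketch
  const tree = new Array(2 * n).fill(0);
  for (let i = 0; i < n; i++) tree[n + i] = leaves[i];
  for (let i = n - 1; i >= 1; i--) tree[i] = tree[2 * i] + tree[2 * i + 1];
  return tree;
}

// Replaces one leaf and recomputes only its ancestors.
// Returns how many inner nodes had to be recomputed.
function update(tree, leafIndex, value) {
  const n = tree.length / 2;
  let node = n + leafIndex;
  tree[node] = value;
  let recomputed = 0;
  for (node = Math.floor(node / 2); node >= 1; node = Math.floor(node / 2)) {
    tree[node] = tree[2 * node] + tree[2 * node + 1];
    recomputed++;
  }
  return recomputed;
}

const tree = buildTree([100, 42, 215, 73]);
const before = tree[1];                  // root before the update
const recomputed = update(tree, 2, 210); // ticket 3 goes from 215 to 210
const after = tree[1];                   // root after the update

console.log(before, after, recomputed); // 430 425 2
```

With four leaves only two inner nodes are recomputed, the parent partial sum and the root, instead of all four values being summed again.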
Logarithmic Update

Step 1: the ‘checkout’ of ticket 3 is updated from 215 to 210

                    430
            142             288
        100     42      215     73

Step 2: ‘map’ is called again on ticket 3 only

                    430
            142             288
        100     42      210     73

Step 3: the parent partial sum is reduced again

                    430
            142             283
        100     42      210     73

Step 4: the total is reduced again

                    425
            142             283
        100     42      210     73
Let’s do some code!
$> mongo

> db.tickets.save({ "_id": 1, "day": 20111017, "checkout": 100 })
> db.tickets.save({ "_id": 2, "day": 20111017, "checkout": 42 })
> db.tickets.save({ "_id": 3, "day": 20111017, "checkout": 215 })
> db.tickets.save({ "_id": 4, "day": 20111017, "checkout": 73 })

> db.tickets.count()
4

> db.tickets.find()
{ "_id" : 1, "day" : 20111017, "checkout" : 100 }
...

> db.tickets.find({ "_id": 1 })
{ "_id" : 1, "day" : 20111017, "checkout" : 100 }
> var map = function() {
... emit(null, this.checkout)
}

> var reduce = function(key, values) {
... var sum = 0
... for (var index in values) sum += values[index]
... return sum
}
Temporary Collection
> sumOfCheckouts = db.tickets.mapReduce(map, reduce)
{
  "result" : "tmp.mr.mapreduce_123456789_4",
  "timeMillis" : 8,
  "counts" : { "input" : 4, "emit" : 4, "output" : 1 },
  "ok" : 1
}

> db.getCollectionNames()
[
  "tickets",
  "tmp.mr.mapreduce_123456789_4"
]

> db[sumOfCheckouts.result].find()
{ "_id" : null, "value" : 430 }
Persistent Collection
> db.tickets.mapReduce(map, reduce, { "out" : "sumOfCheckouts" })

> db.getCollectionNames()
[
  "sumOfCheckouts",
  "tickets",
  "tmp.mr.mapreduce_123456789_4"
]

> db.sumOfCheckouts.find()
{ "_id" : null, "value" : 430 }

> db.sumOfCheckouts.findOne().value
430
Reduce by Date
> var map = function() {
... emit(this.day, this.checkout)
}

> var reduce = function(key, values) {
... var sum = 0
... for (var index in values) sum += values[index]
... return sum
}
> db.tickets.mapReduce(map, reduce, { "out" : "sumOfCheckouts" })

> db.sumOfCheckouts.find()
{ "_id" : 20111017, "value" : 430 }
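
Grouping by key can be emulated in memory to see what happens with more than one day. In this sketch the `mapReduce` helper is hypothetical, `emit` is passed to `map` as an argument (unlike in the mongo shell, where it is global), and the fifth ticket dated 20111018 is invented for the illustration.

```javascript
// Hypothetical in-memory runner: groups emitted values by key, then reduces each group.
function mapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => map.call(doc, emit));
  return Array.from(groups, ([key, values]) => ({
    _id: key,
    value: reduce(key, values),
  }));
}

const tickets = [
  { _id: 1, day: 20111017, checkout: 100 },
  { _id: 2, day: 20111017, checkout: 42 },
  { _id: 3, day: 20111017, checkout: 215 },
  { _id: 4, day: 20111017, checkout: 73 },
  { _id: 5, day: 20111018, checkout: 10 }, // invented extra ticket
];

const sumsByDay = mapReduce(
  tickets,
  function (emit) {
    emit(this.day, this.checkout);
  },
  function (key, values) {
    return values.reduce((sum, value) => sum + value, 0);
  }
);

console.log(sumsByDay);
// [ { _id: 20111017, value: 430 }, { _id: 20111018, value: 10 } ]
```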
What we can do
Scored Subjects per User
Subject   User   Score
   1       1       2
   1       1       2
   1       2       2
   2       1       2
   2       2      10
   2       2       5
Scored Subjects per User (reduced)
Subject   User   Score
   1       1       4
   1       2       2
   2       1       2
   2       2      15
$> mongo

> db.scores.save({ "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 })
> db.scores.save({ "_id": 2, "subject_id": 1, "user_id": 1, "score": 2 })
> db.scores.save({ "_id": 3, "subject_id": 1, "user_id": 2, "score": 2 })
> db.scores.save({ "_id": 4, "subject_id": 2, "user_id": 1, "score": 2 })
> db.scores.save({ "_id": 5, "subject_id": 2, "user_id": 2, "score": 10 })
> db.scores.save({ "_id": 6, "subject_id": 2, "user_id": 2, "score": 5 })

> db.scores.count()
6

> db.scores.find()
{ "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 }
...

> db.scores.find({ "_id": 1 })
{ "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 }
> var map = function() {
... emit([this.user_id, this.subject_id].join("-"),
...      {subject_id: this.subject_id, user_id: this.user_id, score: this.score});
}

> var reduce = function(key, values) {
... var result = {user_id: "", subject_id: "", score: 0};
... values.forEach(function (value) {
...   result.score += value.score;
...   result.user_id = value.user_id;
...   result.subject_id = value.subject_id;
... });
... return result
}
ReducedScores Collection
> db.scores.mapReduce(map, reduce, { "out" : "reduced_scores" })

> db.getCollectionNames()
[
  "reduced_scores",
  "scores"
]

> db.reduced_scores.find()
{ "_id" : "1-1", "value" : { "user_id" : 1, "subject_id" : 1, "score" : 4 } }
{ "_id" : "1-2", "value" : { "user_id" : 1, "subject_id" : 2, "score" : 2 } }
{ "_id" : "2-1", "value" : { "user_id" : 2, "subject_id" : 1, "score" : 2 } }
{ "_id" : "2-2", "value" : { "user_id" : 2, "subject_id" : 2, "score" : 15 } }

> db.reduced_scores.findOne().value.score
4
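
Because the reducer returns an object with the same fields as the emitted values, its output can be fed back into it. The sketch below checks this for the key "2-2" by reducing the two scores in one pass, then as two partial results; the split into two partials is an assumption standing in for two workers.

```javascript
// Same logic as the score reducer: sums scores, keeps the identifying fields.
function reduce(key, values) {
  const result = { user_id: "", subject_id: "", score: 0 };
  values.forEach((value) => {
    result.score += value.score;
    result.user_id = value.user_id;
    result.subject_id = value.subject_id;
  });
  return result;
}

// The two values emitted under key "2-2" (user 2, subject 2).
const first = { subject_id: 2, user_id: 2, score: 10 };
const second = { subject_id: 2, user_id: 2, score: 5 };

// Single pass over both values.
const direct = reduce("2-2", [first, second]);

// Two partial reductions, then a reduction of the partial results.
const partialA = reduce("2-2", [first]);
const partialB = reduce("2-2", [second]);
const combined = reduce("2-2", [partialA, partialB]);

console.log(direct.score, combined.score); // 15 15
```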
Dealing with Rails Query

ruby-1.9.2-p180 :007 > ReducedScores.first
 => #<ReducedScores _id: 1-1, _type: nil, value: {"user_id"=>BSON::ObjectId('...'),
"subject_id"=>BSON::ObjectId('...'), "score"=>4.0}>

ruby-1.9.2-p180 :008 > ReducedScores.where("value.user_id" => u1.id).count
 => 2

ruby-1.9.2-p180 :009 > ReducedScores.where("value.user_id" => u1.id).first.value['score']
 => 4.0

ruby-1.9.2-p180 :010 > ReducedScores.where("value.user_id" => u1.id).last.value['score']
 => 2.0
Questions?

 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Apéro RubyBdx - MongoDB - 8-11-2011

  • 1. Pierre-Louis Gottfrois, Bastien Murzeau. Apéro Ruby Bordeaux, 8 November 2011
  • 2. Outline
      • Brief introduction
      • Case study
      • Map / Reduce
  • 3. What is mongoDB?
      mongoDB is a NoSQL database: schemaless and document-oriented.
  • 4. Schemaless
      • Very useful in ‘agile’ development (iterations, fast changes, flexibility for developers)
      • Supports features that, in relational databases, would be:
        • nearly impossible (storing open-ended items, e.g. tags)
        • overly complex for what they are (migrations)
  • 5. Document-oriented
      • mongoDB stores documents, not rows
        • documents are stored as JSON, more precisely binary JSON (BSON)
      • the query syntax is as rich as SQL's
      • the ‘embedded’ document mechanism solves many of the problems encountered
  • 6. Document-oriented
      • Documents are stored in a collection; in RoR, a collection corresponds to a model
      • part of this data is indexed to optimize performance
      • a document is not a dumping ground!
  • 7. Storing large volumes of data
      • mongoDB (and other NoSQL databases) perform better at horizontal scaling
        • servers are added to increase storage capacity (“sharding”)
        • which also gives better availability
        • optimized load balancing between nodes
        • the growth is transparent to the application
  • 8. Case study
      • ORM becomes ODM; the reference gem is mongoid
        • alternatives: mongoMapper, DataMapper
      • Creating an application backed by the NoSQL database MongoDB
        • rails new nosql
        • edit the Gemfile
          • gem 'mongoid'
          • gem 'bson_ext'
        • bundle install
        • rails generate mongoid:config
  • 9. Case study
      • edit config/application.rb (comment out rails/all and require the railties individually, leaving out Active Record)
        • #require 'rails/all'
        • require "action_controller/railtie"
        • require "action_mailer/railtie"
        • require "active_resource/railtie"
        • require "rails/test_unit/railtie"
  • 10. Case study
        class Subject
          include Mongoid::Document
          include Mongoid::Timestamps

          has_many :scores, :as => :scorable, :dependent => :delete, :autosave => true
          has_many :requests, :dependent => :delete
          belongs_to :author, :class_name => 'User'
        end

        class Conversation
          include Mongoid::Document
          include Mongoid::Timestamps

          field :public, :type => Boolean, :default => false

          has_many :scores, :as => :scorable, :dependent => :delete
          has_and_belongs_to_many :subjects
          belongs_to :timeline
          embeds_many :messages
        end
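To make the ‘embedded’ mechanism concrete, here is a minimal sketch in plain JavaScript of what a conversation document with embedded messages might look like. The `author` and `body` fields and all values are assumptions made up for illustration; the slides only state that a Conversation embeds many messages.

```javascript
// Hypothetical conversation document: the messages live inside the parent
// document instead of in a separate table joined by a foreign key.
const conversation = {
  _id: "c1",                      // illustrative id, not from the slides
  public: false,                  // matches the `public` field declared on the slide
  messages: [                     // embeds_many :messages
    { author: "alice", body: "Bonjour" },  // field names are assumptions
    { author: "bob", body: "Salut" },
  ],
};

// The whole document round-trips as a single JSON value.
const restored = JSON.parse(JSON.stringify(conversation));

console.log(restored.messages.length); // 2
```

Reading the conversation returns its messages in the same document, which is the kind of problem the embedded mechanism is meant to solve.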
  • 12. Example: a "ticket" collection
        { "id" : 1, "day" : 20111017, "checkout" : 100 }
        { "id" : 2, "day" : 20111017, "checkout" : 42 }
        { "id" : 3, "day" : 20111017, "checkout" : 215 }
        { "id" : 4, "day" : 20111017, "checkout" : 73 }
  • 13. The problem
      • We want to
        • Calculate the sum of the ‘checkout’ values of all objects in our ticket collection
        • Be able to distribute this operation over the network
        • Be fast!
      • We don’t want to
        • Go over all objects again when an update is made
  • 14. Map : emit(checkout)
      The ‘map’ function emits (selects) the checkout value of every object in our collection.
        100  42  215  73
        (tickets 1 to 4, "day" : 20111017, "checkout" : 100, 42, 215, 73)
  • 15. Reduce : sum(checkout)
        430
        142  288
        100  42  215  73
        (tickets 1 to 4, "day" : 20111017, "checkout" : 100, 42, 215, 73)
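The map and reduce steps on these two slides can be sketched in plain JavaScript, without a MongoDB server. This is only an illustration of the idea: the variable names are assumptions, and the pairwise grouping simply mirrors the tree drawn on the slide.

```javascript
// The four tickets from the example collection.
const tickets = [
  { id: 1, day: 20111017, checkout: 100 },
  { id: 2, day: 20111017, checkout: 42 },
  { id: 3, day: 20111017, checkout: 215 },
  { id: 4, day: 20111017, checkout: 73 },
];

// Map: emit one checkout value per ticket.
const emitted = tickets.map((ticket) => ticket.checkout);

// Reduce: sum a list of values.
const sum = (values) => values.reduce((acc, value) => acc + value, 0);

// Reduce pairs first, then reduce the partial results, as in the tree.
const left = sum(emitted.slice(0, 2));   // 100 + 42
const right = sum(emitted.slice(2));     // 215 + 73
const total = sum([left, right]);

console.log(left, right, total); // 142 288 430
```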
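  • 16. Reduce function
      The ‘reduce’ function applies the algorithmic logic to each key and the values received for it from the ‘map’ function.
      This function has to be commutative, associative and ‘idempotent’ so that it can be re-applied to its own output or run in a distributed system:
        reduce(k, [A, B]) == reduce(k, [B, A])
        reduce(k, [C, reduce(k, [A, B])]) == reduce(k, [C, A, B])
        reduce(k, [reduce(k, [A, B])]) == reduce(k, [A, B])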
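A quick way to see why these properties matter is to check them on a candidate reduce function. The sketch below, in plain JavaScript, tests a sum (which satisfies them) and an average (which does not, so it cannot be used directly as a reduce function). The sample values and the `average` counter-example are assumptions added for illustration.

```javascript
// Candidate reduce function: sum of the values for a key.
const reduce = (key, values) => values.reduce((acc, v) => acc + v, 0);

const A = 100, B = 42, C = 215;

// Order of the values must not matter.
const commutative = reduce(null, [A, B]) === reduce(null, [B, A]);

// Reducing a partial result together with more values must give the same answer.
const associative =
  reduce(null, [C, reduce(null, [A, B])]) === reduce(null, [C, A, B]);

// Re-reducing an already reduced value must not change it.
const idempotent =
  reduce(null, [reduce(null, [A, B])]) === reduce(null, [A, B]);

// Counter-example: a plain average breaks when partial results are re-reduced.
const average = (key, values) =>
  values.reduce((acc, v) => acc + v, 0) / values.length;
const averageIsAssociative =
  average(null, [C, average(null, [A, B])]) === average(null, [C, A, B]);

console.log(commutative, associative, idempotent, averageIsAssociative);
// true true true false
```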
  • 17. Inherently Distributed
        430
        142  288
        100  42  215  73
        (tickets 1 to 4, "day" : 20111017, "checkout" : 100, 42, 215, 73)
  • 18. Distributed
      Since the ‘map’ function emits objects to be reduced, and the ‘reduce’ function processes each group of emitted objects independently, the work can be distributed across multiple workers.
        (diagram: map, then reduce)
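As a rough sketch of that distribution, the snippet below splits the emitted values between simulated workers, lets each one reduce its own share, and then reduces the partial results. The helper `reduceAcrossWorkers` and the chunking strategy are assumptions for illustration; real workers would run on separate machines.

```javascript
const checkouts = [100, 42, 215, 73];

// Same reduce function on every worker: sum the values for a key.
const reduce = (key, values) => values.reduce((acc, v) => acc + v, 0);

// Hypothetical helper: split the values into one chunk per worker,
// reduce each chunk independently, then reduce the partial results.
function reduceAcrossWorkers(values, workerCount) {
  const size = Math.ceil(values.length / workerCount);
  const partials = [];
  for (let i = 0; i < values.length; i += size) {
    partials.push(reduce("total", values.slice(i, i + size)));
  }
  return { partials, total: reduce("total", partials) };
}

const twoWorkers = reduceAcrossWorkers(checkouts, 2);
const fourWorkers = reduceAcrossWorkers(checkouts, 4);

console.log(twoWorkers.partials, twoWorkers.total); // [ 142, 288 ] 430
console.log(fourWorkers.total);                     // 430
```

The total is the same whatever the number of workers, which is what the properties required of the reduce function guarantee.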
  • 19. Logarithmic Update
      For the same reason, when an object is updated we don’t have to reprocess every object. We can call the ‘map’ function on the updated objects only.
  • 20. Logarithmic Update (ticket 3 is updated: "checkout" : 215 becomes 210)
        430
        142  288
        100  42  215  73
        (tickets 1 to 4, "day" : 20111017, "checkout" : 100, 42, 210, 73)
  • 21. Logarithmic Update (the updated value is emitted again)
        430
        142  288
        100  42  210  73
        (tickets 1 to 4, "day" : 20111017, "checkout" : 100, 42, 210, 73)
  • 22. Logarithmic Update (the partial sum above it is reduced again)
        430
        142  283
        100  42  210  73
        (tickets 1 to 4, "day" : 20111017, "checkout" : 100, 42, 210, 73)
  • 23. Logarithmic Update (the total is reduced again)
        425
        142  283
        100  42  210  73
        (tickets 1 to 4, "day" : 20111017, "checkout" : 100, 42, 210, 73)
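The update sequence on these slides can be reproduced with a small sum tree in plain JavaScript. This is a sketch of the idea only, under the assumption that partial sums are kept level by level; it does not describe how MongoDB stores or recomputes results internally, and the function names are made up for the example.

```javascript
// Build the tree level by level: levels[0] holds the emitted values,
// the last level holds the single total.
function buildLevels(leaves) {
  const levels = [leaves.slice()];
  while (levels[levels.length - 1].length > 1) {
    const below = levels[levels.length - 1];
    const above = [];
    for (let i = 0; i < below.length; i += 2) {
      above.push(below[i] + (i + 1 < below.length ? below[i + 1] : 0));
    }
    levels.push(above);
  }
  return levels;
}

// Replace one leaf and recompute only its ancestors.
// Returns the number of nodes touched (leaf included).
function updateLeaf(levels, index, value) {
  levels[0][index] = value;
  let touched = 1;
  for (let depth = 1; depth < levels.length; depth += 1) {
    index = Math.floor(index / 2);
    const below = levels[depth - 1];
    const left = below[2 * index];
    const right = 2 * index + 1 < below.length ? below[2 * index + 1] : 0;
    levels[depth][index] = left + right;
    touched += 1;
  }
  return touched;
}

const levels = buildLevels([100, 42, 215, 73]);
console.log(JSON.stringify(levels)); // [[100,42,215,73],[142,288],[430]]

// Ticket 3 changes from 215 to 210: only 210, 283 and 425 are recomputed.
const touched = updateLeaf(levels, 2, 210);
console.log(JSON.stringify(levels), touched); // [[100,42,210,73],[142,283],[425]] 3
```

With four leaves, three nodes are touched instead of all seven; the number of touched nodes grows with the height of the tree, hence "logarithmic".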
  • 25. $> mongo
        > db.tickets.save({ "_id": 1, "day": 20111017, "checkout": 100 })
        > db.tickets.save({ "_id": 2, "day": 20111017, "checkout": 42 })
        > db.tickets.save({ "_id": 3, "day": 20111017, "checkout": 215 })
        > db.tickets.save({ "_id": 4, "day": 20111017, "checkout": 73 })
        > db.tickets.count()
        4
        > db.tickets.find()
        { "_id" : 1, "day" : 20111017, "checkout" : 100 }
        ...
        > db.tickets.find({ "_id": 1 })
        { "_id" : 1, "day" : 20111017, "checkout" : 100 }
  • 26. > var map = function() {
        ...   emit(null, this.checkout)
        ... }
        > var reduce = function(key, values) {
        ...   var sum = 0
        ...   for (var index in values) sum += values[index]
        ...   return sum
        ... }
  • 27. Temporary Collection
        > sumOfCheckouts = db.tickets.mapReduce(map, reduce)
        {
          "result" : "tmp.mr.mapreduce_123456789_4",
          "timeMillis" : 8,
          "counts" : { "input" : 4, "emit" : 4, "output" : 1 },
          "ok" : 1
        }
        > db.getCollectionNames()
        [ "tickets", "tmp.mr.mapreduce_123456789_4" ]
        > db[sumOfCheckouts.result].find()
        { "_id" : null, "value" : 430 }
  • 28. Persistent Collection
        > db.tickets.mapReduce(map, reduce, { "out" : "sumOfCheckouts" })
        > db.getCollectionNames()
        [ "sumOfCheckouts", "tickets", "tmp.mr.mapreduce_123456789_4" ]
        > db.sumOfCheckouts.find()
        { "_id" : null, "value" : 430 }
        > db.sumOfCheckouts.findOne().value
        430
  • 30. > var map = function() {
        ...   emit(this.day, this.checkout)
        ... }
        > var reduce = function(key, values) {
        ...   var sum = 0
        ...   for (var index in values) sum += values[index]
        ...   return sum
        ... }
  • 31. > db.tickets.mapReduce(map, reduce, { "out" : "sumOfCheckouts" })
        > db.sumOfCheckouts.find()
        { "_id" : 20111017, "value" : 430 }
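To see what emitting a key changes, here is a small stand-in for `mapReduce` written in plain JavaScript. The `mapReduce` helper below is hypothetical and only mimics the grouping behaviour; unlike the mongo shell, it passes `emit` to the map function as an argument. The fifth ticket, dated 20111018, is an assumption added so that two groups appear.

```javascript
// Hypothetical in-memory stand-in for db.collection.mapReduce:
// group emitted values by key, then reduce each group.
function mapReduce(documents, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of documents) map.call(doc, emit);
  return [...groups].map(([key, values]) => ({
    _id: key,
    value: reduce(key, values),
  }));
}

const tickets = [
  { _id: 1, day: 20111017, checkout: 100 },
  { _id: 2, day: 20111017, checkout: 42 },
  { _id: 3, day: 20111017, checkout: 215 },
  { _id: 4, day: 20111017, checkout: 73 },
  { _id: 5, day: 20111018, checkout: 10 }, // extra ticket, not in the slides
];

// Emit the day as the key so that checkouts are summed per day.
const map = function (emit) { emit(this.day, this.checkout); };
const reduce = (key, values) => values.reduce((acc, v) => acc + v, 0);

const sumOfCheckouts = mapReduce(tickets, map, reduce);
console.log(sumOfCheckouts);
// [ { _id: 20111017, value: 430 }, { _id: 20111018, value: 10 } ]
```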
  • 33. Scored Subjects per User
        Subject  User  Score
        1        1     2
        1        1     2
        1        2     2
        2        1     2
        2        2     10
        2        2     5
  • 34. Scored Subjects per User (reduced)
        Subject  User  Score
        1        1     4
        1        2     2
        2        1     2
        2        2     15
  • 35. $> mongo
        > db.scores.save({ "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 })
        > db.scores.save({ "_id": 2, "subject_id": 1, "user_id": 1, "score": 2 })
        > db.scores.save({ "_id": 3, "subject_id": 1, "user_id": 2, "score": 2 })
        > db.scores.save({ "_id": 4, "subject_id": 2, "user_id": 1, "score": 2 })
        > db.scores.save({ "_id": 5, "subject_id": 2, "user_id": 2, "score": 10 })
        > db.scores.save({ "_id": 6, "subject_id": 2, "user_id": 2, "score": 5 })
        > db.scores.count()
        6
        > db.scores.find()
        { "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 }
        ...
        > db.scores.find({ "_id": 1 })
        { "_id": 1, "subject_id": 1, "user_id": 1, "score": 2 }
  • 36. > var map = function() {
        ...   emit([this.user_id, this.subject_id].join("-"),
        ...        {subject_id: this.subject_id, user_id: this.user_id, score: this.score});
        ... }
        > var reduce = function(key, values) {
        ...   var result = {user_id: "", subject_id: "", score: 0};
        ...   values.forEach(function (value) {
        ...     result.score += value.score;
        ...     result.user_id = value.user_id;
        ...     result.subject_id = value.subject_id;
        ...   });
        ...   return result
        ... }
  • 37. ReducedScores Collection
        > db.scores.mapReduce(map, reduce, { "out" : "reduced_scores" })
        > db.getCollectionNames()
        [ "reduced_scores", "scores" ]
        > db.reduced_scores.find()
        { "_id" : "1-1", "value" : { "user_id" : 1, "subject_id" : 1, "score" : 4 } }
        { "_id" : "1-2", "value" : { "user_id" : 1, "subject_id" : 2, "score" : 2 } }
        { "_id" : "2-1", "value" : { "user_id" : 2, "subject_id" : 1, "score" : 2 } }
        { "_id" : "2-2", "value" : { "user_id" : 2, "subject_id" : 2, "score" : 15 } }
        > db.reduced_scores.findOne().value.score
        4
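The same reduction can be checked by hand in plain JavaScript, without a server. The sketch below groups the six scores by a "user-subject" key and sums them; it is a simplified stand-in for the map and reduce functions of the deck, and the `reduced` variable name is an assumption.

```javascript
// The six score documents saved in the mongo shell session.
const scores = [
  { _id: 1, subject_id: 1, user_id: 1, score: 2 },
  { _id: 2, subject_id: 1, user_id: 1, score: 2 },
  { _id: 3, subject_id: 1, user_id: 2, score: 2 },
  { _id: 4, subject_id: 2, user_id: 1, score: 2 },
  { _id: 5, subject_id: 2, user_id: 2, score: 10 },
  { _id: 6, subject_id: 2, user_id: 2, score: 5 },
];

// Group by "user_id-subject_id" and accumulate the score, keeping the
// same value shape as the one emitted by the map function.
const reduced = {};
for (const s of scores) {
  const key = [s.user_id, s.subject_id].join("-");
  const current =
    reduced[key] || { user_id: s.user_id, subject_id: s.subject_id, score: 0 };
  current.score += s.score;
  reduced[key] = current;
}

console.log(reduced["1-1"].score, reduced["1-2"].score,
            reduced["2-1"].score, reduced["2-2"].score); // 4 2 2 15
```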
  • 38. Querying from Rails
        ruby-1.9.2-p180 :007 > ReducedScores.first
         => #<ReducedScores _id: 1-1, _type: nil, value: {"user_id"=>BSON::ObjectId('...'), "subject_id"=>BSON::ObjectId('...'), "score"=>4.0}>
        ruby-1.9.2-p180 :008 > ReducedScores.where("value.user_id" => u1.id).count
         => 2
        ruby-1.9.2-p180 :009 > ReducedScores.where("value.user_id" => u1.id).first.value['score']
         => 4.0
        ruby-1.9.2-p180 :010 > ReducedScores.where("value.user_id" => u1.id).last.value['score']
         => 2.0
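The "value.user_id" criterion uses dot notation to reach a field nested inside the `value` sub-document. The sketch below imitates that lookup in plain JavaScript on the reduced scores; the helpers `dig` and `where` are hypothetical and are not part of Mongoid or MongoDB, and integer ids stand in for the ObjectIds shown in the Rails console.

```javascript
// Reduced scores as produced by the map/reduce job.
const reducedScores = [
  { _id: "1-1", value: { user_id: 1, subject_id: 1, score: 4 } },
  { _id: "1-2", value: { user_id: 1, subject_id: 2, score: 2 } },
  { _id: "2-1", value: { user_id: 2, subject_id: 1, score: 2 } },
  { _id: "2-2", value: { user_id: 2, subject_id: 2, score: 15 } },
];

// Hypothetical helper: follow a dotted path such as "value.user_id".
function dig(doc, path) {
  return path
    .split(".")
    .reduce((node, key) => (node == null ? undefined : node[key]), doc);
}

// Hypothetical helper: keep documents whose nested fields equal the criteria.
function where(docs, criteria) {
  return docs.filter((doc) =>
    Object.entries(criteria).every(([path, expected]) => dig(doc, path) === expected)
  );
}

const forUser1 = where(reducedScores, { "value.user_id": 1 });

console.log(forUser1.length);                             // 2
console.log(forUser1[0].value.score);                     // 4
console.log(forUser1[forUser1.length - 1].value.score);   // 2
```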