Directory Layout
•  Separate files per database
•  Aggressive preallocation
•  Files contain one or more extents

  -rw-------  1 ben ben   64M May  1 19:14 test.0
  -rw-------  1 ben ben  128M May  1 19:14 test.1
  -rw-------  1 ben ben  256M May  1 18:25 test.2
  -rw-------  1 ben ben  512M May  1 19:14 test.3
  -rw-------  1 ben ben  1.0G May  1 19:14 test.4
  -rw-------  1 ben ben  2.0G May  1 18:58 test.5
  -rw-------  1 ben ben   16M May  1 19:14 test.ns
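
The listing above shows each data file at twice the size of the previous one, from 64M to 2.0G. A minimal sketch of that progression follows. The function name, the parameters and the 2GB cap on later files are assumptions for illustration (the cap is inferred from the 2GB maximum offset mentioned under Data Structures); this is not the server's allocation code.

```python
def datafile_sizes(count, first_mb=64, cap_mb=2048):
    """Return the sizes in MB of data files <db>.0 .. <db>.(count - 1)."""
    sizes = []
    size = first_mb
    for _ in range(count):
        sizes.append(size)
        # Double for the next file, but never beyond the assumed 2GB cap.
        size = min(size * 2, cap_mb)
    return sizes


sizes = datafile_sizes(6)
for number, size_mb in enumerate(sizes):
    print(f"test.{number}: {size_mb} MB")
```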




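Memory Mapping
  Process virtual memory, from high to low addresses:

    0x7fffffffffff
      STACK
      …
      LIBS
      …
      test.ns    <- mapped from disk
      test.0     <- mapped from disk
      test.1     <- mapped from disk
      …
      HEAP
      MONGOD
      NULL
    0x0

  A Document { … } is addressed inside the mapped region of a data file.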
  
Data Structures
•  DiskLoc
  •  Stores file number and offset of data on disk
  •  Record *r = mmap base + DiskLoc.offset
  •  Max offset is 2^31 (2GB)
•  NamespaceDetails
  •  Stores collection metadata
•  Extent
  •  Stores contiguous blocks within a namespace
  •  Max extent size is 2GB
•  Record
  •  Holds a BSON document or B-tree bucket
  •  DeletedRecord overwrites a Record
  •  Includes Padding
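
The DiskLoc rule above (mapped base address plus offset) can be tried out with Python's mmap module. This is only a sketch: the DiskLoc tuple, the read_at helper and the throwaway file contents are hypothetical stand-ins, not MongoDB's structures.

```python
import mmap
import os
import tempfile
from collections import namedtuple

# Hypothetical stand-in for DiskLoc: a file number plus a byte offset.
DiskLoc = namedtuple("DiskLoc", ["file_no", "offset"])


def read_at(mapped_files, loc, length):
    """Return `length` bytes found at `loc` inside the mapped file."""
    view = mapped_files[loc.file_no]  # plays the role of the "mmap base"
    return view[loc.offset:loc.offset + length]


# Build a small throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.0")
    with open(path, "wb") as f:
        f.write(b"\x00" * 4096)      # filler before the "record"
        f.write(b"record-bytes")     # 12 bytes at offset 4096
    with open(path, "rb") as f:
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            data = read_at({0: mapped}, DiskLoc(0, 4096), 12)
        finally:
            mapped.close()

print(data)
```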
Namespace Details
•    Holds metadata about a collection or index
•    Stored in 1KB buckets in <dbname>.ns file
•    .ns file fixed size of 16MB
•    Maintains document count
•    Contains heads of linked lists

  NamespaceDetails fields:
    firstExtent
    lastExtent
    _indexes[]
    stats
    freeList[]
  
Extent Structure

  Extent                   Extent
    length                   length
    xNext                    xNext
    xPrev                    xPrev
    firstRecord              firstRecord
    lastRecord               lastRecord
  
Extents
  > db.foo.validate({full: true}).extents.forEach(
        function(z){ print(z.loc + "\t\t" + z.size); })
  0:3000          20480
  0:12000         81920
  0:26000         327680
  0:76000         1310720
  0:1da000        5242880
  0:76a000        6291456
  0:d6a000        7553024
  0:16de000       9064448
  0:1f83000       10878976
  0:29e3000       13058048
  1:2000          15671296
  1:ef4000        18808832
  1:29e4000       22573056
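
The extent list above can be read as "file number:hex offset" followed by the extent size in bytes. The sketch below parses those rows and computes the ratio between consecutive sizes. For this sample, the first four steps are 4x and the later ones round to 1.2x. The helper name and the hexadecimal reading of the offset are assumptions; the ratios describe only this output, not a documented allocation rule.

```python
# Rows copied from the validate() output above.
EXTENTS = """\
0:3000 20480
0:12000 81920
0:26000 327680
0:76000 1310720
0:1da000 5242880
0:76a000 6291456
0:d6a000 7553024
0:16de000 9064448
0:1f83000 10878976
0:29e3000 13058048
1:2000 15671296
1:ef4000 18808832
1:29e4000 22573056
"""


def parse_extents(text):
    """Turn 'file:hexoffset size' lines into (file_no, offset, size) tuples."""
    rows = []
    for line in text.split("\n"):
        if not line.strip():
            continue
        loc, size = line.split()
        file_no, offset = loc.split(":")
        rows.append((int(file_no), int(offset, 16), int(size)))
    return rows


rows = parse_extents(EXTENTS)
sizes = [size for _, _, size in rows]
# Growth factor from each extent to the next, rounded to two decimals.
ratios = [round(b / a, 2) for a, b in zip(sizes, sizes[1:])]
print(ratios)
```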
  
Index Extents

  > db.system.namespaces.find()
  { "name" : "test.foo" }
  { "name" : "test.system.indexes" }
  { "name" : "test.foo.$_id_" }

  > db["foo.$_id_"].validate({full: true}).extents.forEach(
        function(z){ print(z.loc + "\t\t" + z.size); })
  0:9000          36864
  0:1b6000        147456
  0:6da000        589824
  0:149e000       2359296
  1:20e4000       9437184
  
Extents and Records

  Extent               Data Record          Data Record
    length               length               length
    xNext                rNext                rNext
    xPrev                rPrev                rPrev
    firstRecord          Document             Document
    lastRecord           { _id: "foo",        { _id: "foo",
                           ... }                ... }
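
A rough sketch of how the fields above could be walked for a full scan: follow firstRecord and rNext inside each extent, then xNext to the next extent. The Python classes and helper names here are hypothetical, and they use object references where the on-disk format uses DiskLoc values.

```python
class Record:
    def __init__(self, document):
        self.document = document
        self.r_next = None
        self.r_prev = None


class Extent:
    def __init__(self, length):
        self.length = length
        self.x_next = None
        self.x_prev = None
        self.first_record = None
        self.last_record = None

    def append(self, document):
        """Add a record at the tail of this extent's record list."""
        record = Record(document)
        record.r_prev = self.last_record
        if self.last_record is None:
            self.first_record = record
        else:
            self.last_record.r_next = record
        self.last_record = record


def link(extents):
    """Chain extents through x_next / x_prev and return the first one."""
    for left, right in zip(extents, extents[1:]):
        left.x_next = right
        right.x_prev = left
    return extents[0]


def scan(first_extent):
    """Yield every document, extent by extent, record by record."""
    extent = first_extent
    while extent is not None:
        record = extent.first_record
        while record is not None:
            yield record.document
            record = record.r_next
        extent = extent.x_next


a, b = Extent(20480), Extent(81920)
a.append({"_id": "foo"})
a.append({"_id": "bar"})
b.append({"_id": "baz"})
docs = list(scan(link([a, b])))
print(docs)
```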
  
BSON Format

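  { hello: "world" }

  \x16\x00\x00\x00     Doc Length (22 bytes)
  \x02                 Value Type (string)
  hello\x00            field name
  \x06\x00\x00\x00     Value Length (6 bytes, including the trailing \x00)
  world\x00            value
  \x00                 end of document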
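
The byte layout above can be reproduced by hand with the struct module. This sketch covers only string values (BSON type 0x02); the function names are made up for illustration and it is not a replacement for a real BSON library.

```python
import struct


def encode_string_element(name, value):
    """Encode one string field: type byte, cstring name, int32 length, value."""
    name_bytes = name.encode("utf-8") + b"\x00"
    value_bytes = value.encode("utf-8") + b"\x00"
    # The length prefix counts the value bytes plus the trailing NUL.
    return b"\x02" + name_bytes + struct.pack("<i", len(value_bytes)) + value_bytes


def encode_document(fields):
    """Encode a dict of string fields as a BSON document."""
    body = b"".join(encode_string_element(k, v) for k, v in fields.items())
    # Total length covers the 4-byte length itself, the body and the final NUL.
    total = 4 + len(body) + 1
    return struct.pack("<i", total) + body + b"\x00"


doc = encode_document({"hello": "world"})
print(doc)
```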
  
Index Extents

  B-tree over the index keys:

                  [ 4 | 9 ]
                /     |     \
         [ 1 3 ]  [ 5 6 8 ]  [ A B ]

  Extent               Index Record         Index Record
    length               length               length
    xNext                rNext                rNext
    xPrev                rPrev                rPrev
    firstRecord          Bucket               Bucket
    lastRecord             parent               parent
                           numKeys              numKeys
                           K  ->  { Document }
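
A small sketch of a key lookup over the tree drawn above, with each bucket holding sorted keys and optional child buckets. The Bucket class and search function are hypothetical, and reading A and B as hexadecimal 10 and 11 is an assumption. Real buckets also store a parent pointer and the location of the document for each key.

```python
from bisect import bisect_left


class Bucket:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys in this bucket
        self.children = children or []    # len(keys) + 1 children, or none for a leaf


def search(bucket, key):
    """Return True if `key` is stored somewhere under `bucket`."""
    while bucket is not None:
        i = bisect_left(bucket.keys, key)
        if i < len(bucket.keys) and bucket.keys[i] == key:
            return True
        # Descend into the child that covers the gap where the key would be.
        bucket = bucket.children[i] if bucket.children else None
    return False


ROOT = Bucket(
    [0x4, 0x9],
    [Bucket([0x1, 0x3]), Bucket([0x5, 0x6, 0x8]), Bucket([0xA, 0xB])],
)

print(search(ROOT, 0x6), search(ROOT, 0x7))
```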
  
Journaling
•  Write ahead logging
•  Operations written to journal before memory
  mapped regions
  •  Private view
  •  Shared view
•  Once journal written, data safe unless
   hardware problem
•  By default, journal flushed every 100ms, after
   100MB of writes, or on write concern of j=true
  •  User configurable with --journalCommitInterval
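
The three flush triggers listed above can be written as a single predicate. The class name, method name and parameters below are hypothetical, and treating 100MB as 100 * 1024 * 1024 bytes is an assumption; this sketches the rule as stated on the slide, not the server's code.

```python
class GroupCommitPolicy:
    def __init__(self, interval_ms=100, max_pending_bytes=100 * 1024 * 1024):
        self.interval_ms = interval_ms
        self.max_pending_bytes = max_pending_bytes

    def should_flush(self, ms_since_last_flush, pending_bytes, j_requested=False):
        """True when any of the three triggers from the slide fires."""
        return (
            j_requested
            or ms_since_last_flush >= self.interval_ms
            or pending_bytes >= self.max_pending_bytes
        )


policy = GroupCommitPolicy()
print(policy.should_flush(ms_since_last_flush=20, pending_bytes=4096))
print(policy.should_flush(ms_since_last_flush=20, pending_bytes=4096, j_requested=True))
```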
Journal Format
  •  Section contains single group commit
  •  Applied all-or-nothing

  JHeader
  JSectHeader [LSN 3]
      DurOp
      DurOp
      DurOp
  JSectFooter
  JSectHeader [LSN 7]
      DurOp
      DurOp
      DurOp
  JSectFooter
  …

  DurOp kinds:
    Op_DbContext       Set database context for subsequent operations
    Write Operation    length, offset, fileNo, data[length]
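
One way to picture all-or-nothing replay of the sections above: apply every write operation of a section only if the section is complete, and stop at the first incomplete one. The tuple types and the has_footer flag are hypothetical simplifications for illustration; they are not the journal's binary format.

```python
from collections import namedtuple

# Hypothetical in-memory stand-ins for journal entries.
WriteOp = namedtuple("WriteOp", ["file_no", "offset", "data"])
Section = namedtuple("Section", ["lsn", "ops", "has_footer"])


def replay(sections, files):
    """Apply complete sections in order; stop at the first incomplete one."""
    applied = []
    for section in sections:
        if not section.has_footer:
            break  # torn section: none of its operations are applied
        for op in section.ops:
            buf = files[op.file_no]
            buf[op.offset:op.offset + len(op.data)] = op.data
        applied.append(section.lsn)
    return applied


files = {0: bytearray(16)}
journal = [
    Section(3, [WriteOp(0, 0, b"abcd"), WriteOp(0, 8, b"wxyz")], True),
    Section(7, [WriteOp(0, 4, b"!!!!")], False),  # incomplete, so skipped
]
applied = replay(journal, files)
print(applied, bytes(files[0]))
```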
  
Journal Performance
•  On 99.9% read systems, no impact
•  Write performance degraded 5-30% when
   journal on same drive
•  Separate drive as low as 3%
Journal Admin
•  Journal stored in /dbpath/journal folder
•  If preallocation is faster, three 1GB files may be preallocated
•  Can symlink to a different spindle
•  --journalCommitInterval* (2ms - 300ms)
•  When to journal
   •  Single node: required for data integrity
   •  Replica set: at least 1 node
   •  All nodes: removes possible need to resync
Fragmentation
•  Files may become fragmented over time if
   documents change size
•  Free lists also contribute to fragmentation
  •  2.0 reduced scanning to reasonable amounts
  •  2.2 will change allocation strategy
  •  Need to re-write free list to do online compaction
Compaction
•  1.8 and previous: repairDatabase
•  2.0+ : compact command
  •  Currently resets paddingFactor, but can be
     changed.
  •  Index (re)generation is now concurrent, so
     compaction can be N times faster
•  Generally causes some extra allocation
  •  Does not delete or truncate files
Planned Changes
•  Split data and indexes into different files
•  Indexes could be symlinked to a different
   drive (SSD)
•  Improved allocation strategy
Download MongoDB
http://www.mongodb.org/downloads

Ben Becker
ben.becker@10gen.com
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

Database Directory Layout and Data Structures

  • 2. Directory Layout
      •  Separate files per database
      •  Aggressive preallocation
      •  Files contain one or more extents

      -rw-------  1  ben  ben   64M  May  1  19:14  test.0
      -rw-------  1  ben  ben  128M  May  1  19:14  test.1
      -rw-------  1  ben  ben  256M  May  1  18:25  test.2
      -rw-------  1  ben  ben  512M  May  1  19:14  test.3
      -rw-------  1  ben  ben  1.0G  May  1  19:14  test.4
      -rw-------  1  ben  ben  2.0G  May  1  18:58  test.5
      -rw-------  1  ben  ben   16M  May  1  19:14  test.ns
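The listing above shows each data file doubling in size, from 64 MB for test.0 to 2 GB for test.5. The following is a minimal Python sketch of that progression. The helper name `data_file_size` is hypothetical, and the assumption that files after test.5 stay capped at 2 GB is not shown in the listing.

```python
MB = 1024 * 1024

def data_file_size(file_no, first=64 * MB, cap=2048 * MB):
    """Size in bytes of <dbname>.<file_no>: doubles from `first`, capped at `cap`.

    The 2 GB cap for later files is an assumption of this sketch.
    """
    return min(first << file_no, cap)

for n in range(7):
    print(f"test.{n}: {data_file_size(n) // MB} MB")
```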
  • 3. Memory Mapping
      Process virtual memory, from the highest address to the lowest:

      0x7fffffffffff
        STACK
        …
        LIBS
        …
        test.ns   (Disk)
        test.0    (Disk)
        test.1    (Disk)
        …
        HEAP
        MONGOD
        NULL
      0x0

      Document: { … }
  • 4. Data Structures
      •  DiskLoc
          •  Stores file number and offset of data on disk
          •  Record *r = mmap base + DiskLoc.offset
          •  Max offset is 2^31 (2GB)
      •  NamespaceDetails
          •  Stores collection metadata
      •  Extent
          •  Stores contiguous blocks within a namespace
          •  Max extent size is 2GB
      •  Record
          •  Holds a BSON document or B-tree bucket
          •  DeletedRecord overwrites a Record
          •  Includes padding
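The DiskLoc bullet above describes resolving a record as the mapped base address of a file plus an offset. Here is a rough Python sketch of that lookup, using a `bytearray` per file as a stand-in for a memory-mapped view. The names `DiskLoc`, `resolve`, and `mapped_files` are illustrative and are not the server's actual types.

```python
from collections import namedtuple

# Illustrative stand-in: a file number plus a byte offset into that file.
DiskLoc = namedtuple("DiskLoc", ["file_no", "offset"])

MAX_OFFSET = 2 ** 31  # from the slide: max offset is 2^31 (2GB)

def resolve(mapped_files, loc, length):
    """Return `length` bytes starting at `loc`.

    `mapped_files` maps file number -> buffer and plays the role of the
    per-file mmap base address.
    """
    if not 0 <= loc.offset < MAX_OFFSET:
        raise ValueError("offset out of range")
    view = mapped_files[loc.file_no]
    return bytes(view[loc.offset:loc.offset + length])

files = {0: bytearray(b"\x00" * 16 + b"record-bytes")}
print(resolve(files, DiskLoc(0, 16), 12))
```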
  • 5. Namespace Details
      •  Holds metadata about a collection or index
      •  Stored in 1KB buckets in <dbname>.ns file
      •  .ns file fixed size of 16MB
      •  Maintains document count
      •  Contains heads of linked lists

      NamespaceDetails fields: firstExtent, lastExtent, _indexes[], stats, freeList[]
  • 6. Extent Structure
      Two extents shown side by side, each with the same fields:
      •  length
      •  xNext
      •  xPrev
      •  firstRecord
      •  lastRecord
  • 7. Extents
      > db.foo.validate( { full : true } ).extents.forEach(
            function(z){ print( z.loc + "\t\t" + z.size ); } )
      0:3000       20480
      0:12000      81920
      0:26000      327680
      0:76000      1310720
      0:1da000     5242880
      0:76a000     6291456
      0:d6a000     7553024
      0:16de000    9064448
      0:1f83000    10878976
      0:29e3000    13058048
      1:2000       15671296
      1:ef4000     18808832
      1:29e4000    22573056
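The extent list above can be read as `file:offset` pairs followed by a size in bytes. This short Python sketch parses the first few locations and computes how fast the extents grow. Treating the offset as hexadecimal and the file number as decimal is an assumption based on values such as `1da000`, and `parse_loc` is a hypothetical helper.

```python
def parse_loc(loc):
    """Split 'file:offset' into (file number, byte offset); offset read as hex."""
    file_no, offset = loc.split(":")
    return int(file_no), int(offset, 16)

# First six rows of the validate() output above.
extents = [
    ("0:3000", 20480),
    ("0:12000", 81920),
    ("0:26000", 327680),
    ("0:76000", 1310720),
    ("0:1da000", 5242880),
    ("0:76a000", 6291456),
]

sizes = [size for _, size in extents]
ratios = [round(b / a, 2) for a, b in zip(sizes, sizes[1:])]

for loc, size in extents:
    print(parse_loc(loc), size)
print(ratios)
```

For this sample, each extent is four times larger than the previous one until the 5 MB extent, after which growth slows to about 1.2 times.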
  • 8. Index Extents
      > db.system.namespaces.find()
      { "name" : "test.foo" }
      { "name" : "test.system.indexes" }
      { "name" : "test.foo.$_id_" }

      > db["foo.$_id_"].validate( { full : true } ).extents.forEach(
            function(z){ print( z.loc + "\t\t" + z.size ); } )
      0:9000       36864
      0:1b6000     147456
      0:6da000     589824
      0:149e000    2359296
      1:20e4000    9437184
  • 9. Extents and Records
      •  Extent: length, xNext, xPrev, firstRecord, lastRecord
      •  Data Record: length, rNext, rPrev, Document { _id: "foo", ... }
  • 10. Extents and Records
      •  Extent: length, xNext, xPrev, firstRecord, lastRecord
      •  Data Record: length, rNext, rPrev, Document { _id: "foo", ... }
  • 11. Extents and Records
      •  Extent: length, xNext, xPrev, firstRecord, lastRecord
      •  Data Record: length, rNext, rPrev, Document { _id: "foo", ... }
      •  Data Record: length, rNext, rPrev, Document { _id: "foo", ... }
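The diagrams above show an extent pointing at its first and last record, with records chained through rNext and rPrev. Below is a small Python sketch of a forward scan over such a chain. The dictionary layout, the integer offsets, and the `NULL` marker are all hypothetical stand-ins for the on-disk structures.

```python
NULL = -1  # hypothetical end-of-list marker for this sketch

extent = {"firstRecord": 100, "lastRecord": 300}

# Records keyed by their (made-up) offset within the extent.
records = {
    100: {"rPrev": NULL, "rNext": 200, "doc": {"_id": "a"}},
    200: {"rPrev": 100, "rNext": 300, "doc": {"_id": "b"}},
    300: {"rPrev": 200, "rNext": NULL, "doc": {"_id": "c"}},
}

def scan(extent, records):
    """Yield documents in order by following rNext from firstRecord."""
    offset = extent["firstRecord"]
    while offset != NULL:
        rec = records[offset]
        yield rec["doc"]
        offset = rec["rNext"]

print([doc["_id"] for doc in scan(extent, records)])
```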
  • 12. BSON Format
      { hello: "world" }

      \x16\x00\x00\x00    Doc Length
      \x02                Value Type
      hello\x00
      \x06\x00\x00\x00    Value Length
      world\x00
      \x00
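The byte layout above can be reproduced by hand. The Python sketch below encodes a one-field document with a string value using only `struct`. The function name `encode_string_doc` is illustrative, and the sketch handles only this single case, not BSON in general.

```python
import struct

def encode_string_doc(name, value):
    """Encode {name: value} as BSON, where value is a string."""
    val = value.encode("utf-8") + b"\x00"        # string bytes plus terminator
    element = (
        b"\x02"                                  # value type: string
        + name.encode("utf-8") + b"\x00"         # field name, NUL-terminated
        + struct.pack("<i", len(val))            # value length, little-endian int32
        + val
    )
    body = element + b"\x00"                     # trailing NUL ends the document
    return struct.pack("<i", 4 + len(body)) + body  # doc length counts itself

encoded = encode_string_doc("hello", "world")
print(encoded)
```

The document length works out to 22 bytes (0x16): 4 for the length field, 1 for the type, 6 for the field name, 4 for the value length, 6 for the value, and 1 for the terminator.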
  • 13. Index Extents
      •  Extent: length, xNext, xPrev, firstRecord, lastRecord
      •  Index Record: length, rNext, rPrev, Bucket
      •  Index Record: length, rNext, rPrev, Bucket
      •  Bucket: parent, numKeys, K, { Document }
  • 14. Index Extents
      B-tree drawn over the index records:

               4   9
      1 3    5 6 8    A B

      •  Extent: length, xNext, xPrev, firstRecord, lastRecord
      •  Index Record: length, rNext, rPrev, Bucket
      •  Index Record: length, rNext, rPrev, Bucket
      •  Bucket: parent, numKeys, K, { Document }
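The tree drawn above has a root bucket with keys 4 and 9 and three child buckets. Here is a minimal Python sketch of a key lookup over that shape. The `Bucket` class is a simplified stand-in and not the on-disk bucket format, and the keys A and B are read as hexadecimal 10 and 11, which is an assumption.

```python
from bisect import bisect_left

class Bucket:
    """Simplified B-tree bucket: sorted keys plus optional child buckets."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []

def contains(bucket, key):
    """Walk down from `bucket`, choosing the child whose range covers `key`."""
    while bucket is not None:
        i = bisect_left(bucket.keys, key)
        if i < len(bucket.keys) and bucket.keys[i] == key:
            return True
        # Leaf buckets have no children, so the search ends there.
        bucket = bucket.children[i] if bucket.children else None
    return False

root = Bucket([4, 9], [Bucket([1, 3]), Bucket([5, 6, 8]), Bucket([10, 11])])
print(contains(root, 6), contains(root, 7))
```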
  • 15. Journaling
      •  Write-ahead logging
      •  Operations written to journal before memory-mapped regions
      •  Private view
      •  Shared view
      •  Once journal written, data safe unless hardware problem
      •  By default, journal flushed every 100 ms, after 100 MB of writes, or on write concern of j=true
      •  User configurable with --journalCommitInterval
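The default flush rule above has three triggers: elapsed time, volume of pending writes, or a write that asks for j=true. It can be sketched as a single predicate in Python. The function and parameter names are hypothetical, and the sketch ignores how the server actually schedules group commits.

```python
MB = 1024 * 1024

def should_commit(ms_since_last, pending_bytes, j=False,
                  interval_ms=100, max_pending=100 * MB):
    """True when any of the three default flush triggers from the slide fires."""
    return j or ms_since_last >= interval_ms or pending_bytes >= max_pending

print(should_commit(20, 1 * MB))            # nothing due yet
print(should_commit(20, 1 * MB, j=True))    # a j=true write forces a flush
print(should_commit(120, 1 * MB))           # commit interval elapsed
```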
  • 16. Journal Format
      •  Section contains single group commit
      •  Applied all-or-nothing

      JHeader
      JSectHeader [LSN 3]
        DurOp
        DurOp
        DurOp
      JSectFooter
      JSectHeader [LSN 7]
        DurOp
        DurOp
        DurOp
      JSectFooter
      …

      •  Op_DbContext: set database context for subsequent operations
      •  Write Operation: length, offset, fileNo, data[length]
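The layout above groups write operations into sections that are applied all-or-nothing. The Python sketch below replays such sections onto in-memory buffers. The dictionary layout is hypothetical, and treating the presence of a footer as the sign that a section was completely written is an assumption of this sketch, not a description of the real on-disk check.

```python
def replay(sections, files):
    """Apply the write ops of each complete section; stop at the first incomplete one.

    Returns the LSNs of the sections that were applied.
    """
    applied = []
    for sect in sections:
        if not sect.get("footer"):
            break  # section was not fully written, so none of its ops are applied
        for op in sect["ops"]:
            buf = files[op["fileNo"]]
            buf[op["offset"]:op["offset"] + len(op["data"])] = op["data"]
        applied.append(sect["lsn"])
    return applied

files = {0: bytearray(8)}
sections = [
    {"lsn": 3, "ops": [{"fileNo": 0, "offset": 0, "data": b"ab"}], "footer": True},
    {"lsn": 7, "ops": [{"fileNo": 0, "offset": 2, "data": b"cd"}], "footer": False},
]
print(replay(sections, files), bytes(files[0]))
```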
  • 17. Journal Performance
      •  On 99.9% read systems, no impact
      •  Write performance degraded 5-30% when journal on same drive
      •  Separate drive as low as 3%
  • 18. Journal Admin
      •  Journal stored in /dbpath/journal folder
      •  If faster, three 1 GB files may be preallocated
      •  Can symlink to a different spindle
      •  --journalCommitInterval (2ms - 300ms)
      •  When to journal
          •  Single node: required for data integrity
          •  Replica set: at least 1 node
          •  All nodes: removes possible need to resync
  • 19. Fragmentation
      •  Files may become fragmented over time if documents change size
      •  Free lists also contribute to fragmentation
      •  2.0 reduced scanning to reasonable amounts
      •  2.2 will change allocation strategy
      •  Need to re-write free list to do online compaction
  • 20. Compaction
      •  1.8 and previous: repairDatabase
      •  2.0+: compact command
      •  Currently resets paddingFactor, but can be changed
      •  Index (re)generation is now concurrent, so compaction can be N times faster
      •  Generally causes some extra allocation
      •  Does not delete or truncate files
  • 21. Planned Changes
      •  Split data and indexes into different files
      •  Indexes could be symlinked to a different drive (SSD)
      •  Improved allocation strategy
  • 22. Download MongoDB
      http://www.mongodb.org/downloads
      Ben Becker
      ben.becker@10gen.com