Dremel: Interactive Analysis of Web-Scale Datasets
• Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer,
  Shiva Shivakumar, Matt Tolton, Theo Vassilakis
• Google, Inc.
• VLDB, 2010

• Presentation by ensky (enskylin@gmail.com)

Outline
• Introduction
• Nested columnar storage
• Query processing
• Experiments
• Conclusion

Introduction
• Dremel is a query system
  for the analysis of read-only nested data

• Use cases
    ◦ Interactive tools
    ◦ Trends detection
    ◦ Web dashboards
    ◦ Spam
    ◦ Network optimization

Features
• Multi-level (nested) structure

      DocId: 10
      Links
        Forward: 20
      Name
        Language
          Code: 'en-us'
          Country: 'us'
        Url: 'http://A'
      Name
        Url: 'http://B'

• SQL-like query language

      SELECT A, COUNT(B) FROM T
      GROUP BY A
      T = {/gfs/1, /gfs/2, …, /gfs/100000}

• Fast: queries over trillion-row tables finish in seconds
• Large scale: thousands of CPUs and
  petabytes of data
• Widely used at Google: in production
  since 2006

Contribution
• This paper presents two major
  techniques in Dremel:

• Nested columnar storage
    ◦ Store / split records into columns
    ◦ Assemble columns back into records
• Query
    ◦ Language
    ◦ Execution

Outline
• Introduction
• Nested columnar storage
• Query processing
• Experiments
• Conclusion

Why nested?
• Flexible
    ◦ Data in web and scientific computing is often
      non-relational
• Reduce cost
    ◦ Normalizing and recombining nested data
      is often prohibitively expensive at TB or PB scale

Why column?
[Figure: record-oriented vs. column-oriented layout of records r1 and r2 over the field tree A (children B and E; B has children C and D)]

• Record-oriented: all fields of r1 are stored together, followed by r2
• Column-oriented: the values of each field are stored together across r1 and r2
• Read less, cheaper decompression

Challenge: preserve structure, reconstruct from a subset of fields

Nested data model
Schema (protocol buffers, http://code.google.com/apis/protocolbuffers):

  message Document {
    required int64 DocId;           [1,1]
    optional group Links {
      repeated int64 Backward;      [0,*]
      repeated int64 Forward;
    }
    repeated group Name {
      repeated group Language {
        required string Code;
        optional string Country;    [0,1]
      }
      optional string Url;
    }
  }

Record r1:

  DocId: 10
  Links
    Forward: 20
    Forward: 40
    Forward: 60
  Name
    Language
      Code: 'en-us'
      Country: 'us'
    Language
      Code: 'en'
    Url: 'http://A'
  Name
    Url: 'http://B'
  Name
    Language
      Code: 'en-gb'
      Country: 'gb'

Record r2:

  DocId: 20
  Links
    Backward: 10
    Backward: 30
    Forward: 80
  Name
    Url: 'http://C'

Column-striped representation
(r = repetition level, d = definition level)

DocId            Name.Url             Links.Forward     Links.Backward
value  r  d      value     r  d       value  r  d       value  r  d
10     0  0      http://A  0  2       20     0  2       NULL   0  1
20     0  0      http://B  1  2       40     1  2       10     0  2
                 NULL      1  1       60     1  2       30     1  2
                 http://C  0  2       80     0  2

Name.Language.Code       Name.Language.Country
value  r  d              value  r  d
en-us  0  2              us     0  3
en     2  2              NULL   2  2
NULL   1  1              NULL   1  1
en-gb  1  2              gb     1  3
NULL   0  1              NULL   0  1

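The largest r and d that can appear in each column follow directly from the schema labels. The sketch below is an illustration only: the `SCHEMA` dictionary and the `max_levels` helper are names invented here, and the schema is the Document message from the slides written as dotted paths.

```python
# Field labels of the Document schema from the slides, keyed by dotted path.
# The dictionary layout is an assumption made for this sketch.
SCHEMA = {
    "DocId": "required",
    "Links": "optional",
    "Links.Backward": "repeated",
    "Links.Forward": "repeated",
    "Name": "repeated",
    "Name.Language": "repeated",
    "Name.Language.Code": "required",
    "Name.Language.Country": "optional",
    "Name.Url": "optional",
}


def max_levels(path, schema=SCHEMA):
    """Return (max repetition level, max definition level) for a dotted path."""
    parts = path.split(".")
    labels = [schema[".".join(parts[:i + 1])] for i in range(len(parts))]
    # Only repeated fields can raise the repetition level.
    max_r = sum(label == "repeated" for label in labels)
    # Optional and repeated fields can be absent, so each adds a definition level.
    max_d = sum(label != "required" for label in labels)
    return max_r, max_d


for column in ("DocId", "Links.Backward", "Name.Language.Code",
               "Name.Language.Country", "Name.Url"):
    print(column, max_levels(column))
```

This is why Name.Language.Country reaches d = 3 while Name.Language.Code stops at d = 2: Code is required, so it adds no definition level of its own.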
Column-oriented problem
• When you query a sub-tree, a column value by itself tells you
  nothing about the whole structure (not even its own position
  in the record): "Where am I?"
• To solve this problem, the paper introduces
  repetition and definition levels

Repetition and definition levels
r: at what repeated field in the field's path
   the value has repeated

    Name.Language.Code
     value    r   d
     en-us    0   2     first value in r1, repeat = 0
     en       2   2     Language repeats, level = 2
     NULL     1   1     Name repeats, level = 1
     en-gb    1   2     Name repeats, level = 1
     NULL     0   1     nothing repeats (new record r2), level = 0

Repetition and definition levels
d: how many fields in the path that could be
   undefined (optional or repeated) are actually present

    Name.Language.Country
     value    r   d
     us       0   3
     NULL     2   2
     NULL     1   1
     gb       1   3
     NULL     0   1

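The two definitions can be checked mechanically. The following is a minimal sketch, not Dremel's implementation: records r1 and r2 are written as plain Python dicts and lists, and `SCHEMA` and `stripe` are names made up for this example. It walks one path through a record and emits (value, r, d) triples.

```python
# Labels of the Document schema from the slides (layout is an assumption).
SCHEMA = {
    "DocId": "required", "Links": "optional",
    "Links.Backward": "repeated", "Links.Forward": "repeated",
    "Name": "repeated", "Name.Language": "repeated",
    "Name.Language.Code": "required", "Name.Language.Country": "optional",
    "Name.Url": "optional",
}

r1 = {
    "DocId": 10,
    "Links": {"Forward": [20, 40, 60]},
    "Name": [
        {"Language": [{"Code": "en-us", "Country": "us"}, {"Code": "en"}],
         "Url": "http://A"},
        {"Url": "http://B"},
        {"Language": [{"Code": "en-gb", "Country": "gb"}]},
    ],
}
r2 = {
    "DocId": 20,
    "Links": {"Backward": [10, 30], "Forward": [80]},
    "Name": [{"Url": "http://C"}],
}


def stripe(record, path, schema=SCHEMA):
    """Return the (value, r, d) triples that one record adds to one column."""
    parts = path.split(".")
    labels = [schema[".".join(parts[:i + 1])] for i in range(len(parts))]
    out = []

    def walk(node, depth, r, d, rep_depth):
        label = labels[depth]
        value = node.get(parts[depth])
        if label == "repeated":
            rep_depth += 1              # this field owns repetition level rep_depth
            items = value or []
        else:
            items = [] if value is None else [value]
        if not items:
            out.append((None, r, d))    # path ends here: NULL with the levels so far
            return
        d_here = d + (label != "required")
        for i, item in enumerate(items):
            # The first item inherits r; later items repeat at this field's level.
            item_r = r if i == 0 else rep_depth
            if depth == len(parts) - 1:
                out.append((item, item_r, d_here))
            else:
                walk(item, depth + 1, item_r, d_here, rep_depth)

    walk(record, 0, 0, 0, 0)
    return out


for row in stripe(r1, "Name.Language.Code") + stripe(r2, "Name.Language.Code"):
    print(row)
```

Running it for Name.Language.Code and Name.Language.Country gives the same rows as the tables in the slides, with `None` standing in for NULL.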
Column-oriented storage is fine,
but what if I still need record-oriented
processing (e.g., MapReduce)?

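Record assembly FSM
Transitions labeled with repetition levels

  DocId                  --0-->      Links.Backward
  Links.Backward         --1-->      Links.Backward
  Links.Backward         --0-->      Links.Forward
  Links.Forward          --1-->      Links.Forward
  Links.Forward          --0-->      Name.Language.Code
  Name.Language.Code     --0,1,2-->  Name.Language.Country
  Name.Language.Country  --2-->      Name.Language.Code
  Name.Language.Country  --0,1-->    Name.Url
  Name.Url               --1-->      Name.Language.Code
  Name.Url               --0-->      end of record

For record-oriented data processing (e.g., MapReduce)
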
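One way to derive such transitions from the schema is sketched below. It assumes this reading of the rule: after reading a value from field f, look at the next repetition level l in f's column; if l does not exceed the level shared with the next field, move on to that field, otherwise jump back to the first field inside the ancestor of f that repeats at level l. The names `build_fsm`, `FIELDS` and `REPEATED` are invented for this sketch.

```python
# Leaf fields in schema order, and every repeated field on their paths
# (both taken from the Document schema in the slides).
FIELDS = ["DocId", "Links.Backward", "Links.Forward",
          "Name.Language.Code", "Name.Language.Country", "Name.Url"]
REPEATED = {"Links.Backward", "Links.Forward", "Name", "Name.Language"}


def prefixes(field):
    parts = field.split(".")
    return [".".join(parts[:i + 1]) for i in range(len(parts))]


def common_level(a, b, repeated):
    """Number of repeated fields on the common prefix of a and b."""
    level = 0
    for pa, pb in zip(prefixes(a), prefixes(b)):
        if pa != pb:
            break
        level += pa in repeated
    return level


def ancestor_at(field, level, repeated):
    """Shortest prefix of `field` that contains `level` repeated fields."""
    seen = 0
    for p in prefixes(field):
        seen += p in repeated
        if seen == level:
            return p
    raise ValueError(f"{field} has no repetition level {level}")


def build_fsm(fields, repeated):
    """Map (field, repetition level) to the next field; None means end of record."""
    fsm = {}
    for i, field in enumerate(fields):
        max_level = sum(p in repeated for p in prefixes(field))
        barrier = fields[i + 1] if i + 1 < len(fields) else None
        barrier_level = common_level(field, barrier, repeated) if barrier else 0
        for level in range(max_level + 1):
            if level <= barrier_level:
                fsm[field, level] = barrier          # keep moving forward
            else:
                scope = ancestor_at(field, level, repeated)
                # Jump back to the first field inside the repeating ancestor.
                fsm[field, level] = next(
                    f for f in fields if f == scope or f.startswith(scope + "."))
    return fsm


for (field, level), target in sorted(build_fsm(FIELDS, REPEATED).items()):
    print(f"{field} --{level}--> {target or 'end of record'}")
```

Passing only a subset of the fields, for example `["DocId", "Name.Language.Country"]`, yields a smaller automaton that skips the columns a query does not need.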
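Reading two fields
FSM for DocId and Name.Language.Country:

  DocId                  --0-->    Name.Language.Country
  Name.Language.Country  --1,2-->  Name.Language.Country
  Name.Language.Country  --0-->    end of record

Assembled records:

  DocId: 10      s1
  Name
    Language
      Country: 'us'
    Language
  Name
  Name
    Language
      Country: 'gb'

  DocId: 20      s2
  Name

Structure of parent fields is preserved.
Useful for queries like /Name[3]/Language[1]/Country
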
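To see why the parent structure survives, it helps to rebuild the nesting from a single column. The sketch below is a simplification and not the multi-column FSM assembly: it takes only the Name.Language.Country column, so the result has no DocId. The function name `assemble` and the `(name, label)` path encoding are assumptions of this example.

```python
# Path of the column, outermost field first, with the labels from the schema.
COUNTRY_PATH = [("Name", "repeated"), ("Language", "repeated"), ("Country", "optional")]

# The Name.Language.Country column from the slides, None standing in for NULL.
COUNTRY_COLUMN = [("us", 0, 3), (None, 2, 2), (None, 1, 1), ("gb", 1, 3), (None, 0, 1)]


def assemble(column, path):
    """Rebuild the nested skeleton of each record from one striped column."""
    rep_levels, def_levels, rep, dfn = [], [], 0, 0
    for _, label in path:
        rep += label == "repeated"
        dfn += label != "required"
        rep_levels.append(rep)
        def_levels.append(dfn)

    records = []
    parents = [None] * len(path)        # parents[i] is the dict that holds field i
    for value, r, d in column:
        if r == 0:                      # r = 0 always starts a new record
            parents[0] = {}
            records.append(parents[0])
            start = 0
        else:                           # restart at the field that repeats at level r
            start = next(i for i, (_, label) in enumerate(path)
                         if label == "repeated" and rep_levels[i] == r)
        for i in range(start, len(path)):
            if def_levels[i] > d:       # the path is undefined from here on
                break
            name, label = path[i]
            is_leaf = i == len(path) - 1
            item = value if is_leaf else {}
            if label == "repeated":
                parents[i].setdefault(name, []).append(item)
            else:
                parents[i][name] = item
            if not is_leaf:
                parents[i + 1] = item
    return records


for record in assemble(COUNTRY_COLUMN, COUNTRY_PATH):
    print(record)
```

The first record keeps three Name entries and the empty second Language, which is what makes a positional query such as /Name[3]/Language[1]/Country answerable from this column alone.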
Outline
• Introduction
• Nested columnar storage
• Query processing
• Experiments
• Conclusion

Query language
• Based on SQL
• Supports
    ◦   Projection
    ◦   Selection
    ◦   Nested subqueries
    ◦   Inter- and intra-record aggregation
    ◦   Top-k
    ◦   Joins
    ◦   User-defined functions

Example usage

SELECT DocId AS Id,
  COUNT(Name.Language.Code) WITHIN Name AS Cnt,
  Name.Url + ',' + Name.Language.Code AS Str
FROM t
WHERE REGEXP(Name.Url, '^http') AND DocId < 20;

Input record:

  DocId: 10
  Links
    Forward: 20
    Forward: 40
    Forward: 60
  Name
    Language
      Code: 'en-us'
      Country: 'us'
    Language
      Code: 'en'
    Url: 'http://A'
  Name
    Url: 'http://B'
  Name
    Language
      Code: 'en-gb'
      Country: 'gb'

Output table t1:

  Id: 10
  Name
    Cnt: 2
    Language
      Str: 'http://A,en-us'
      Str: 'http://A,en'
  Name
    Cnt: 0

Output schema:

  message QueryResult {
    required int64 Id;
    repeated group Name {
      optional uint64 Cnt;
      repeated group Language {
        optional string Str;
      }
    }
  }

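The query's effect on the input record can be imitated in a few lines of Python. This is only a sketch of the semantics as they read from the output table, assuming the WHERE clause prunes every Name whose Url is missing or does not match; `run_query` is a name made up for this example and has nothing to do with Dremel's actual executor.

```python
import re

# The input record from the slide, written as plain dicts and lists.
r1 = {
    "DocId": 10,
    "Links": {"Forward": [20, 40, 60]},
    "Name": [
        {"Language": [{"Code": "en-us", "Country": "us"}, {"Code": "en"}],
         "Url": "http://A"},
        {"Url": "http://B"},
        {"Language": [{"Code": "en-gb", "Country": "gb"}]},
    ],
}


def run_query(doc):
    """Imitate the example SELECT for one record; None means it is filtered out."""
    if not doc["DocId"] < 20:
        return None
    names = []
    for name in doc.get("Name", []):
        url = name.get("Url")
        if url is None or not re.search(r"^http", url):
            continue                    # assumed: WHERE prunes this Name branch
        languages = name.get("Language", [])
        # COUNT(Name.Language.Code) WITHIN Name: one count per Name.
        out = {"Cnt": sum("Code" in lang for lang in languages)}
        if languages:
            out["Language"] = [{"Str": url + "," + lang["Code"]} for lang in languages]
        names.append(out)
    return {"Id": doc["DocId"], "Name": names}


print(run_query(r1))
```

The third Name has no Url, so it disappears from the result, while the second Name survives with Cnt: 0 and no Language group.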
Query execution
Serving tree:

  client
    root server
      intermediate servers ...
        leaf servers (with local storage) ...
          storage layer (e.g., GFS)

• Parallelizes scheduling and aggregation
• Fault tolerance
• A query dispatcher schedules queries across the servers

Example: count()

Level 0 (root server):
      SELECT A, COUNT(B) FROM T GROUP BY A
      T = {/gfs/1, /gfs/2, …, /gfs/100000}
    is rewritten as
      SELECT A, SUM(c)
      FROM (R11 UNION ALL ... R110)
      GROUP BY A

Level 1:
      R11 = SELECT A, COUNT(B) AS c
            FROM T11 GROUP BY A
            T11 = {/gfs/1, …, /gfs/10000}
      R12 = SELECT A, COUNT(B) AS c
            FROM T12 GROUP BY A
            T12 = {/gfs/10001, …, /gfs/20000}
      ...

Level 3 (data access ops):
      SELECT A, COUNT(B) AS c
      FROM T31 GROUP BY A
      T31 = {/gfs/1}
      ...

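The rewrite works because COUNT can be computed as a sum of partial counts. Below is a toy sketch of that idea, assuming each tablet is just a list of (A, B) rows held in memory; `leaf_count`, `merge` and `serve` are hypothetical names, and the real system distributes these steps over servers.

```python
from collections import Counter


def leaf_count(rows):
    """Leaf query: SELECT A, COUNT(B) AS c FROM tablet GROUP BY A."""
    counts = Counter()
    for a, b in rows:
        counts[a] += b is not None      # COUNT(B) ignores NULLs but keeps the group
    return counts


def merge(partials):
    """Parent query: SELECT A, SUM(c) FROM (children UNION ALL) GROUP BY A."""
    total = Counter()
    for partial in partials:
        total.update(partial)           # Counter.update adds counts key by key
    return total


def serve(tablets, fanout):
    """Aggregate bottom-up through a tree in which each parent has `fanout` children."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    level = [leaf_count(tablet) for tablet in tablets]
    while len(level) > 1:
        level = [merge(level[i:i + fanout]) for i in range(0, len(level), fanout)]
    return level[0] if level else Counter()


tablets = [
    [("x", 1), ("y", None)],
    [("x", 2)],
    [("x", 3), ("y", 5)],
    [("z", None)],
]
print(dict(serve(tablets, fanout=2)))
```

The same shape works for any aggregate that can be combined from partial results, such as SUM, MIN or MAX.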
Outline
• Introduction
• Nested columnar storage
• Query processing
• Experiments
• Conclusion

Experiment Data
• 1 PB of real data
  (uncompressed, non-replicated)
• 100K-800K tablets per table
• Experiments run during business hours

Table  Number of     Size (unreplicated,  Number     Data    Replication
name   records       compressed)          of fields  center  factor
T1     85 billion     87 TB                 270      A       3×
T2     24 billion     13 TB                 530      A       3×
T3     4 billion      70 TB                1200      A       3×
T4     1+ trillion   105 TB                  50      B       3×
T5     1+ trillion    20 TB                  30      B       2×

Column vs. record
[Figure: time (sec) vs. number of fields; "cold" time on local disk, averaged over 30 runs]

From columns:
  (a) read + decompress
  (b) assemble records
  (c) parse as C++ objects
From records:
  (d) read + decompress
  (e) parse as C++ objects

• 10x speedup using columnar storage
• 2-4x overhead of using records

Table partition: 375 MB (compressed), 300K rows, 125 columns

MR and Dremel execution
• Avg # of terms in txtField in 85 billion record table T1
[Figure: execution time (sec) on 3000 nodes; bars labeled 87 TB, 0.5 TB, 0.5 TB]

Sawzall program run on MR:

     num_recs: table sum of int;
     num_words: table sum of int;
     emit num_recs <- 1;
     emit num_words <- count_words(input.txtField);

Q1: SELECT SUM(count_words(txtField)) / COUNT(*)
    FROM T1

MR overheads: launching jobs, scheduling 0.5M tasks, assembling records

Impact of serving tree depth
[Figure: execution time (sec) for Q2 and Q3]

Q2 (returns 100s of records):
      SELECT country, SUM(item.amount) FROM T2
      GROUP BY country

Q3 (returns 1M records):
      SELECT domain, SUM(item.amount) FROM T2
      WHERE domain CONTAINS '.net'
      GROUP BY domain

40 billion nested items

Scalability
[Figure: execution time (sec) vs. number of leaf servers]

Q5 on a trillion-row table T4:
     SELECT TOP(aid, 20), COUNT(*) FROM T4

Outline
• Introduction
• Nested columnar storage
• Query processing
• Experiments
• Conclusion

Observation
[Figure: monthly query workload of one 3000-node Dremel instance;
 percentage of queries vs. execution time (sec)]

Most queries complete in under 10 sec

Conclusion
• Dremel is a query system
  for the analysis of read-only nested data
• Main features: fast (interactive response
  time), column-oriented, SQL-like query
  language
• Introduces two major methods:
    ◦ Nested columnar storage: answers queries that
      touch only part of each record
    ◦ Query processing: parallel, distributed execution
      that decomposes queries across a serving tree

Comments
• Google is awesome!
• Pros
    ◦ Nested storage gives us more flexibility.
    ◦ Repetition and definition levels are a novel idea
      that solves the locality problem
      easily.
    ◦ The distributed serving tree is awesome:
      faster than MR.

Comments
• Cons
    ◦ Read-only may not fit every requirement
    ◦ Dremel is not a database, so you'll need
      to convert your real data into Dremel's format
      before analyzing it
        ◦ Converting may cost a lot of time and space
        ◦ Google doesn't worry about this problem: they have
          GFS and many servers.

Thank you for listening.




 
Google App Engine
Google App EngineGoogle App Engine
Google App EngineHung-yu Lin
 

Plus de Hung-yu Lin (11)

2014 database - course 2 - php
2014 database - course 2 - php2014 database - course 2 - php
2014 database - course 2 - php
 
2014 database - course 3 - PHP and MySQL
2014 database - course 3 - PHP and MySQL2014 database - course 3 - PHP and MySQL
2014 database - course 3 - PHP and MySQL
 
2014 database - course 1 - www introduction
2014 database - course 1 - www introduction2014 database - course 1 - www introduction
2014 database - course 1 - www introduction
 
OpenWebSchool - 11 - CodeIgniter
OpenWebSchool - 11 - CodeIgniterOpenWebSchool - 11 - CodeIgniter
OpenWebSchool - 11 - CodeIgniter
 
OpenWebSchool - 06 - PHP + MySQL
OpenWebSchool - 06 - PHP + MySQLOpenWebSchool - 06 - PHP + MySQL
OpenWebSchool - 06 - PHP + MySQL
 
OpenWebSchool - 05 - MySQL
OpenWebSchool - 05 - MySQLOpenWebSchool - 05 - MySQL
OpenWebSchool - 05 - MySQL
 
OpenWebSchool - 02 - PHP Part I
OpenWebSchool - 02 - PHP Part IOpenWebSchool - 02 - PHP Part I
OpenWebSchool - 02 - PHP Part I
 
OpenWebSchool - 01 - WWW Intro
OpenWebSchool - 01 - WWW IntroOpenWebSchool - 01 - WWW Intro
OpenWebSchool - 01 - WWW Intro
 
OpenWebSchool - 03 - PHP Part II
OpenWebSchool - 03 - PHP Part IIOpenWebSchool - 03 - PHP Part II
OpenWebSchool - 03 - PHP Part II
 
Google App Engine
Google App EngineGoogle App Engine
Google App Engine
 
Redis
RedisRedis
Redis
 

Dernier

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Dernier (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

Dremel: interactive analysis of web-scale datasets

  • 1. Dremel: Interactive Analysis of Web-Scale Datasets. Sergey Melnik, Andrey Gubarev, Jing Jing Long, Geoffrey Romer, Shiva Shivakumar, Matt Tolton, Theo Vassilakis. Google, Inc. VLDB, 2010. Presentation by ensky (enskylin@gmail.com)
  • 2. Outline: Introduction; Nested columnar storage; Query processing; Experiments; Conclusion
  • 3. Introduction. Dremel is a query system for the analysis of read-only nested data. Use cases: interactive tools, trend detection, web dashboards, spam, network optimization.
  • 4. Features. Multi-level (nested) structure, for example the record DocId: 10; Links { Forward: 20 }; Name { Language { Code: 'en-us', Country: 'us' }, Url: 'http://A' }; Name { Url: 'http://B' }. SQL-like query language, for example SELECT A, COUNT(B) FROM T GROUP BY A, where T = {/gfs/1, /gfs/2, …, /gfs/100000}. Fast: queries over trillion-row tables finish in seconds. Large scale: thousands of CPUs and petabytes of data. Widely used at Google: in production since 2006.
  • 5. Contribution. The paper presents two major techniques in Dremel. Nested columnar storage: ◦ splitting records into columns ◦ assembling records from columns. Query: ◦ language ◦ execution.
  • 6. Outline: Introduction; Nested columnar storage; Query processing; Experiments; Conclusion
  • 7. Why nested? Flexible: ◦ data in web and scientific computing is often non-relational. Reduced cost: ◦ normalizing and recombining nested data is often prohibitively expensive at TB or PB scale.
  • 8. Why column? [Figure: record-oriented vs. column-oriented layout of records r1 and r2 over a schema with nested fields A, B, C, D, E.] Column-oriented storage reads less data and makes decompression cheaper. Challenge: preserve the record structure and reconstruct records from a subset of fields.
  • 9. Nested data model. Schema in protocol buffers (http://code.google.com/apis/protocolbuffers); multiplicities: required [1,1], optional [0,1], repeated [0,*].
      message Document {
        required int64 DocId;
        optional group Links {
          repeated int64 Backward;
          repeated int64 Forward;
        }
        repeated group Name {
          repeated group Language {
            required string Code;
            optional string Country;
          }
          optional string Url;
        }
      }
    Sample record r1:
      DocId: 10
      Links
        Forward: 20
        Forward: 40
        Forward: 60
      Name
        Language
          Code: 'en-us'
          Country: 'us'
        Language
          Code: 'en'
        Url: 'http://A'
      Name
        Url: 'http://B'
      Name
        Language
          Code: 'en-gb'
          Country: 'gb'
    Sample record r2:
      DocId: 20
      Links
        Backward: 10
        Backward: 30
        Forward: 80
      Name
        Url: 'http://C'
  • 10. Column-striped representation of the two sample records (r = repetition level, d = definition level).
      DocId         Name.Url          Links.Forward   Links.Backward
      value r d     value    r d      value r d       value r d
      10    0 0     http://A 0 2      20    0 2       NULL  0 1
      20    0 0     http://B 1 2      40    1 2       10    0 2
                    NULL     1 1      60    1 2       30    1 2
                    http://C 0 2      80    0 2

      Name.Language.Code    Name.Language.Country
      value r d             value r d
      en-us 0 2             us    0 3
      en    2 2             NULL  2 2
      NULL  1 1             NULL  1 1
      en-gb 1 2             gb    1 3
      NULL  0 1             NULL  0 1
  • 11. The column-oriented problem. When a query reads only a subtree of the schema, a column value by itself does not reveal the structure of the whole record, or even the value's own position within it ("Where am I?"). To solve this problem, the paper introduces repetition and definition levels.
  • 12. Repetition and definition levels: the repetition level r. r: at what repeated field in the field's path the value has repeated. Column Name.Language.Code over the two sample records (DocId 10 and DocId 20):
      value r d
      en-us 0 2   first value of the record, so r = 0
      en    2 2   Language repeated, so r = 2
      NULL  1 1   Name repeated, so r = 1; this Name has no Language
      en-gb 1 2   Name repeated, so r = 1
      NULL  0 1   nothing repeated (start of the next record), so r = 0
  • 13. Repetition and definition levels: the definition level d. d: how many fields in the path that could be undefined (optional or repeated) are actually present. Column Name.Language.Country over the two sample records:
      value r d
      us    0 3
      NULL  2 2
      NULL  1 1
      gb    1 3
      NULL  0 1
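The level definitions on slides 12 and 13 can be turned into a short Python sketch. It assumes records are plain dicts with repeated fields held as lists, and that a column path is a list of (field name, kind) pairs copied from the schema. The names `stripe`, `column`, `CODE` and `COUNTRY` are hypothetical helpers for this illustration and are not from the paper, whose own column-striping algorithm is more elaborate.

```python
# Sketch only: records are dicts, repeated fields are lists.
CODE = [("Name", "repeated"), ("Language", "repeated"), ("Code", "required")]
COUNTRY = [("Name", "repeated"), ("Language", "repeated"), ("Country", "optional")]

r1 = {
    "DocId": 10,
    "Links": {"Forward": [20, 40, 60]},
    "Name": [
        {"Language": [{"Code": "en-us", "Country": "us"}, {"Code": "en"}],
         "Url": "http://A"},
        {"Url": "http://B"},
        {"Language": [{"Code": "en-gb", "Country": "gb"}]},
    ],
}
r2 = {
    "DocId": 20,
    "Links": {"Backward": [10, 30], "Forward": [80]},
    "Name": [{"Url": "http://C"}],
}


def stripe(record, path):
    """Return the (value, r, d) triples that one record adds to one column."""
    out = []

    def walk(node, depth, r, d, rep_depth):
        # r: repetition level for the next emitted value
        # d: number of optional/repeated path fields found present so far
        # rep_depth: number of repeated fields on the path so far
        name, kind = path[depth]
        if kind == "repeated":
            rep_depth += 1
            children = node.get(name, [])
        else:
            children = [node[name]] if name in node else []
        if not children:
            out.append((None, r, d))  # path ends here: emit NULL
            return
        if kind != "required":
            d += 1
        for i, child in enumerate(children):
            # The first child inherits r; later siblings repeat at this field.
            child_r = r if i == 0 else rep_depth
            if depth == len(path) - 1:
                out.append((child, child_r, d))
            else:
                walk(child, depth + 1, child_r, d, rep_depth)

    walk(record, 0, 0, 0, 0)
    return out


def column(records, path):
    """Concatenate the triples of all records for one column."""
    return [triple for rec in records for triple in stripe(rec, path)]


print(column([r1, r2], CODE))
print(column([r1, r2], COUNTRY))
```

Run on the two sample records, the sketch reproduces the Name.Language.Code and Name.Language.Country columns listed on slides 12 and 13, with Python's None standing in for NULL.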
  • 14. Column-oriented storage is fine, but what if I still need record-oriented queries (e.g., MapReduce)?
  • 15. Record assembly FSM. [Figure: finite state machine whose states are the fields DocId, Links.Backward, Links.Forward, Name.Language.Code, Name.Language.Country and Name.Url; transitions are labeled with repetition levels.] Used for record-oriented data processing (e.g., MapReduce).
  • 16. Reading two fields. [Figure: FSM for reading only DocId and Name.Language.Country: DocId moves to Name.Language.Country on 0; Name.Language.Country loops on 1,2 and ends the record on 0.] Assembled records:
      s1: DocId: 10; Name { Language { Country: 'us' }, Language { } }; Name { }; Name { Language { Country: 'gb' } }
      s2: DocId: 20; Name { }
    The structure of the parent fields is preserved, which is useful for queries like /Name[3]/Language[1]/Country.
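To make the two-field example concrete, here is a minimal Python sketch that rebuilds partial records from column triples. It is a simplification, not the paper's FSM: each column is assembled on its own from its (value, r, d) triples, and the results are merged per record, which only works here because DocId and Name share no parent field. The function name `assemble` and the dict-based record shape are assumptions made for this illustration.

```python
# Column data copied from the column-striped representation (None = NULL).
DOC_ID = [(10, 0, 0), (20, 0, 0)]
COUNTRY = [("us", 0, 3), (None, 2, 2), (None, 1, 1), ("gb", 1, 3), (None, 0, 1)]

DOC_ID_PATH = [("DocId", "required")]
COUNTRY_PATH = [("Name", "repeated"), ("Language", "repeated"),
                ("Country", "optional")]


def assemble(column, path):
    """Rebuild one partial record per r == 0 boundary from a single column."""
    records = []
    for value, r, d in column:
        if r == 0:
            records.append({})  # repetition level 0 starts a new record
        node = records[-1]
        rep = 0  # repetition level of the current path field
        dfn = 0  # definition level of the current path field
        for i, (name, kind) in enumerate(path):
            if kind == "repeated":
                rep += 1
            if kind != "required":
                dfn += 1
            if dfn > d:
                break  # this field and everything below it is absent
            leaf = i == len(path) - 1
            if kind == "repeated":
                items = node.setdefault(name, [])
                if leaf:
                    items.append(value)
                elif r > rep:
                    node = items[-1]  # repetition happened deeper: reuse parent
                else:
                    items.append({})  # repetition at or above this field
                    node = items[-1]
            elif leaf:
                node[name] = value
            else:
                node = node.setdefault(name, {})
    return records


docs = assemble(DOC_ID, DOC_ID_PATH)
names = assemble(COUNTRY, COUNTRY_PATH)
# Merging is safe only because the two columns have disjoint top-level fields.
partial_records = [{**doc, **rest} for doc, rest in zip(docs, names)]
print(partial_records)
```

The first assembled record keeps three Name entries and an empty second Language, matching s1 on the slide, so a positional query such as /Name[3]/Language[1]/Country still finds 'gb'.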
  • 17. Outline: Introduction; Nested columnar storage; Query processing; Experiments; Conclusion
  • 18. Query language. Based on SQL. Supports: ◦ projection ◦ selection ◦ nested subqueries ◦ inter- and intra-record aggregation ◦ top-k ◦ joins ◦ user-defined functions.
  • 19. Example usage. Query over table t, which holds the two sample records (DocId 10 and DocId 20):
      SELECT DocId AS Id,
        COUNT(Name.Language.Code) WITHIN Name AS Cnt,
        Name.Url + ',' + Name.Language.Code AS Str
      FROM t
      WHERE REGEXP(Name.Url, '^http') AND DocId < 20;
    Output table t1:
      Id: 10
      Name
        Cnt: 2
        Language
          Str: 'http://A,en-us'
        Language
          Str: 'http://A,en'
      Name
        Cnt: 0
    Output schema:
      message QueryResult {
        required int64 Id;
        repeated group Name {
          optional uint64 Cnt;
          repeated group Language {
            optional string Str;
          }
        }
      }
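The example query can be mimicked on one in-memory record with a few lines of Python. This is only one reading of the slide's example: records are assumed to be plain dicts, `run_query` is a hypothetical helper, and Dremel's actual rules for pruning and NULL handling are not modeled beyond what the output table shows.

```python
import re

r1 = {
    "DocId": 10,
    "Links": {"Forward": [20, 40, 60]},
    "Name": [
        {"Language": [{"Code": "en-us", "Country": "us"}, {"Code": "en"}],
         "Url": "http://A"},
        {"Url": "http://B"},
        {"Language": [{"Code": "en-gb", "Country": "gb"}]},
    ],
}
r2 = {
    "DocId": 20,
    "Links": {"Backward": [10, 30], "Forward": [80]},
    "Name": [{"Url": "http://C"}],
}


def run_query(record):
    """Evaluate the example query on one record; None if it is filtered out."""
    if not record["DocId"] < 20:  # WHERE ... AND DocId < 20
        return None
    names = []
    for name in record.get("Name", []):
        url = name.get("Url")
        if url is None or not re.search(r"^http", url):
            continue  # WHERE REGEXP(Name.Url, '^http') prunes this Name
        codes = [lang["Code"] for lang in name.get("Language", [])]
        out = {"Cnt": len(codes)}  # COUNT(Name.Language.Code) WITHIN Name
        if codes:
            # Name.Url + ',' + Name.Language.Code, one Str per Language
            out["Language"] = [{"Str": url + "," + code} for code in codes]
        names.append(out)
    return {"Id": record["DocId"], "Name": names}


print(run_query(r1))
print(run_query(r2))
```

For the first sample record this yields two Name groups with Cnt 2 and Cnt 0, as in the output table; the third Name has no Url and is dropped, and the second record fails DocId < 20.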
  • 20. Query execution. [Figure: serving tree: client, root server, intermediate servers, leaf servers (with local storage), storage layer (e.g., GFS).] • The tree parallelizes scheduling and aggregation. • Fault tolerance. • A query dispatcher dispatches work to the servers.
  • 21. Example: count(). The query is rewritten level by level down the serving tree (Rij is the result returned by server j at level i, so R110 is server 10 at level 1; Tij is that server's set of tablets).
      Level 0 (root): SELECT A, COUNT(B) FROM T GROUP BY A, with T = {/gfs/1, /gfs/2, …, /gfs/100000}, becomes SELECT A, SUM(c) FROM (R11 UNION ALL … R110) GROUP BY A
      Level 1: R11 = SELECT A, COUNT(B) AS c FROM T11 GROUP BY A, with T11 = {/gfs/1, …, /gfs/10000}; R12 = SELECT A, COUNT(B) AS c FROM T12 GROUP BY A, with T12 = {/gfs/10001, …, /gfs/20000}; ...
      Level 3 (leaves): SELECT A, COUNT(B) AS c FROM T31 GROUP BY A, with T31 = {/gfs/1}; ...; data access ops
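The rewrite on slide 21 amounts to counting at the leaves and summing partial counts on the way up. A minimal single-process Python sketch follows, assuming each tablet is a list of (A, B) rows with None standing for NULL. The names `leaf_query`, `merge` and `serve`, and the `fanout` parameter, are hypothetical; real servers would run in parallel on separate machines.

```python
def leaf_query(tablet):
    """SELECT A, COUNT(B) AS c FROM tablet GROUP BY A."""
    counts = {}
    for a, b in tablet:
        # COUNT(B) ignores NULLs but the group for A still exists.
        counts[a] = counts.get(a, 0) + (b is not None)
    return counts


def merge(partials):
    """SELECT A, SUM(c) FROM (partials joined by UNION ALL) GROUP BY A."""
    total = {}
    for partial in partials:
        for a, c in partial.items():
            total[a] = total.get(a, 0) + c
    return total


def serve(tablets, fanout):
    """Evaluate bottom-up through a tree where each server has `fanout` children."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    level = [leaf_query(t) for t in tablets]  # leaf servers scan tablets
    while len(level) > 1:
        # Each parent aggregates the partial results of its children.
        level = [merge(level[i:i + fanout])
                 for i in range(0, len(level), fanout)]
    return level[0] if level else {}


tablets = [
    [("x", 1), ("y", None)],
    [("x", 2), ("x", None)],
    [("y", 3)],
    [("z", 4)],
]
print(serve(tablets, fanout=2))
```

Because SUM over partial counts is associative, the result does not depend on the tree's depth or fanout; only the amount of work done per server changes.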
  • 22. Outline: Introduction; Nested columnar storage; Query processing; Experiments; Conclusion
  • 23. Experiment data. • 1 PB of real data (uncompressed, non-replicated) • 100K-800K tablets per table • Experiments run during business hours
      Table   Number of     Size (unrepl.,   Number      Data     Repl.
      name    records       compressed)      of fields   center   factor
      T1      85 billion    87 TB            270         A        3×
      T2      24 billion    13 TB            530         A        3×
      T3      4 billion     70 TB            1200        A        3×
      T4      1+ trillion   105 TB           50          B        3×
      T5      1+ trillion   20 TB            30          B        2×
  • 24. Column vs. record. [Figure: time (sec) vs. number of fields; "cold" time on local disk, averaged over 30 runs. From columns: (a) read + decompress, (b) assemble records, (c) parse as C++ objects. From records: (d) read + decompress, (e) parse as C++ objects.] 10x speedup using columnar storage; 2-4x overhead of using records. Table partition: 375 MB (compressed), 300K rows, 125 columns.
  • 25. MR and Dremel execution. Task: average number of terms in txtField in the 85-billion-record table T1. [Figure: execution time (sec) on 3000 nodes; bars annotated with the data read: 87 TB, 0.5 TB, 0.5 TB.] Sawzall program run on MR:
      num_recs: table sum of int;
      num_words: table sum of int;
      emit num_recs <- 1;
      emit num_words <- count_words(input.txtField);
    Q1: SELECT SUM(count_words(txtField)) / COUNT(*) FROM T1
    MR overheads: launching jobs, scheduling 0.5M tasks, assembling records.
  • 26. Impact of serving tree depth. [Figure: execution time (sec) for Q2 (returns 100s of records) and Q3 (returns 1M records); 40 billion nested items.]
    Q2: SELECT country, SUM(item.amount) FROM T2 GROUP BY country
    Q3: SELECT domain, SUM(item.amount) FROM T2 WHERE domain CONTAINS '.net' GROUP BY domain
  • 27. Scalability. [Figure: execution time (sec) vs. number of leaf servers.] Q5 on a trillion-row table T4: SELECT TOP(aid, 20), COUNT(*) FROM T4
  • 28. Outline: Introduction; Nested columnar storage; Query processing; Experiments; Conclusion
  • 29. Observation. [Figure: percentage of queries vs. execution time (sec).] Monthly query workload of one 3000-node Dremel instance. Most queries complete in under 10 sec.
  • 30. Conclusion. Dremel is a query system for the analysis of read-only nested data. Its main features: fast (interactive response times), column-oriented, and an SQL-like query language. It introduces two major methods: ◦ Nested columnar storage, to solve the partial-query problem. ◦ Query processing: parallel, distributed execution, with queries decomposed across servers.
  • 31. Comments. Google is awesome! Pros: ◦ Nested storage gives us more flexibility. ◦ Repetition and definition levels are a novel idea and solve the locality problem easily. ◦ The distributed serving tree is awesome and faster than MR.
  • 32. Comments. Cons: ◦ Read-only may not fit every requirement. ◦ Dremel is not a database, so you'll need to convert your real data into Dremel's format before analyzing it. Converting may cost a lot of time and space. Google doesn't care about this problem; they have GFS and many servers.
  • 33. Thank you for listening.