Metaxa Architecture
       June 22nd
 By Camuel, OpenDremel
Meet Metaxa
•   Implements Dremel using LAPHROAIG as the execution engine and as the storage
    backend.
•   No distribution: METAXA is a single jar file executed in a single JVM; it
    produces and executes a single-threaded MAP job.
•   All input data resides inside a single LAPHROAIG object.
•   Output is one of the following:
           •   A new LAPHROAIG object
           •   Streamed back to the caller.
•   Convert-type commands convert a single LAPHROAIG object from popular
    object serialization formats to the nested columnar Dremel format, or vice versa.
•   Query-type commands process LAPHROAIG objects in the nested columnar
    Dremel format and can store the result in another object or convert it to a
    popular object serialization format and stream it back to the user.
•   A LAPHROAIG object is a container of other objects, either “serialized objects” or
    “columnar encoded objects”. The two types of objects are not to be confused.
•   Just five use cases:
     –   Convert “serialized objects” into “columnar encoded objects”.
     –   Convert “columnar encoded objects” into “serialized objects”.
     –   Query “columnar encoded objects” with BQL, producing “serialized objects”
         and streaming them back to the caller.
     –   Query “columnar encoded objects” with BQL, producing “serialized objects”
         and saving them as a new LAPHROAIG “container” object.
     –   Query “columnar encoded objects” with BQL, producing “columnar
         encoded objects” and saving them as a new LAPHROAIG “container” object.
Use case #1: Convert serialized objects into columnar-encoded objects
[Diagram: Convert command, Metaxa.jar, LAPHROAIG. Input: hierarchical schema and serialized objects (Protobuf, Avro, Thrift). Output: columnar-encoded objects (Tablet).]
Use case #2: Convert columnar-encoded objects into serialized objects
[Diagram: Convert command, Metaxa.jar, LAPHROAIG. Input: columnar-encoded objects (Tablet). Output: hierarchical schema and serialized objects (Protobuf, Avro, Thrift).]
Use case #3: Query “columnar encoded objects” with BQL, producing “serialized objects” and streaming them back to the caller
[Diagram: BQL query, Metaxa.jar, LAPHROAIG. Input: columnar-encoded objects (Tablet). Output streamed back: hierarchical schema and serialized objects (Protobuf, Avro, Thrift).]
Use case #4: Query “columnar encoded objects” with BQL, producing “serialized objects” and saving them
[Diagram: BQL query, Metaxa.jar, LAPHROAIG. Input: columnar-encoded objects (Tablet). Output saved in LAPHROAIG: hierarchical schema and serialized objects (Protobuf, Avro, Thrift).]
Use case #5: Query “columnar encoded objects” with BQL, producing “columnar encoded objects” and saving them
[Diagram: BQL query, Metaxa.jar, LAPHROAIG. Input: columnar-encoded objects (Tablet). Output saved in LAPHROAIG: columnar-encoded objects (Tablet).]
SerObjs – Serialized Objects
• Data produced by serializing objects with
  Protobuf, Avro or Thrift.
• Hierarchical data.
• Flat data such as CSV.
• RDBMS-originated data.
• Data from KV-stores and document stores.
• Logs.
• Schema may be embedded or provided
  separately.
Tablet – Columnar-encoded objects
•   An immutable chunk of data.
•   Logically composed of Slices and can be turned into a series of Slices.
•   Columnar and Dremel-encoded.
•   Consists of a header (called the Tablet Schema) and multiple {byte, word, dword or
    qword}-streams.
•   The Tablet Schema describes:
    –   Tablet columns (multi-dimensional arrays), including compression and encoding metadata
        as well as references to associated dictionaries, rep & def levels, etc.
    –   The original SerObjs schema and its mapping to tablet columns.
    –   Future: additional SerObjs schemas and mappings.
•   Tablet data is a set of multidimensional arrays of 8-, 16-, 32- or 64-bit elements,
    denoted byte (b), word (w), double word (dw) and quad word (qw). Each
    array represents a column and can be accessed independently without incurring access
    costs for neighboring arrays. Every element is a bit field whose bits carry
    different pieces of information, for example (multiple) column values, counts (RLE), and rep and
    def levels.
•   A tablet scanner can mask some of the details of column encoding and provide a
    higher-level interface to the tablet, automatically decoding RLE, dictionaries and rep & def
    levels. However, the tablet binary format is a stable interface between Metaxa
    modules and between different versions of the OpenDremel system.
•   Tablets are horizontal partitions of a larger columnar dataset.
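The bit-field idea above can be sketched as follows. The layout used here (a 32-bit dword holding 20 bits of value, 8 bits of RLE count, 2 bits of def level and 2 bits of rep level) is purely an assumption for illustration; the slides do not specify field widths or ordering, and `DwordElement` is a hypothetical name.

```java
/** Hypothetical dword layout, high bits to low: [value:20][count:8][def:2][rep:2]. */
final class DwordElement {
    private static final int REP_BITS = 2;
    private static final int DEF_BITS = 2;
    private static final int COUNT_BITS = 8;
    private static final int VALUE_BITS = 20;

    private static final int DEF_SHIFT = REP_BITS;
    private static final int COUNT_SHIFT = DEF_SHIFT + DEF_BITS;
    private static final int VALUE_SHIFT = COUNT_SHIFT + COUNT_BITS;

    private DwordElement() {}

    /** Packs the four fields into one 32-bit element. */
    static int pack(int value, int count, int def, int rep) {
        check(value, VALUE_BITS);
        check(count, COUNT_BITS);
        check(def, DEF_BITS);
        check(rep, REP_BITS);
        return (value << VALUE_SHIFT) | (count << COUNT_SHIFT) | (def << DEF_SHIFT) | rep;
    }

    static int value(int element) { return element >>> VALUE_SHIFT; }
    static int count(int element) { return (element >>> COUNT_SHIFT) & mask(COUNT_BITS); }
    static int def(int element)   { return (element >>> DEF_SHIFT) & mask(DEF_BITS); }
    static int rep(int element)   { return element & mask(REP_BITS); }

    private static int mask(int bits) { return (1 << bits) - 1; }

    private static void check(int v, int bits) {
        if (v < 0 || v > mask(bits)) {
            throw new IllegalArgumentException(v + " does not fit in " + bits + " bits");
        }
    }
}
```

A scanner of the kind described above would hide these shifts and masks behind a higher-level interface, while the packed form remains the stable on-disk contract.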
Slice – Columnar-encoded object fraction
 •   A Slice is a vector (ordered list of scalars) where each scalar is the current
     value of a different tablet column that is being scanned / iterated.
 •   A Tablet can be broken down into an ordered list of slices and reassembled from a
     series of slices.
 •   A Slice in Metaxa contains plain integer values (not bit fields) of b, w, dw and qw.
 •   A Slice may contain fewer values than there are columns in the tablet. In this case the columns
     represented in the slice are called “projected columns”.
 •   A Slice also contains an additional integer field called Level. This Level is also aliased as
     FetchLevel or SelectLevel, depending on whether the Tablet is being sliced into a
     series of slices or being reconstructed from a series of slices.
Query Plan (QP)
• A QP is a descriptor of a source tablet, a result tablet, a set of scalar
  transformations and a DAG of their dataflow interconnections.
• Scalar transformations are of one of the following types:
    – Plain transformations => also called expressions; many inputs but one output.
    – Predicates => boolean expressions which, when evaluating to false, cancel the issuance of
      the result slice.
    – Aggregates => Count, Sum and Distinct functions; they aggregate slices and then, when the
      last slice in an aggregation group is detected, issue multiple result slices.
• QP input and output is always a slice. Because of predicates, it is
  possible that for some input slices no output slice will be issued.
  Because of aggregates, it is also possible that for one input slice
  multiple output slices will be issued.
• Input slices contain FetchLevel and output slices contain SelectLevel
  (according to appendix D in the paper).
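A minimal sketch of the "zero, one or many output slices" behavior described above. It deliberately simplifies: slices are plain `long[]{groupKey, value}` arrays, levels are left out, and an aggregation group is assumed to end when the group key changes. The class and method names are hypothetical and not part of Metaxa.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Simplified slices: long[]{groupKey, value}; levels are omitted in this sketch. */
final class QpFlowSketch {
    private QpFlowSketch() {}

    /** Predicate plus plain expression: each input slice yields zero or one output slice. */
    static List<long[]> keepPositiveAndDouble(List<long[]> in) {
        List<long[]> out = new ArrayList<>();
        for (long[] s : in) {
            if (s[1] <= 0) {
                continue;                          // predicate is false: no result slice
            }
            out.add(new long[] {s[0], s[1] * 2});  // expression: value * 2
        }
        return out;
    }

    /** DISTINCT-style aggregate: buffers a group, then issues several slices at its end. */
    static List<long[]> distinctPerGroup(List<long[]> in) {
        List<long[]> out = new ArrayList<>();
        Set<Long> seen = new LinkedHashSet<>();
        for (int i = 0; i < in.size(); i++) {
            long[] s = in.get(i);
            seen.add(s[1]);
            // Assumption: the group ends at end of input or when the key changes.
            boolean lastInGroup = i + 1 == in.size() || in.get(i + 1)[0] != s[0];
            if (lastInGroup) {
                for (long v : seen) {
                    out.add(new long[] {s[0], v});  // one result slice per distinct value
                }
                seen.clear();
            }
        }
        return out;
    }
}
```

In the aggregate case nothing is issued until the last slice of a group arrives, at which point that single input slice triggers all of the group's output slices.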
Conceptual View of Tablet
[Figure: records Record [0] to Record [5] laid out across levels (dimensions) 0, 1 and 2, drawn as arrays indexed [ ], [ ][ ] and [ ][ ][ ].]
Conceptual View of Tablet Slicing
[Figure: Record [0] across levels (dimensions) 0, 1 and 2, broken into the slices [0][0][0], [0][0][1], [0][0][2], [0][1][0], [0][1][1] and [0][2][2].]
Conceptual View of QP
[Figure: Record [0] and Record [1] across levels (dimensions) 0, 1 and 2; slices [0][0][0] and [0][1][1] pass through Expr (rep=0), Expr (rep=1) and Expr (rep=2), producing result arrays indexed [ ], [ ][ ] and [ ][ ][ ].]
Compiler
Translates BQL into Query Plan
 Requirements:
 – Must parse and compile valid BQL as defined by BigQuery.
 – Must reject invalid BQL and supply user-friendly error messages.
 – Must produce an executable QP object with the following features:
    •   It is serializable => no circular references and no references to “system”
        objects such as file handles; a pure object model
    •   getProcessSliceSource => returns the text of processSlice in Java source-code form
    •   getSourceTablets => returns the tablets to run the QP on
    •   setResultTablet => sets the result tablet
    •   setExecutionStatusCode => indicates the status of QP execution
    •   log => allows logging important events during QP execution
    •   getDiagram => returns a graphic image of the QP diagram (for debugging)
 – Must provide basic command-line argument functionality as well as
   simple shell functionality.
Compiler
Vocabulary

•   Token – lexeme
•   Parse tree – token tree
•   AST – Abstract Syntax Tree
•   SM – Semantic Model
•   ASM – Annotated Semantic Model
•   QP – Query Plan
•   DAG – Directed Acyclic Graph
•   Schema – Metadata about dataset.
Compiler
Prerequisite Materials
–   http://code.google.com/apis/bigquery/docs/query-reference.html
– http://www.antlr.org/
– http://en.wikipedia.org/wiki/Parsing
– http://en.wikipedia.org/wiki/Query_plan
– http://en.wikipedia.org/wiki/Compiler_construction
– http://www.amazon.com/Terence-Parr/e/B001JS3O0U
Compiler
High-Level Design (verbose)
[Diagram: compiler pipeline.
 Command-line arguments / shell input -> Shell -> BQL -> Antlr Parser -> AST -> SemanticParser -> SM
 (Semantic Model: a Java object model implemented via Java collections) -> Semantic Analyzer
 (validation, resolving references, result schema inference, optimization) -> Annotated Semantic Model
 -> QP Generator -> QP (Query Plan, includes ResultTablet metadata).
 Other elements shown: SerObjs Schema, Result Schema Generator -> Result SerObjs Schema,
 Metadata (file locations and statistics), Optimization Rules, Validation Rules, C / asm Template.]
Compiler
[Annotated] Semantic Model

 • Comprehensively describes the query in every detail
 • Java objects (packed into collections, without
   spaghetti cyclic references)
 • Must be serializable to a file with the SerObjLib
   framework, and restorable.
 • Must be printable in a form a human can
   understand
 • Must be rendered on request into a clear graphic
   diagram with a legend.
Compiler
QP: Scalar Transformation functions (Expr)

  • A set of primitive, predefined scalar operations and functions applied to
    xfunc arguments in a particular prescribed order.
  • Expressed in valid C or assembly with some restrictions.
  • Purely functional => side-effect free, meaning no static/global
    variables and no memory allocations. However, for performance and
    brevity they are inlined into a single processSlice function.
  • Some functions have a context object where they can store their
    externalized state between calls. One regular and one associative array
    are provided as context for these functions.
      – Context-free transformation functions
          • One value in, one value out: a+b
      – Scalar context transformation functions
          • Many values in, many values out: sum(a) within links
      – Map context transformation functions
          • Many values in, many values out (out of sync): sum(a) group by date
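The three kinds of transformation function can be sketched in Java as below. This is an illustrative assumption, not Metaxa code: the names are hypothetical, the "regular array" context is modeled as a `long[]`, and the "associative array" as a `java.util.Map` (which, unlike the allocation-free design the slides ask for, does box and allocate).

```java
import java.util.Map;

final class ContextKinds {
    private ContextKinds() {}

    /** Context-free: the result depends only on the current arguments (a + b). */
    static long add(long a, long b) {
        return a + b;
    }

    /** Scalar context: one accumulator slot, kept by the caller in a regular array. */
    static long sum(long[] context, long a) {
        context[0] += a;
        return context[0];
    }

    /** Map context: one accumulator per group key, kept in an associative array. */
    static long sumGroupBy(Map<Long, Long> context, long key, long a) {
        return context.merge(key, a, Long::sum);
    }
}
```

In all three cases the function itself holds no static or global state; whatever must survive between calls lives in the context passed in by the caller.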
Compiler
             QP in C Form
• Generated processSlice(..){..} function.
    – Input: inSlice
    – Output: outSlice
    – Context object for state externalization
• inSlice contains scalar values for every source function and
  also fetchLevel.
• outSlice must have correct scalar values for every result
  function and also the correct selectLevel.
   – outSlice is guaranteed to preserve its content between calls, so it can be
     used as a cache for result functions that haven’t changed and also as a cache for
     selectLevel if it has not changed.

   – outSlice values can also be read (they contain the results of the previous outSlice).

   – On the first call all values in outSlice are guaranteed to be zero.
Compiler
QP template (according to appendix D)
void processSlice(inSlice, outSlice, context) {

    // Evaluate the WHERE clause. If it evaluates to false:
    //     outSlice.setSkip();
    //     outSlice.selectLevel = min(outSlice.selectLevel, inSlice.fetchLevel);
    //     return;

    // If the WHERE clause evaluates to true:
    switch (inSlice.fetchLevel) {
        case 0:
            // Evaluate expressions (xfuncs) with repetition level = 0
        // ...
        case n:
            // Evaluate expressions (xfuncs) with repetition level = n
            // If this is the last slice in the aggregation group:
            //     the call below causes additional calls to processSlice
            //     outSlice.setAdditionalSliceCount(number of slices in aggregation group);
    }
}
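One way to read the template is the Java sketch below, written for a hypothetical query `SELECT a + 1, b * 2 WHERE b >= 0` in which `a` sits at repetition level 0 and `b` at level 1. Several points are assumptions rather than anything the slides state: the `switch` falls through so that a lower level re-evaluates all deeper expressions, the level carried over from skipped slices (rather than the raw fetchLevel) drives the `switch`, and aggregation is left out entirely. All type and field names are invented for the example.

```java
/** Sketch of a generated processSlice for: SELECT a + 1, b * 2 WHERE b >= 0. */
final class ProcessSliceSketch {
    private ProcessSliceSketch() {}

    static final class InSlice {
        final int fetchLevel;
        final long a;   // column at repetition level 0
        final long b;   // column at repetition level 1

        InSlice(int fetchLevel, long a, long b) {
            this.fetchLevel = fetchLevel;
            this.a = a;
            this.b = b;
        }
    }

    /** Fields start at zero/false and keep their content between calls. */
    static final class OutSlice {
        boolean skip;
        int selectLevel;
        long aPlusOne;
        long bTimesTwo;
    }

    static void processSlice(InSlice in, OutSlice out) {
        // Carry the lowest level seen since the last emitted slice.
        int level = out.skip ? Math.min(out.selectLevel, in.fetchLevel) : in.fetchLevel;
        out.selectLevel = level;

        if (in.b < 0) {          // WHERE clause is false: cancel this result slice
            out.skip = true;
            return;
        }
        out.skip = false;

        switch (level) {
            case 0:
                out.aPlusOne = in.a + 1;    // expression at repetition level 0
                // fall through: everything at deeper levels changed as well
            case 1:
                out.bTimesTwo = in.b * 2;   // expression at repetition level 1
                break;
            default:
                throw new IllegalArgumentException("unexpected level " + level);
        }
    }
}
```

Because outSlice keeps its content between calls, a slice arriving at level 1 leaves `aPlusOne` untouched and reuses the cached value from the last level-0 evaluation.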
Columnar Abstraction
•   Tableton is a set of sequentially accessed multidimensional scalar arrays.
•   A Tablet is a serialized, Dremel-encoded columnar dataset of fixed size. Each array in a
    tablet can be independently and serially accessed without incurring the cost of buffering
    neighboring arrays.
•   Four types of arrays: bytes (8b), words (16b), dwords (32b), qwords (64b).
•   The following operations are defined:

     –   Parsing Tablet Schema    => reading and parsing the tablet header/metadata (also called the tablet
                                  schema) and providing an object model for it.

     –   Reading                  => converting a Tablet to SerObjs using an FSM for better performance, as
                                  described in the Dremel paper (calling callback functions to let them construct
                                  SerObjs in various formats).

     –   Slicing                  => synchronized multi-array scalar iteration over a Tablet.

     –   Building Tablet Schema   => creating the tablet header/metadata (also called the tablet schema) with a
                                  convenient builder API. Also called the TabletSchema Editor.

     –   Construction             => re-creating a Tablet from slices; this interface is also used for dissecting
                                  SerObjs into a tablet.

     –   Compaction               => the constructed Tablet is compressed and a hash key is generated for it;
                                  from that point on it becomes immutable.
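The Compaction step can be sketched with standard-library pieces. Everything specific here is an assumption made for illustration: the slides name neither a compression codec nor a hash function, so DEFLATE (`java.util.zip.Deflater`) and SHA-256 stand in, a single dword column stands in for a whole tablet, and the class names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.Deflater;

final class CompactionSketch {
    private CompactionSketch() {}

    /** Immutable result of compaction: compressed bytes plus a content hash key. */
    static final class CompactedColumn {
        private final byte[] compressed;
        private final String hashKey;

        CompactedColumn(byte[] compressed, String hashKey) {
            this.compressed = compressed.clone();
            this.hashKey = hashKey;
        }

        byte[] compressed() { return compressed.clone(); }  // defensive copy keeps it immutable
        String hashKey() { return hashKey; }
    }

    /** Compresses one dword column and derives its hash key from the compressed bytes. */
    static CompactedColumn compact(int[] dwords) {
        ByteBuffer raw = ByteBuffer.allocate(dwords.length * Integer.BYTES);
        for (int d : dwords) {
            raw.putInt(d);
        }

        Deflater deflater = new Deflater();
        deflater.setInput(raw.array());
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();

        byte[] compressed = out.toByteArray();
        return new CompactedColumn(compressed, sha256Hex(compressed));
    }

    private static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }
}
```

Deriving the key from the content means two tablets with identical bytes get the same key, which fits naturally with the tablet being immutable after compaction.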

Tableton
What about other datatypes?
• They are mapped into yet another dimension of
  scalar array.
• It is strongly recommended not to use Java strings.
  They are impossible to work with without incurring
  the full cost of object lifecycle management.
• It is OK not to support them at all at first, and then
  gradually add support for them.
• None of the Java String class goodies can be
  supported in Metaxa anyway, because of
  performance.
• Same thing about BLOB, images and any other
  complex data type. All are mapped to yet another
  dimension of scalar array.
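One possible reading of "mapped to yet another dimension of scalar array" is sketched below: string values are stored as one flat byte array plus an offsets array, and comparisons run on the bytes without creating `java.lang.String` objects. The layout and the name `StringColumnSketch` are assumptions for illustration; the slides do not describe the actual encoding.

```java
import java.nio.charset.StandardCharsets;

/** Variable-length values stored as scalars: a byte array plus per-value offsets. */
final class StringColumnSketch {
    final byte[] bytes;    // all value bytes, back to back
    final int[] offsets;   // value i occupies bytes[offsets[i] .. offsets[i + 1])

    StringColumnSketch(String... values) {
        offsets = new int[values.length + 1];
        byte[][] encoded = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i].getBytes(StandardCharsets.UTF_8);
            offsets[i + 1] = offsets[i] + encoded[i].length;
        }
        bytes = new byte[offsets[values.length]];
        for (int i = 0; i < values.length; i++) {
            System.arraycopy(encoded[i], 0, bytes, offsets[i], encoded[i].length);
        }
    }

    /** Length in bytes of value i. */
    int length(int i) {
        return offsets[i + 1] - offsets[i];
    }

    /** Compares value i with an already-encoded key without creating a String. */
    boolean equalsAt(int i, byte[] key) {
        if (length(i) != key.length) {
            return false;
        }
        for (int k = 0; k < key.length; k++) {
            if (bytes[offsets[i] + k] != key[k]) {
                return false;
            }
        }
        return true;
    }
}
```

Strings appear only when the column is built; scanning and comparing afterwards touch nothing but primitive arrays, which is the property the bullets above are after.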
Tableton
Hierarchical vs. Columnar
• Different abstractions / domains / contexts
• Different schemas
• Most confusion stems from not differentiating!
• Always keep the context in mind when you are developing…
• Don’t think about both at the same time unless you are
  willing to develop schizophrenia.
• Columnar is not an implementation artifact of hierarchical.
  Columnar is a whole new model in its own right.
• We must adopt two different vocabularies for these domains.
  Confusion is notoriously common here.

Tableton
Hierarchical vs. Columnar

  Hierarchical                         Columnar
  -----------------------------------  -----------------------------------
  A SerObjs in our lingo               A Tablet in our lingo
  Protobuf, Avro, Thrift files         Dremel-generated tablets
  Serialized objects                   Multi-dimensional arrays
  The only user-level abstraction      User never knows what it is
  BQL queries written against it       Query plans executed against it
  More frontend-related                More backend-related
  More logical / external format       More physical / internal format
  Hierarchical data is queried         Columns are scanned
  SerObjLib component                  Tableton component
Tableton
Hierarchical Example




Tableton
Executes QP against tablets
• Requirements
   – Must convert QP into executable bytecode and execute it (not interpret).
   – Must work with QP in object-model, but initially compiling and running
     QP in java form will suffice.
   – Must not mask data and task parallelism.
       • Data parallelism on tablet level and also on column level within tablet.
       • Task parallelism on separate QP transformation functions
   – Must be ultra-high performance
       • Latency overhead within few milliseconds (assuming data in RAM).
       • Throughput multi GB/sec




                                                                                    Executor
Vocabulary

• QP – Query Plan
• DAG – Directed Acyclic Graph
• Slot – Like thread (todo)
• Expression – operator tree on scalar arguments and
  scalar constants
• CF – Context Free (stateless scalar expression)
• FC – Fixed-size Context (scalar expr with
  accumulator)
• VC – variable-size Context (scalar expr with
  growing list of accumulators)
                                             Executor
Code generation
 • [todo] Janino!
 • [todo] Explain dynamic Java code generation
   and compilation
 • [todo] Use code templates! No classes or functions,
   just a code listing with labels and jumps.
   The generated code is different every time and no one is
   going to study it. Put the static portions in a library
   and pre-compile it regularly. The dynamic portion
   is just a code snippet.
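The template idea can be sketched as plain string substitution: a fixed, static frame with the per-query snippet dropped in. The template text, the `long[]` signature and the class name are all hypothetical, and the frame is written as a method only for readability. Compiling the resulting text at runtime (for example with Janino, as noted above) is left out of this sketch.

```java
/** Fills a fixed source template with per-query assignment statements. */
final class SliceCodeTemplate {
    // Static portion: identical for every query.
    private static final String TEMPLATE =
        "static void processSlice(long[] in, long[] out) {\n"
        + "%s"
        + "}\n";

    private SliceCodeTemplate() {}

    /** Dynamic portion: one assignment per result expression, in order. */
    static String render(String... expressions) {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < expressions.length; i++) {
            body.append("    out[").append(i).append("] = ")
                .append(expressions[i]).append(";\n");
        }
        return String.format(TEMPLATE, body);
    }
}
```

For example, `SliceCodeTemplate.render("in[0] + in[1]", "in[2] * 2")` yields a method body with two assignments, `out[0] = in[0] + in[1];` and `out[1] = in[2] * 2;`.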

                                              Executor
Thanks
(sneak preview of future versions in next slides)
The overall vision for OpenDremel
• Interactive data cloud platform for managing
  high volumes of static data in the form of
  serialized objects.
• Compatible with Google tools such as BigQuery,
  Prediction API, Fusion Tables and Google
  Storage.
• Aggressively use existing open-source
  software, preferably Apache-licensed, to
  quickly “implement” the desired functionality.
Features Backlog
• Processing compressed data directly without decompressing.
• Macro parallelism: 1) multithreading 2) multi-process 3) multi-node 4)
  massive clustering
• Micro parallelism: 1) SSE & AVX 2) OpenCL 3) better machine code to
  leverage ILP 4) light threads for parallel processing of a single tablet 5)
  LLVM 6) special hardware such as GPUs & Tilera
• Interactive joins and indexing support, zone maps and global
  system-recognized dimensions such as time, geography, IP
• Advanced analytics, statistics and machine learning capabilities.
• Richer SerObjLib, more formats
• Advanced visualization and streaming.
• Batch data-crunching and map-reduce support.
• Multi-tenancy, resource control, metering and accounting.
• CEP capabilities, fast lookups and querying also data that is not yet packed
  into tablets.
• User-defined functions.
• Scratch tables and rolling queries

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Dernier (20)

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

OpenDremel's Metaxa Architecture

  • 1. Metaxa Architecture, June 22nd, by Camuel, OpenDremel
  • 2. Meet Metaxa
      • Implements Dremel using LAPHROAIG as the execution engine and as the storage backend.
      • No distribution: METAXA is a single jar file executed in a single JVM; it produces and executes a single-threaded MAP job.
      • All input data reside inside a single LAPHROAIG object.
      • Output is one of the following:
          – A new LAPHROAIG object.
          – Streamed back.
      • Convert-type commands convert a single LAPHROAIG object from popular object serialization formats to the nested columnar Dremel format, or vice versa.
      • Query-type commands process LAPHROAIG objects in the nested columnar Dremel format and can store the result in another object, or convert it to popular object serialization formats and stream it back to the user.
      • A LAPHROAIG object is a container of other “serialized objects” or “columnar encoded objects”. The two types of objects are not to be confused.
      • Just five use cases:
          – Convert “serialized objects” into “columnar encoded objects”.
          – Convert “columnar encoded objects” into “serialized objects”.
          – Query “columnar encoded objects” with BQL, producing “serialized objects” and streaming them back to the caller.
          – Query “columnar encoded objects” with BQL, producing “serialized objects” and saving them as a new LAPHROAIG “container” object.
          – Query “columnar encoded objects” with BQL, producing “columnar encoded objects” and saving them as a new LAPHROAIG “container” object.
  • 3. Use case #1: Convert serialized objects into columnar-encoded objects
      [Diagram labels: Convert Command; Hierarchical Schema; Serialized objects (Protobuf, Avro, Thrift); Metaxa.jar; LAPHROAIG; columnar-encoded objects (Tablet)]
  • 4. Use case #2: Convert columnar-encoded objects into serialized objects
      [Diagram labels: Convert Command; columnar-encoded objects (Tablet); Metaxa.jar; LAPHROAIG; Hierarchical Schema; Serialized objects (Protobuf, Avro, Thrift)]
  • 5. Use case #3: Query “columnar encoded objects” with BQL producing “serialized objects” and streaming them back to the caller
      [Diagram labels: BQL Query; columnar-encoded objects (Tablet); Metaxa.jar; LAPHROAIG; Hierarchical Schema; Serialized objects (Protobuf, Avro, Thrift)]
  • 6. Use case #4: Query “columnar encoded objects” with BQL producing “serialized objects” and saving them
      [Diagram labels: BQL Query; columnar-encoded objects (Tablet); Metaxa.jar; LAPHROAIG; Hierarchical Schema; Serialized objects (Protobuf, Avro, Thrift)]
  • 7. Use case #5: Query “columnar encoded objects” with BQL producing “columnar encoded objects” and saving them
      [Diagram labels: BQL Query; columnar-encoded objects (Tablet); Metaxa.jar; LAPHROAIG; columnar-encoded objects (Tablet)]
  • 8. SerObjs – Serialized Objects
      • Data produced by serializing objects with Protobuf, Avro or Thrift.
      • Hierarchical data.
      • Flat data such as CSV.
      • RDBMS-originated data.
      • Data from KV-stores and document stores.
      • Logs.
      • The schema may be embedded or provided separately.
  • 9. Tablet – Columnar-encoded objects
      • Immutable chunk of data.
      • Logically composed of Slices and can be turned into a Slice series.
      • Columnar and Dremel-encoded.
      • Consists of a header (called the Tablet Schema) and multiple {byte, word, dword or qword}-streams.
      • The Tablet Schema describes:
          – Tablet columns (multi-dimensional arrays), including metadata, compression and encoding metadata, and references to associated dictionaries, rep & def levels, etc.
          – The original SerObjs schema and its mapping to tablet columns.
          – Future: additional SerObjs schemas and mappings.
      • Tablet data are a set of multidimensional arrays of 8-, 16-, 32- or 64-bit elements, denoted byte (b), word (w), double word (dw) and quad word (qw). Each array represents a column and can be accessed independently without incurring access costs for neighboring arrays. Every element is a bit-field whose bits carry different information, for example (multiple) column values, counts (RLE), and rep and def levels.
      • A tablet scanner can mask some of the details of column encoding and provide a higher-level interface to the tablet, automatically decoding RLE, dictionaries, and rep & def levels. However, the tablet binary format is a stable interface between Metaxa modules and between different versions of the OpenDremel system.
      • Tablets are horizontal partitions of a larger columnar dataset.
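To make the "every element is a bit-field" idea concrete, here is a minimal sketch that packs a column value, an RLE count, and rep and def levels into one 32-bit dword. The field widths and their order (16-bit value, 8-bit count, 4-bit rep, 4-bit def) and the class name `DwordElement` are assumptions made purely for illustration; the slides do not specify the actual tablet layout.

```java
/**
 * Hypothetical bit layout, NOT the real Metaxa tablet format:
 * bits 0-15 value, bits 16-23 RLE count, bits 24-27 rep level, bits 28-31 def level.
 */
final class DwordElement {
    private DwordElement() {}

    // Rejects values that would not fit into the field reserved for them.
    private static int check(int v, int max, String name) {
        if (v < 0 || v > max) {
            throw new IllegalArgumentException(name + " out of range: " + v);
        }
        return v;
    }

    /** Packs the four fields into a single 32-bit element. */
    static int pack(int value, int count, int rep, int def) {
        return check(value, 0xFFFF, "value")
             | (check(count, 0xFF, "count") << 16)
             | (check(rep, 0xF, "rep") << 24)
             | (check(def, 0xF, "def") << 28);
    }

    static int value(int element) { return element & 0xFFFF; }

    static int count(int element) { return (element >>> 16) & 0xFF; }

    static int rep(int element) { return (element >>> 24) & 0xF; }

    // Unsigned shift: the def field occupies the sign bit.
    static int def(int element) { return (element >>> 28) & 0xF; }
}
```

A scanner that "masks" the encoding, in the sense of the last bullets above, would call accessors like these internally and hand plain integers to its caller.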
  • 10. Slice – Columnar-encoded object fraction
      • A Slice is a vector (ordered list of scalars) in which each scalar corresponds to the current value of a different tablet column that is being scanned / iterated.
      • A Tablet can be broken down into an ordered list of slices and composed back from a series of slices.
      • A Slice in Metaxa contains plain integer values (not bit-fields) of type b, w, dw and qw.
      • A Slice may contain fewer values than there are columns in the tablet. In this case the columns represented in the slice are called “projected columns”.
      • A Slice also contains an additional integer field called Level. This Level is aliased as FetchLevel or SelectLevel, depending on whether the Tablet is being sliced into a series of slices or being reconstructed from a series of slices.
  • 11. Query Plan (QP)
      • A QP is a descriptor of a source tablet, a result tablet, a set of scalar transformations, and a DAG of their dataflow interconnections.
      • Scalar transformations are of one of the following types:
          – Plain transformations => also called expressions; many inputs but one output.
          – Predicates => boolean expressions which, when evaluating to false, cancel the issuance of the result slice.
          – Aggregates => Count, Sum and Distinct functions; they aggregate slices and, when the last slice in an aggregation group is detected, issue multiple result slices.
      • QP input and output is always a slice. Because of predicates, it is possible that for some input slices no output slice is issued. Because of aggregates, it is also possible that for one input slice multiple output slices are issued.
      • Input slices contain FetchLevel and output slices contain SelectLevel (according to appendix D in the paper).
  • 12. Conceptual View of Tablet
      [Diagram: Record [0] through Record [5] laid out across levels (dimensions) 0, 1 and 2]
  • 13. Conceptual View of Tablet Slicing
      [Diagram: Record [0] across levels (dimensions) 0, 1 and 2, broken into slices [0][0][0], [0][0][1], [0][0][2], [0][1][0], [0][1][1] and [0][2][2]]
  • 14. Conceptual View of QP
      [Diagram: Record [0] and Record [1] across levels (dimensions) 0, 1 and 2, with slices [0][0][0] and [0][1][1] feeding Expr (rep=0), Expr (rep=1) and Expr (rep=2)]
  • 15. Compiler: translates BQL into a Query Plan
      Requirements:
          – Must parse and compile valid BQL as defined by BigQuery.
          – Must not accept invalid BQL, and must supply user-friendly messages.
          – Must produce an executable QP object with the following features:
              • It is Serializable => without circular references and without references to “system” objects like file handles; a pure object model.
              • getProcessSliceSource => returns the processSlice text in Java source-code form.
              • getSourceTablets => returns the tablets to run the QP on.
              • setResultTablet => sets the result tablet.
              • setExecutionStatusCode => indicates the status of QP execution.
              • log => allows logging important events during QP execution.
              • getDiagram => returns a graphic image of the QP diagram (for debugging).
          – Must provide basic command-line argument functionality as well as simple shell functionality.
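The QP object requirements above can be sketched as a plain serializable Java class. The method names come from the slide; everything else is an assumption for illustration: the class name `QueryPlanSketch`, representing tablets as `String` names, the status code as an `int`, the extra getters, and the `roundTrip` helper. `getDiagram` is left out because the slide does not say what image type it returns.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical QP object: plain data only, no cycles, no file handles. */
final class QueryPlanSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String processSliceSource;        // generated Java source text
    private final ArrayList<String> sourceTablets;  // tablet names (assumed to be strings)
    private final ArrayList<String> events = new ArrayList<>();
    private String resultTablet;
    private int executionStatusCode;

    QueryPlanSketch(String processSliceSource, List<String> sourceTablets) {
        this.processSliceSource = processSliceSource;
        this.sourceTablets = new ArrayList<>(sourceTablets);
    }

    String getProcessSliceSource() { return processSliceSource; }
    List<String> getSourceTablets() { return new ArrayList<>(sourceTablets); }
    void setResultTablet(String tablet) { this.resultTablet = tablet; }
    String getResultTablet() { return resultTablet; }
    void setExecutionStatusCode(int code) { this.executionStatusCode = code; }
    int getExecutionStatusCode() { return executionStatusCode; }
    void log(String event) { events.add(event); }
    List<String> getLog() { return new ArrayList<>(events); }

    /** Serializes the plan to bytes and back, as a compiler-to-executor handoff might. */
    static QueryPlanSketch roundTrip(QueryPlanSketch plan) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(plan);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (QueryPlanSketch) in.readObject();
            }
        } catch (Exception e) {
            throw new IllegalStateException("plan is not serializable", e);
        }
    }
}
```

Keeping every field a string, an int, or a list of strings is what makes the "pure object model" requirement easy to satisfy: there is nothing in the object graph that cannot be written to a stream.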
  • 16. Vocabulary (Compiler)
      • Token – lexeme
      • Parse tree – token tree
      • AST – Abstract Syntax Tree
      • SM – Semantic Model
      • ASM – Annotated Semantic Model
      • QP – Query Plan
      • DAG – Directed Acyclic Graph
      • Schema – metadata about a dataset
  • 17. Prerequisite Materials (Compiler)
      – http://code.google.com/apis/bigquery/docs/query-reference.html
      – http://www.antlr.org/
      – http://en.wikipedia.org/wiki/Parsing
      – http://en.wikipedia.org/wiki/Query_plan
      – http://en.wikipedia.org/wiki/Compiler_construction
      – http://www.amazon.com/Terence-Parr/e/B001JS3O0U
  • 18. High-Level Design, verbose (Compiler)
      [Diagram labels, as far as they can be recovered from the flattened figure:]
      • Inputs: command line arguments / shell; shell input BQL; SerObjs Schema.
      • Antlr Parser, producing the AST.
      • Semantic Parser, producing the SM (Semantic Model).
      • Semantic Analyzer (validation, resolving references, result schema inference, optimization), producing the Annotated Semantic Model (a Java object model implemented via Java collections).
      • Result SerObjs Schema Generator, producing the Result SerObjs Schema.
      • QP Generator, producing the Query Plan (includes ResultTablet metadata).
      • Supporting inputs: Metadata (file locations and statistics); Optimization Rules; Validation Rules; C / asm Template.
  • 19. [Annotated] Semantic Model (Compiler)
      • Comprehensively describes the query in every detail.
      • Java objects (packed into collections, without spaghetti cyclic references).
      • Must be serializable to a file with the SerObjLib framework, and restorable.
      • Must be printable in a form a human can understand.
      • Must be renderable on request into a clear graphic diagram with a legend.
  • 20. QP: Scalar Transformation functions (Expr) (Compiler)
      • A set of primitive predefined scalar operations and functions applied to xfunc arguments in a particular prescribed order.
      • Expressed in valid C or assembly, with some restrictions.
      • Purely functional => side-effect free, meaning no static/global variables and no memory allocations. However, for performance and brevity they are inlined into a single processSlice function.
      • Some functions have a context object in which they can store their externalized state between calls. One regular array and one associative array are provided as context for these functions.
          – Context-free transformation functions
              • One value in, one value out, e.g. a+b
          – Scalar context transformation functions
              • Many values in, many values out, e.g. sum(a) within links
          – Map context transformation functions
              • Many values in, many values out (out of sync), e.g. sum(a) group by date
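The three kinds of transformation functions can be illustrated with a short sketch in which all state is externalized into a caller-owned context, so the functions themselves keep no static or global variables. The class and method names are hypothetical, and using a `long[]` for the regular array and a `java.util.Map` for the associative array is an assumption; the slides only say that one of each is provided. (A `HashMap` allocates memory, which the real restrictions would rule out; it is used here only to keep the example short.)

```java
import java.util.Map;

/** Hypothetical xfuncs; all state lives in the context passed in by the caller. */
final class XfuncSketch {
    private XfuncSketch() {}

    /** Context-free: one value in, one value out (a + b). */
    static long add(long a, long b) {
        return a + b;
    }

    /** Scalar context: a single accumulator kept in slot 0 of a regular array. */
    static long sum(long[] context, long a) {
        context[0] += a;
        return context[0];
    }

    /** Map context: one accumulator per group key, as in sum(a) group by date. */
    static long sumByKey(Map<Long, Long> context, long key, long a) {
        // merge() stores a for a new key, or adds it to the existing total.
        return context.merge(key, a, Long::sum);
    }
}
```

Because the context is passed in explicitly, the same function bodies can be inlined into one generated processSlice function without changing their behaviour.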
  • 21. QP in C Form (Compiler)
      • Generated ProcessSlice(..){..} function.
          – Input: inSlice
          – Output: outSlice
          – Context object for state externalization
      • inSlice contains scalar values for every source function, and also fetchLevel.
      • outSlice must have correct scalar values for every result function, and also the correct selectLevel.
          – outSlice is guaranteed to preserve its content between calls, so it can be used as a cache for result functions that have not changed, and also as a cache for selectLevel if it has not changed.
          – outSlice values can also be read (it contains the results of the previous outSlice).
          – On the first call, all values in outSlice are guaranteed to be zeros.
  • 22. QP template, according to appendix D (Compiler)
      void processSlice(inSlice, outSlice, Context)
      {
          Evaluate where clause…; if it evaluates to false then do:
              outSlice.setSkip;
              outSlice.selectLevel = min(outSlice.selectLevel, inSlice.fetchLevel);
              return;
          If where clause evaluates to true then…
          switch(inSlice.fetchLevel)
          {
              case 0: Evaluate expressions (xfuncs) with repetition level = 0
              ……..
              ……..
              case n: Evaluate expressions (xfuncs) with repetition level = n
              If it is the last slice in the aggregation group then:
                  // the line below will cause additional calls to ProcessSlice
                  outSlice.setAdditionalSliceCount( Number of slices in aggregation
          }
      }
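A runnable Java rendition of the template, for one made-up query, might look like the sketch below. Several things here are assumptions rather than anything the slides state: the query itself (`SELECT a * 2, a + b WHERE b >= 0` with expressions at repetition levels 0 and 1), the class and field names, the driver resetting `selectLevel` to a sentinel after each emitted slice, and the choice to switch on the accumulated `selectLevel` rather than on the raw `fetchLevel`. That last choice is a deliberate deviation from the template, made so that an expression whose slice was skipped by the where clause is still recomputed. Aggregation is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class ProcessSliceSketch {
    static final int NO_LEVEL = Integer.MAX_VALUE; // sentinel set by the driver (assumption)

    static final class InSlice {
        final int fetchLevel;
        final long a;
        final long b;
        InSlice(int fetchLevel, long a, long b) {
            this.fetchLevel = fetchLevel;
            this.a = a;
            this.b = b;
        }
    }

    static final class OutSlice {
        final long[] values = new long[2]; // zeros on the first call, preserved between calls
        int selectLevel;                   // zero on the first call
        boolean skip;
    }

    /** Hypothetical query: SELECT a * 2 (rep level 0), a + b (rep level 1) WHERE b >= 0. */
    static void processSlice(InSlice in, OutSlice out) {
        // Track the lowest level seen since the last emitted slice.
        out.selectLevel = Math.min(out.selectLevel, in.fetchLevel);
        if (in.b < 0) {            // where clause is false: emit nothing
            out.skip = true;
            return;
        }
        out.skip = false;
        switch (out.selectLevel) {
            case 0:
                out.values[0] = in.a * 2;    // rep level 0 expression
                // fall through: deeper levels must be refreshed too
            case 1:
                out.values[1] = in.a + in.b; // rep level 1 expression
                break;
            default:
                throw new IllegalStateException("unexpected level " + out.selectLevel);
        }
    }

    /** Driver: returns {selectLevel, expr0, expr1} for every emitted slice. */
    static List<long[]> run(List<InSlice> input) {
        OutSlice out = new OutSlice();
        List<long[]> emitted = new ArrayList<>();
        for (InSlice in : input) {
            processSlice(in, out);
            if (!out.skip) {
                emitted.add(new long[] {out.selectLevel, out.values[0], out.values[1]});
                out.selectLevel = NO_LEVEL; // reset once the slice has been consumed
            }
        }
        return emitted;
    }
}
```

In this sketch a slice at level 1 reuses the cached level 0 result left in `outSlice` by an earlier call, which is the caching behaviour the previous slide describes.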
  • 23. Columnar Abstraction (Tableton)
      • Tableton is a set of sequentially-accessed multidimensional scalar arrays.
      • A Tablet is a serialized, Dremel-encoded columnar dataset of fixed size. Each array in a tablet can be independently serially accessed without incurring the cost of buffering neighboring arrays.
      • Four types of arrays: bytes, words (16b), dwords (32b), qwords (64b).
      • The following operations are defined:
          – Parsing Tablet Schema => reading and parsing the tablet header/metadata, also called the tablet schema, and providing an object model for it.
          – Reading => converting a Tablet to SerObjs, using an FSM for better performance as described in the Dremel paper (calling callback functions to let them construct SerObjs in various formats).
          – Slicing => synchronized multi-array scalar iteration over a Tablet.
          – Building Tablet Schema => creating the tablet header/metadata, also called the tablet schema, with a convenient builder API. Also called the TabletSchema Editor.
          – Construction => re-creating a Tablet from slices; this interface is also used for dissecting SerObjs into a tablet.
          – Compaction => the constructed Tablet is compressed and a hash key is generated for it; from that point on it becomes immutable.
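The Construction and Reading operations rest on the repetition and definition levels from the Dremel paper. As a minimal sketch, the code below stripes records that hold a single top-level repeated integer field (maximum repetition level 1, maximum definition level 1) into one column and reassembles them. The class and method names are hypothetical, and the reassembly is a plain loop, not the FSM the slide refers to.

```java
import java.util.ArrayList;
import java.util.List;

final class StripingSketch {
    private StripingSketch() {}

    /** One column entry: value (meaningless when def == 0), rep level, def level. */
    static final class Entry {
        final long value;
        final int rep;
        final int def;
        Entry(long value, int rep, int def) {
            this.value = value;
            this.rep = rep;
            this.def = def;
        }
        @Override public String toString() {
            return value + "/r" + rep + "/d" + def;
        }
    }

    /** Dissects records with one repeated field into a single column. */
    static List<Entry> stripe(long[][] records) {
        List<Entry> column = new ArrayList<>();
        for (long[] record : records) {
            if (record.length == 0) {
                // A NULL entry (def = 0) keeps the record boundary visible.
                column.add(new Entry(0, 0, 0));
                continue;
            }
            for (int i = 0; i < record.length; i++) {
                // rep = 0 starts a new record, rep = 1 continues the list.
                column.add(new Entry(record[i], i == 0 ? 0 : 1, 1));
            }
        }
        return column;
    }

    /** Rebuilds the records from a well-formed column. */
    static List<List<Long>> assemble(List<Entry> column) {
        List<List<Long>> records = new ArrayList<>();
        for (Entry e : column) {
            if (e.rep == 0) {
                records.add(new ArrayList<Long>());
            }
            if (e.def == 1) {
                records.get(records.size() - 1).add(e.value);
            }
        }
        return records;
    }
}
```

For example, the records `{10, 20}`, `{}` and `{30}` stripe into four entries, the third being the NULL placeholder for the empty record, and `assemble` turns them back into the same three records.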
  • 24. What about other datatypes? (Tableton)
      • They are mapped into yet another dimension of the scalar array.
      • It is strongly recommended not to use Java strings. They are impossible to work with without incurring the full cost of object lifecycle management.
      • It is OK not to support them at all at first, and then gradually add support for them.
      • All the Java String class goodies will in any case be impossible to support in Metaxa because of performance.
      • The same goes for BLOBs, images and any other complex data type. All are mapped to yet another dimension of the scalar array.
  • 25. Hierarchical vs. Columnar (Tableton)
      • Different abstractions / domains / contexts.
      • Different schemas.
      • Most confusion stems from not differentiating!
      • Always keep the context in mind when you are developing…
      • Don’t think about both at the same time unless you are willing to develop schizophrenia.
      • Columnar is not an implementation artifact of hierarchical. Columnar is a whole new model in its own right.
      • We must adopt two different vocabularies for these domains. Confusion is notoriously common here.
  • 26. Hierarchical vs. Columnar (Tableton)
      Hierarchical                       | Columnar
      A SerObjs in our lingo             | A Tablet in our lingo
      Protobuf, Avro, Thrift files       | Dremel-generated tablets
      Serialized objects                 | Multi-dimensional arrays
      The only user-level abstraction    | User never knows what it is
      BQL queries written against it     | Query plans executed against it
      More frontend-related              | More backend-related
      More logical / external format     | More physical / internal format
      Hierarchical is queried            | Columns are scanned
      SerObjLib component                | Tableton component
  • 28. Executes QP against tablets (Executor)
      • Requirements:
          – Must convert the QP into executable bytecode and execute it (not interpret it).
          – Must work with the QP in object-model form, but initially compiling and running the QP in Java form will suffice.
          – Must not mask data and task parallelism.
              • Data parallelism at the tablet level and also at the column level within a tablet.
              • Task parallelism across separate QP transformation functions.
          – Must be ultra-high performance.
              • Latency overhead within a few milliseconds (assuming data in RAM).
              • Throughput of multiple GB/sec.
  • 29. Vocabulary (Executor)
      • QP – Query Plan
      • DAG – Directed Acyclic Graph
      • Slot – like a thread (todo)
      • Expression – operator tree on scalar arguments and scalar constants
      • CF – Context Free (stateless scalar expression)
      • FC – Fixed-size Context (scalar expression with an accumulator)
      • VC – Variable-size Context (scalar expression with a growing list of accumulators)
  • 30. Code generation (Executor)
      • [todo] Janino!
      • [todo] Explain dynamic Java code generation and compilation.
      • [todo] Use code templates! No classes or functions, just a code listing with labels and jumps. The generated code is different every time, and no one is going to study it. Put the static portions in a library and pre-compile it regularly. The dynamic portion is just a code snippet.
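The template idea can be sketched as follows: a static source template with one placeholder for the dynamic snippet, compiled and loaded at run time. The slide names Janino for this job; to stay within the JDK, the sketch swaps in the standard `javax.tools.JavaCompiler` instead, so it shows the approach and not Janino's API. The names `CodegenSketch` and `GeneratedExpr`, the two-argument `eval` signature, and the use of a temporary directory are all assumptions made for illustration.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class CodegenSketch {
    private CodegenSketch() {}

    // Static portion of the code; only the expression changes per query.
    private static final String TEMPLATE =
          "public final class GeneratedExpr {\n"
        + "    public static long eval(long a, long b) { return %s; }\n"
        + "}\n";

    /** Fills the dynamic snippet into the template. */
    static String generate(String expression) {
        return String.format(TEMPLATE, expression);
    }

    /** Compiles the generated source with the JDK compiler and invokes it. */
    static long compileAndRun(String expression, long a, long b) {
        try {
            Path dir = Files.createTempDirectory("qp-codegen");
            Path src = dir.resolve("GeneratedExpr.java");
            Files.write(src, generate(expression).getBytes(StandardCharsets.UTF_8));

            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) { // happens on a JRE-only installation
                throw new IllegalStateException("a JDK is required");
            }
            int status = compiler.run(null, null, null, "-d", dir.toString(), src.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed");
            }
            try (URLClassLoader loader =
                     new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
                Method eval = loader.loadClass("GeneratedExpr")
                                    .getMethod("eval", long.class, long.class);
                return (Long) eval.invoke(null, a, b);
            }
        } catch (Exception e) {
            throw new IllegalStateException("code generation failed", e);
        }
    }
}
```

Calling `CodegenSketch.compileAndRun("a * b + 1", 6, 7)` compiles a fresh class and evaluates the expression as real bytecode, which is the "execute, not interpret" requirement from the Executor slide. An in-memory compiler such as Janino would avoid the temporary files.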
  • 31. Thanks (a sneak preview of future versions follows in the next slides)
  • 32. The overall vision for OpenDremel
      • An interactive data cloud platform for managing high volumes of static data in the form of serialized objects.
      • Compatible with Google tools such as BigQuery, Prediction API, Fusion Tables, Google Storage, etc.
      • Aggressively use existing open-source software, preferably Apache-licensed, to quickly “implement” the desired functionality.
  • 33. Features Backlog
      • Processing compressed data directly without decompressing.
      • Macro parallelism: 1) multithreading 2) multi-process 3) multi-node 4) massive clustering.
      • Micro parallelism: 1) SSE & AVX 2) OpenCL 3) better machine code to leverage ILP 4) light threads for parallel processing of a single tablet 5) LLVM 6) special hardware: GPU & Tilera.
      • Interactive joins and indexing support, zone maps, and global system-recognized dimensions such as time, geography, IP.
      • Advanced analytics, statistics and machine learning capabilities.
      • Richer SerObjLib, more formats.
      • Advanced visualization and streaming.
      • Batch data-crunching and map-reduce support.
      • Multi-tenancy, resource control, metering and accounting.
      • CEP capabilities, fast lookups, and querying of data that is not yet packed into tablets.
      • User-defined functions.
      • Scratch tables and rolling queries.