Optimizing HBase scanner performance


Mikhail Bautin
Software Engineer
01/19/2012
HBase Scanners
What happens on a Get

                              RegionScanner

       ColumnFamily1                               ColumnFamily2

        StoreScanner              ...              StoreScanner        Store = (Region, CF)

StoreFileScanner    ...    StoreFileScanner        StoreFileScanner
(R1,C1,T3)                 (R1,C1,T1)              (R2,C2,T1)
(R1,C2,T2)                 (R1,C2,T3)              . . .
(R1,C2,T1)                 (R2,C1,T2)
HBase Scanner State
What happens on a next()

                              RegionScanner
                             (Priority Queue)

       ColumnFamily1                               ColumnFamily2

        StoreScanner              ...              StoreScanner        Store = (Region, CF)
      (Priority Queue)                           (Priority Queue)

StoreFileScanner    ...    StoreFileScanner        StoreFileScanner
Current KeyValue           Current KeyValue        Current KeyValue
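The priority-queue merge in the diagram can be illustrated with a minimal Python sketch. It is not HBase code: `FileScanner` and `merged_scan` are hypothetical names, timestamps T1..T3 are modeled as integers, and the ordering assumed is row, then column, then newest timestamp first.

```python
import heapq


def sort_key(kv):
    """Order KeyValues by row, then column, then newest timestamp first."""
    row, col, ts = kv
    return (row, col, -ts)


class FileScanner:
    """Hypothetical stand-in for a StoreFileScanner over one sorted file."""

    def __init__(self, kvs):
        self._kvs = sorted(kvs, key=sort_key)
        self._pos = 0

    def peek(self):
        """Current KeyValue, or None at end of file."""
        return self._kvs[self._pos] if self._pos < len(self._kvs) else None

    def next(self):
        """Advance to the next KeyValue (in a real store this may cost disk I/O)."""
        self._pos += 1


def merged_scan(scanners):
    """Yield KeyValues from all scanners in global key order using a heap."""
    heap = []
    for i, scanner in enumerate(scanners):
        if scanner.peek() is not None:
            heapq.heappush(heap, (sort_key(scanner.peek()), i))
    while heap:
        _key, i = heapq.heappop(heap)
        scanner = scanners[i]
        yield scanner.peek()
        scanner.next()
        if scanner.peek() is not None:
            heapq.heappush(heap, (sort_key(scanner.peek()), i))


# The three files from the "What happens on a Get" slide.
scanners = [
    FileScanner([("R1", "C1", 3), ("R1", "C2", 2), ("R1", "C2", 1)]),
    FileScanner([("R1", "C1", 1), ("R1", "C2", 3), ("R2", "C1", 2)]),
    FileScanner([("R2", "C2", 1)]),
]
merged = list(merged_scan(scanners))
```

Each scanner contributes only its current KeyValue to the heap, so the merge never looks further ahead in a file than it has to.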
Avoiding next() on StoreFileScanner
Every next() call may result in disk I/O
▪   HBASE-4433: avoid extra next if done with row/column
    (Kannan)
    ▪   An optimization for queries specifying a column set
    ▪   INCLUDE_AND_SEEK_NEXT_COL
    ▪   INCLUDE_AND_SEEK_NEXT_ROW

▪   HBASE-4434: Don't do HFile Scanner next() unless the next
    KV is needed (Kannan)
    ▪   Avoid aggressive pre-fetching
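As a rough illustration of the HBASE-4433 idea, the sketch below returns a combined "include and seek" hint once a column has yielded all the versions the query wants, so the caller does not need an extra next() just to discover that it is done. Only the two `INCLUDE_AND_SEEK_*` names come from the slide; the `match` function, its parameters, and the decision logic are assumptions made for illustration.

```python
from enum import Enum


class MatchCode(Enum):
    SKIP = "SKIP"
    INCLUDE = "INCLUDE"
    INCLUDE_AND_SEEK_NEXT_COL = "INCLUDE_AND_SEEK_NEXT_COL"
    INCLUDE_AND_SEEK_NEXT_ROW = "INCLUDE_AND_SEEK_NEXT_ROW"


def match(col, versions_already_included, wanted_cols, max_versions=1):
    """Decide what to do with the current KeyValue of column `col` (hypothetical)."""
    if col not in wanted_cols:
        return MatchCode.SKIP
    if versions_already_included + 1 < max_versions:
        # More versions of this column are still wanted.
        return MatchCode.INCLUDE
    if col == max(wanted_cols):
        # Last requested column: nothing else in this row is needed.
        return MatchCode.INCLUDE_AND_SEEK_NEXT_ROW
    return MatchCode.INCLUDE_AND_SEEK_NEXT_COL


first = match("C1", 0, ["C1", "C3"])
last = match("C3", 0, ["C1", "C3"])
```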
Simple ROWCOL Bloom Filters
Do we have to read all of these files?
Query: (R1, C3)
File 1                            File 2                            File 3
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
R1    C1    T2                    R1    C1    T3                    R1    C1    T4
R1    C1    T1                    R1    C2    T3                    R1    C2    T2
R1    C2    T1                    R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1
Simple ROWCOL Bloom Filters
In some cases, we only have to read one file
Query: (R1, C3)
File 2
Row   Col   TS
R1    C1    T3
R1    C2    T3
R1    C3    T2
R2    C1    T2
R2    C2    T3
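A toy sketch of how a per-file ROWCOL Bloom filter lets a point query skip files. This is an illustration under assumptions, not HBase's implementation: the `BloomFilter` class, its bit and hash counts, and the `row/col` key encoding are all made up for the example, and timestamps T1..T4 are modeled as integers.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes are arbitrary, not HBase defaults."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def rowcol_bloom(kvs):
    """Build a filter keyed on (row, column), ignoring timestamps."""
    bloom = BloomFilter()
    for row, col, _ts in kvs:
        bloom.add(f"{row}/{col}")
    return bloom


def files_to_read(blooms, row, col):
    """Names of files that cannot be ruled out for (row, col)."""
    return [name for name, bloom in blooms.items()
            if bloom.might_contain(f"{row}/{col}")]


# The three files from the slide.
files = {
    "file1": [("R1", "C1", 2), ("R1", "C1", 1), ("R1", "C2", 1), ("R2", "C1", 1),
              ("R2", "C2", 2), ("R2", "C2", 1), ("R2", "C3", 1)],
    "file2": [("R1", "C1", 3), ("R1", "C2", 3), ("R1", "C3", 2), ("R2", "C1", 2),
              ("R2", "C2", 3)],
    "file3": [("R1", "C1", 4), ("R1", "C2", 2), ("R2", "C1", 1)],
}
blooms = {name: rowcol_bloom(kvs) for name, kvs in files.items()}
candidates = files_to_read(blooms, "R1", "C3")
```

`candidates` always contains `file2`, the only file holding (R1, C3). The other files appear only if the filter produces a false positive, which a Bloom filter can do; it never produces a false negative.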
Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries
Query: C1 and C3 in all rows
File 1                            File 2                            File 3
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
R1    C1    T2                    R1    C1    T3                    R1    C1    T4
R1    C1    T1                    R1    C2    T3                    R1    C2    T2
R1    C2    T1                    R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1
Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries
Query: C1 and C3 in all rows; seek to (R1, C1)
File 1                            File 2                            File 3
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
R1    C1    T1                    R1    C2    T3                    R1    C2    T2
R1    C2    T1                    R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1
Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries
Query: C1 and C3 in all rows; seek to (R1, C3)
File 1                            File 2                            File 3
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R1, end of C3)                                           Fake key: (R1, end of C3)
Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries
Query: C1 and C3 in all rows; seek to (R2, C1)
File 1                            File 2                            File 3
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2
                                  R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

                                  (R2, C1, T2) wins by timestamp
Multi-column Bloom Filters (HBASE-2794)
ROWCOL Bloom filters for multi-column queries
Query: C1 and C3 in all rows; seek to (R2, C3)
File 1                            File 2                            File 3
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2
                                  R2    C1    T2
R2    C3    T1

(R2, C3, T1)                      Fake key: (R2, end of C3)         Fake key: (R2, end of C3)
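The "fake key" steps above can be sketched as follows. When a file's ROWCOL Bloom filter rules out the requested (row, column), the scanner can be positioned on a key that sorts after every real version of that column, with no disk access. The function names and the use of infinity as the "end of column" marker are assumptions for this sketch; timestamps are modeled as integers, newest first.

```python
import math


def sort_key(row, col, ts):
    """Row, then column, then newest timestamp first."""
    return (row, col, -ts)


def end_of_column_key(row, col):
    """Fake key that sorts after every real version of (row, col)."""
    return (row, col, math.inf)


def seek_with_bloom(may_contain, row, col, disk_seek):
    """Return (current_key, did_disk_seek) for one file scanner.

    may_contain: the Bloom filter's answer for (row, col) in this file.
    disk_seek:   callable performing the real seek (hypothetical).
    """
    if not may_contain:
        return end_of_column_key(row, col), False
    return disk_seek(row, col), True


disk_calls = []


def fake_disk_seek(row, col):
    # Stand-in for a real seek; records that it was called.
    disk_calls.append((row, col))
    return sort_key(row, col, 2)


skipped_key, skipped_did_seek = seek_with_bloom(False, "R1", "C3", fake_disk_seek)
real_key, real_did_seek = seek_with_bloom(True, "R1", "C3", fake_disk_seek)
```

Because the fake key sorts after all real versions of (R1, C3) but before any key in a later column or row, the scanner holding it only reaches the top of the priority queue once the query has moved past that column.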
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
R1    C1    T2                    R1    C1    T3                    R1    C1    T4
R1    C1    T1                    R1    C2    T3                    R1    C2    T2
R1    C2    T1                    R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R1, C1, T2)            Fake key: (R1, C1, T3)            Fake key: (R1, C1, T4)
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
R1    C1    T2                    R1    C1    T3                    R1    C1    T4
R1    C1    T1                    R1    C2    T3                    R1    C2    T2
R1    C2    T1                    R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R1, C1, T2)            Fake key: (R1, C1, T3)            (R1, C1, T4)
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R1, C3, T2)            Fake key: (R1, C3, T3)            Fake key: (R1, C3, T4)
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R1, C3, T2)            Fake key: (R1, C3, T3)            (R2, C1, T1)
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R1, C3, T2)            (R1, C3, T2) is next              (R2, C1, T1)
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2                    R2    C1    T1
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R2, C1, T2)            Fake key: (R2, C1, T3)            (R2, C1, T1)

The File 2 fake key (R2, C1, T3) is to be selected next.
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R2, C1, T2)            (R2, C1, T2) wins by timestamp    (R2, C1, T1)
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R2, C3, T2)            Fake key: (R2, C3, T3)            Fake key: (R2, C3, T4)
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2
R2    C1    T1                    R2    C1    T2
R2    C2    T2                    R2    C2    T3
R2    C2    T1
R2    C3    T1

Fake key: (R2, C3, T2)            Real seek to (R2, C3, T3)         EOF
Lazy Seek (HBASE-4465)
Optimizing for reading recent data
File 1 (T1 – T2)                  File 2 (T2 – T3)                  File 3 (T1 – T4)
Row   Col   TS                    Row   Col   TS                    Row   Col   TS
                                                                    R1    C1    T4
                                  R1    C3    T2
                                  R2    C1    T2
R2    C3    T1

(R2, C3, T1)                      EOF                               EOF
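The lazy-seek walkthrough above can be approximated with a short Python sketch. It is a simplification under stated assumptions, not the HBASE-4465 implementation: `LazyFileScanner`, `make_scanners`, and `newest_version` are hypothetical names, timestamps T1..T4 are modeled as integers 1..4, and ties between a real key and a fake key are resolved in favor of the real key. Each scanner is first placed on a fake key (row, column, newest timestamp in its file); a real seek happens only when that fake key reaches the top of the heap.

```python
import bisect
import heapq


def sort_key(kv):
    """Row, then column, then newest timestamp first."""
    row, col, ts = kv
    return (row, col, -ts)


class LazyFileScanner:
    """Hypothetical file scanner that counts real (disk) seeks."""

    def __init__(self, kvs):
        self.kvs = sorted(kvs, key=sort_key)
        self.keys = [sort_key(kv) for kv in self.kvs]
        self.max_ts = max(ts for _row, _col, ts in kvs)
        self.real_seeks = 0

    def real_seek(self, row, col):
        """First real KeyValue at or after the newest (row, col), or None at EOF."""
        self.real_seeks += 1
        i = bisect.bisect_left(self.keys, (row, col, float("-inf")))
        return self.kvs[i] if i < len(self.kvs) else None


def newest_version(scanners, row, col):
    """Newest KeyValue for (row, col) across files, seeking lazily."""
    # Heap entries: (key, is_fake, scanner index, real KeyValue or None).
    heap = [(sort_key((row, col, s.max_ts)), True, i, None)
            for i, s in enumerate(scanners)]
    heapq.heapify(heap)
    while heap:
        _key, is_fake, i, kv = heapq.heappop(heap)
        if not is_fake:
            # A real key on top: every file still on a fake key is older.
            return kv if kv[:2] == (row, col) else None
        kv = scanners[i].real_seek(row, col)
        if kv is not None:
            heapq.heappush(heap, (sort_key(kv), False, i, kv))
    return None


def make_scanners():
    """The three files from the slides, with timestamp ranges 1-2, 2-3 and 1-4."""
    return [
        LazyFileScanner([("R1", "C1", 2), ("R1", "C1", 1), ("R1", "C2", 1),
                         ("R2", "C1", 1), ("R2", "C2", 2), ("R2", "C2", 1),
                         ("R2", "C3", 1)]),
        LazyFileScanner([("R1", "C1", 3), ("R1", "C2", 3), ("R1", "C3", 2),
                         ("R2", "C1", 2), ("R2", "C2", 3)]),
        LazyFileScanner([("R1", "C1", 4), ("R1", "C2", 2), ("R2", "C1", 1)]),
    ]


scanners = make_scanners()
result = newest_version(scanners, "R1", "C1")
seeks = [s.real_seeks for s in scanners]
```

For (R1, C1) only File 3 is actually seeked, because its real key (R1, C1, T4) is newer than anything the other two files could contain. The benefit depends on newer files tending to hold the answer, which is the "reading recent data" case the slides target.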
Top-of-the-row seek
Some applications do not use DeleteFamily
▪   We always seek to the top of the row first
    ▪   DeleteFamily comes before all columns, i.e. at (R1, empty column)
    ▪   Even if we only need (R1, C1), there might be a DeleteFamily for R1

▪   Some applications do not even use DeleteFamily
▪   Two fixes by Liyin Tang:
    ▪   Utilize existing ROWCOL Bloom filter (HBASE-4469)
    ▪   Added a separate ROW-only Bloom filter for DeleteFamily (HBASE-4532)
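A minimal sketch of the decision these fixes enable: if no DeleteFamily marker can exist for the row in a given file, the scanner may seek straight to the requested column instead of the top of the row. A plain Python set stands in for the ROW-only Bloom filter here, and the function name and the empty-string "top of row" column are assumptions for illustration.

```python
def first_seek_key(row, col, rows_with_delete_family):
    """Pick the (row, column) to seek to first in one file.

    rows_with_delete_family: stand-in for a ROW-only Bloom filter over
    DeleteFamily markers (a real filter may also report false positives).
    """
    if row in rows_with_delete_family:
        # A DeleteFamily may exist: it sorts before all columns of the row.
        return (row, "")
    return (row, col)


direct = first_seek_key("R1", "C1", rows_with_delete_family={"R7"})
top_of_row = first_seek_key("R7", "C1", rows_with_delete_family={"R7"})
```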
Seek on deleted KV (HBASE-4585)
What if the requested column has been deleted?
▪   We are requesting C1, C2, ..., Cn
▪   What if we see a delete marker for Ci?
▪   Previously, we would keep calling next()
▪   Now, we seek to (i + 1)’th requested column
    (also a fix by Liyin)
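The seek target after a delete marker can be sketched with a binary search over the sorted list of requested columns. The function name and column labels are hypothetical; the sketch only shows how to pick the next requested column rather than stepping through KeyValues with next().

```python
import bisect


def next_requested_column(requested_cols, deleted_col):
    """Return the requested column to seek to after a delete marker.

    requested_cols must be sorted. Returns None when the deleted column
    was the last one requested, i.e. the row is finished.
    """
    i = bisect.bisect_right(requested_cols, deleted_col)
    return requested_cols[i] if i < len(requested_cols) else None


requested = ["C1", "C3", "C7"]
target = next_requested_column(requested, "C3")
done = next_requested_column(requested, "C7")
```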
Data block read requests (dark launch)
Thu, Sep 15 – Sun, Sep 25 2011
Fri Sep 16th vs. Sep 23rd: 45% savings in logical block read requests (cache hits + misses)

Pushed on Tue Sep 20th:
• No extra next when done with column/row (HBASE-4433)
• No KV prefetch (HBASE-4434)
• Lazy Seek (HBASE-4465)
Data block read requests (dark launch)
Sun, Sep 25 – Mon, Oct 3 2011
Sun Sep 25th vs. Oct 2nd: 33% savings in logical block read requests (cache hits + misses)

Pushed on Fri Sep 30th:
• Avoid top-of-the-row seek (HBASE-4469, Liyin)
• Off-peak compactions (HBASE-4463, Karthik)
Data block cache misses (dark launch)
▪   20.6 K (Mon Sep 19th) -> 11.8 K (Mon Sep 26th) -> 9.8 K (Mon Oct 3rd)
▪   52% savings (42% and then 17% more)

Pushed on Tue Sep 20th:
• No next KV prefetch
• No next() when done with row/column
• Lazy Seek

Pushed on Fri Sep 30th:
• No top-of-the-row seek
• Off-peak compactions
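The savings figures follow from the three cache-miss counts; a short check of the arithmetic, using only the numbers on the slide (in thousands of misses):

```python
def savings(before, after):
    """Fractional reduction going from `before` to `after`."""
    return 1 - after / before


misses = [20.6, 11.8, 9.8]  # Sep 19th, Sep 26th, Oct 3rd

first = savings(misses[0], misses[1])   # about 0.427
second = savings(misses[1], misses[2])  # about 0.169
total = savings(misses[0], misses[2])   # about 0.524
```

The two steps compound rather than add: (1 - first) * (1 - second) equals 1 - total, which is why roughly 42% followed by roughly 17% gives roughly 52% overall.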
Avoid loading previous block (HBASE-4443)
We sometimes go to previous block on exact match
▪   Future work
▪   Suppose the first key of a block matches (Row, Column)
▪   But maybe there is an earlier key that would also match?
▪   We load the previous block to find out
▪   Possible fixes:
    ▪   Track deletes and optimize the MAX_VERSIONS=1 case
    ▪   Add last key in block to index (increases index size)
Top-of-the-column seek (HBASE-4962)
Some applications do not use DeleteColumn
▪   Future work
▪   DeleteColumn deletes all versions of a particular column
▪   Comes before all Puts for a (Row, Column)
▪   Slows down timestamp range queries
▪   Proposed solution:
    ▪   Add a (Row, Column) Bloom filter for DeleteColumn only
    ▪   Seek to (Row, Column, T2) for a [T1, T2] range query
(c) 2009 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved. 1.0
