HybridStore: An Efficient Data Management System for
          Hybrid Flash-based Sensor Devices

                       Baobing Wang and John S. Baras

                    Department of Electrical and Computer Engineering
                             Institute for Systems Research
                       University of Maryland, College Park, USA
                                    briankw@umd.edu

          10th European Conference on Wireless Sensor Networks (EWSN)


                                 February 14, 2013



  Brian (UMD@USA)                      HybridStore                      February 14, 2013   1 / 15
Motivation
       In-situ Data Storage on Sensor Motes
            Centralized data collection wastes energy (e.g., TinyDB)
                  LoCal project¹: 455 nodes, > 900M readings/year
            Only aggregated data are required: average noise level, peak power
            consumption, usage patterns
            Sensors store data locally: sensor database
            Flash memory: high capacity, energy efficient

 Figure: Per-byte cost: storage, computation and communication [Mathur’06]

  ¹ http://local.cs.berkeley.edu/
Motivation




   Design Challenges
        Unlike magnetic disks, no in-place updates on flash memories
        NOR flash: byte-oriented, random-accessible, low capacity
        NAND flash: page-oriented, high capacity, more energy-efficient
        Random writes are 100× more expensive than sequential writes
        Very limited RAM: 4KB to 10KB




Related Work




   Flash-based Storage Systems
        Only time-window queries: TL-Tree [Li’12], FlashLog [Nath’09]
        Large RAM footprint: FlashDB [Nath’07], LA-Tree [Agrawal’09]
        Antelope [Tsiftes’11]: NOR flash only, discrete values
        MicroHash [Lin’06]: long chain of partial pages, extensive page reads
        and writes, complex failure recovery
        No efficient support for joint queries; reliance on a global index




Contributions


    HybridStore Interface
         insert(float key, void* record, uint8_t length)
         select(uint32_t t1, uint32_t t2, float k1, float k2)
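
The two calls above can be read as "store a keyed record" and "return the records whose timestamp lies in [t1, t2] and whose key lies in [k1, k2]". The following is a minimal brute-force reference model of those semantics in Python, not the TinyOS implementation. The `NaiveStore` and `Record` names, the explicit `now` argument (standing in for the mote's clock, since `insert` takes no time argument), and the inclusive bounds are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    timestamp: int   # assumed: assigned by the store at insertion time
    key: float       # the sensed value used as the search key
    payload: bytes   # opaque record body


class NaiveStore:
    """Brute-force model of the insert/select semantics (no indexes at all)."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def insert(self, key: float, record: bytes, now: int) -> None:
        # `now` is a hypothetical stand-in for the mote's clock.
        self._records.append(Record(now, key, record))

    def select(self, t1: int, t2: int, k1: float, k2: float) -> List[Record]:
        # Joint query: time window AND key range, both treated as inclusive.
        return [r for r in self._records
                if t1 <= r.timestamp <= t2 and k1 <= r.key <= k2]


store = NaiveStore()
for t, temperature in enumerate([20.5, 21.0, 25.5, 19.0, 25.5]):
    store.insert(temperature, b"reading", now=t)

hits = store.select(1, 4, 21.0, 26.0)
print([(r.timestamp, r.key) for r in hits])
```

A linear scan like this touches every record on every query, which is the cost the segment and index structures on the following slides are meant to avoid.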

    HybridStore Features
         All NAND pages are fully occupied and written purely sequentially
         In-place updates and out-of-place writes are completely avoided
         Process typical joint queries efficiently, even on large-scale datasets
         Data aging without overhead
         Sensor-friendly: 16.5KB ROM and 3.2KB RAM in TinyOS 2.1
    Potential Applications
         Storage layer abstraction: Squirrel [Mottola’10]



HybridStore: Overview


   Partition the data stream into segments
   Create an in-segment index for each segment
   Create an inter-segment index to organize segments
   Benefits: skip unnecessary segments, small index per segment
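
The benefit of segmenting can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the HybridStore code: the segment capacity, the `Segment` fields `t_min`/`t_max`, and the plain list used in place of the inter-segment index are all hypothetical, and the stream is assumed to arrive in time order.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

SEGMENT_CAPACITY = 4  # hypothetical, kept tiny for illustration


@dataclass
class Segment:
    t_min: int
    t_max: int
    records: List[Tuple[int, float]] = field(default_factory=list)


def partition(stream, capacity=SEGMENT_CAPACITY):
    """Cut a time-ordered stream of (timestamp, key) pairs into segments."""
    segments: List[Segment] = []
    for t, key in stream:
        if not segments or len(segments[-1].records) >= capacity:
            segments.append(Segment(t_min=t, t_max=t))
        seg = segments[-1]
        seg.records.append((t, key))
        seg.t_max = t
    return segments


def select(segments, t1, t2, k1, k2):
    """Return matching records and the number of segments actually scanned."""
    hits, scanned = [], 0
    for seg in segments:
        if seg.t_max < t1 or seg.t_min > t2:
            continue  # time ranges do not overlap: skip the whole segment
        scanned += 1
        hits.extend((t, k) for t, k in seg.records
                    if t1 <= t <= t2 and k1 <= k <= k2)
    return hits, scanned


stream = [(t, float(t % 5)) for t in range(12)]
segments = partition(stream)
hits, scanned = select(segments, 5, 6, 0.0, 1.0)
print(len(segments), scanned, hits)
```

With three segments in total, the query above scans only the one whose time range overlaps [5, 6].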




HybridStore: Index Management



   Inter-segment skip list: each node stores addr and tmin; locates segments within [t1, t2]

   [Figure: skip list running from Header to NULL]

   In-segment β-Tree: locate records within [k1, k2]
   In-segment Bloom filter: check the existence of key values if k1 = k2




HybridStore: In-segment Index


   In-segment β-Tree: locate records within [k1, k2]
         Binary tree node: lowK, highK, left, right, splitK
         Prediction-based bucket splitting: compute splitK

         Figure (example tree of key ranges):
            [-60, 120]
               [-60, 30]
               (30, 120]
                  (30, 75]
                  (75, 120]
                     (75, 97.5]
                        (75, 86.25]
                        (86.25, 97.5]
                     (97.5, 120]
            Separate range shown beside the tree: (82, 84]
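
The key ranges in the example tree are consistent with repeatedly halving the bucket that contains the incoming key. The slide does not spell out the prediction rule that computes splitK, so the sketch below simply assumes a midpoint split; the `Bucket` class and helper names are hypothetical, and only the node fields (lowK, highK, left, right, splitK) come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bucket:
    low_k: float
    high_k: float
    split_k: Optional[float] = None
    left: Optional["Bucket"] = None    # covers (low_k, split_k]
    right: Optional["Bucket"] = None   # covers (split_k, high_k]

    def is_leaf(self) -> bool:
        return self.left is None


def find_leaf(root: Bucket, key: float) -> Bucket:
    """Walk down from the root to the leaf bucket whose range holds `key`."""
    node = root
    while not node.is_leaf():
        node = node.left if key <= node.split_k else node.right
    return node


def split(bucket: Bucket) -> None:
    """Split a leaf at its midpoint (assumed stand-in for the predicted splitK)."""
    bucket.split_k = (bucket.low_k + bucket.high_k) / 2
    bucket.left = Bucket(bucket.low_k, bucket.split_k)
    bucket.right = Bucket(bucket.split_k, bucket.high_k)


root = Bucket(-60.0, 120.0)
path = []
for _ in range(4):
    split(find_leaf(root, 80.0))
    leaf = find_leaf(root, 80.0)
    path.append((leaf.low_k, leaf.high_k))
print(path)
```

For a key of 80, the four splits walk through (30, 120], (75, 120], (75, 97.5], and (75, 86.25], matching the ranges in the figure.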




HybridStore: In-segment Index



   In-segment Bloom filter: check the existence of key values if k1 = k2
         v bits, q hash functions, represent n items: $p = \left(1 - \left(1 - \frac{1}{v}\right)^{qn}\right)^{q}$
         Must be maintained in RAM: NOR flash is byte-oriented
         If q = 3, n = 4096, p ≈ 3.06%, then v = 32768 (i.e., 4KB)
         Horizontal partition: fixed small Bloom filter sections (e.g., 256B)
         Vertical partition: group fragments with the same offset in the same
         NAND page
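
The sizing numbers on this slide can be checked with a short calculation. The sketch below evaluates the standard Bloom filter false-positive formula for the slide's parameters and derives how many 256-byte sections a 4KB filter would split into; the function and variable names are illustrative only, and 256B is the slide's example section size.

```python
def false_positive_rate(v: int, q: int, n: int) -> float:
    """False-positive probability of a Bloom filter with v bits,
    q hash functions, and n inserted items."""
    return (1.0 - (1.0 - 1.0 / v) ** (q * n)) ** q


v_bits = 32768
p = false_positive_rate(v_bits, q=3, n=4096)

filter_bytes = v_bits // 8        # total filter size in bytes
section_bytes = 256               # example section size from the slide
sections = filter_bytes // section_bytes

print(f"p = {p:.2%}, filter = {filter_bytes} B, sections = {sections}")
```

A 4KB filter exceeds the 3.2KB RAM budget reported for the whole system, which is why the filter is cut into small sections so that only one section needs to sit in RAM at a time.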




HybridStore: Storage Hierarchy

   NOR flash: circular array, fixed segment size
   NAND flash: circular array, logical segment (multiple erase blocks)
   Index structure: updated in a NOR segment, copied to the NAND
   segment later
   Header: [T1 , T2 ], [K1 , K2 ], dataAddr , idxAddr , bfAddr , skipList

   Figure: (a) Storage Hierarchy: RAM (skip list header, Bloom filter buffer, write buffer, read buffer); NOR segments (Bloom filter, adaptive binary tree); NAND segments. (b) NAND Segment Structure: readings, Bloom filter, tree, header page

HybridStore: Operations

   Insertion
         Update the β-Tree: allocate a new bucket if necessary
         Update the Bloom filter buffer: flush it out to NOR flash if necessary
         NOR segment is full: copy to the NAND segment, update the skip list,
         start a new segment
   Querying: t1, t2, k1, k2
                    | t1 = t2   | t1 < t2
          k1 = k2   | skip list | skip list + Bloom filter + β-Tree
          k1 < k2   | skip list | skip list + β-Tree
         Skip a segment if [K1 , K2 ] ⊂ [k1 , k2 ]
   Data Aging: delete the oldest NAND segment
         No need to update any pointer
         No need to move any data page


HybridStore: Implementation and Evaluation

                  TinyOS implementation: 16.5KB ROM, 3.2KB RAM
                  Trace-driven simulation: over 2.6 million weather records in 5 years
                  Insertion: 13% to 18% improvement

                  Figure: Performance per insertion, β-Tree vs. static tree, for NOR flash segment sizes of 64, 128, and 256 KB: (a) Latency, Time (ms); (b) Energy (µJ); (c) Space Overhead (%)

HybridStore: Value-based Equality Query

        Key detection: 26.18 ms and 1.5 mJ over 0.5 million readings
        Nonexistent keys: more than 3× improvement

        Figure: Impact of Bloom filter for nonexistent keys: (a) Latency, Time (ms); (b) Energy (mJ); plotted against time range (1 day, 1 week, 1 month, 3 months, 1 year) for β-Tree (64KB), β-Tree (128KB), β-Tree (256KB), β-Tree (64KB w/o BF), and Static (128KB)

HybridStore: Full Query

            Retrieve 120K readings in 11.08 seconds from 0.5 million records
                  [SenSys ’11]: over 20 seconds to get 50% from 50,000 records

            Figure: HybridStore performance per query of full queries: (a) Total latency per query, Time (s); (b) Total energy per query, Energy (mJ); plotted against time range (1 day, 1 week, 1 month, 3 months, 6 months, 1 year) for key ranges of 1, 3, 5, 7, and 9 degrees

Conclusion and Future Work



       Conclusion
            HybridStore: efficient, lightweight, and sensor-friendly
            Process typical joint queries efficiently
            Process large-scale datasets efficiently
       Future Work²
            Failure recovery mechanism
            Distributed database system based on HybridStore
            Testbed experiments

   ² B. Wang and J. S. Baras. HybridDB: An Efficient Database System Supporting
     Incremental epsilon-Approximate Querying for Storage-Centric Sensor Networks.
     Submitted to the ACM Transactions on Sensor Networks, 2013, pp. 1–35.
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Presentation hybrid store-ewsn-2013

  • 1. HybridStore: An Efficient Data Management System for Hybrid Flash-based Sensor Devices
      Baobing Wang and John S. Baras
      Department of Electrical and Computer Engineering, Institute for Systems Research, University of Maryland, College Park, USA
      briankw@umd.edu
      10th European Conference on Wireless Sensor Networks (EWSN), February 14, 2013
      Brian (UMD@USA) HybridStore February 14, 2013 1 / 15
  • 2. Motivation: In-situ Data Storage on Sensor Motes
      Centralized data collection wastes energy (e.g., TinyDB)
        LoCal project [1]: 455 nodes, > 900M readings/year
      Only aggregated data are required: average noise level, peak power consumption, usage pattern
      Sensors store data locally: sensor database
      Flash memory: high capacity, energy efficient
      Figure: Per-byte cost: storage, computation and communication [Mathur’06]
      [1] http://local.cs.berkeley.edu/
  • 4. Motivation: Design Challenges
      Unlike magnetic disks, flash memories do not support in-place updates
      NOR flash: byte-oriented, random-accessible, low capacity
      NAND flash: page-oriented, high capacity, more energy-efficient
      Random writes are 100× more expensive than sequential writes
      Very limited RAM: 4KB to 10KB
  • 5. Related Work: Flash-based Storage Systems
      Only time-window queries: TL-Tree [Li’12], FlashLog [Nath’09]
      Large RAM footprint: FlashDB [Nath’07], LA-Tree [Agrawal’09]
      Antelope [Tsiftes’11]: NOR flash only, discrete values
      MicroHash [Lin’06]: long chain of partial pages, extensive page reads and writes, complex failure recovery
      No efficient support for joint queries; global index
  • 6. Contributions
      HybridStore Interface
        insert(float key, void* record, uint8_t length)
        select(uint32_t t1, uint32_t t2, float k1, float k2)
      HybridStore Features
        All NAND pages are fully occupied and written purely sequentially
        In-place updates and out-of-place writes are completely avoided
        Processes typical joint queries efficiently, even on large-scale datasets
        Data aging without overhead
        Sensor-friendly: 16.5KB ROM and 3.2KB RAM in TinyOS 2.1
      Potential Applications
        Storage layer abstraction: Squirrel [Mottola’10]
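To make the semantics of the two interface calls concrete, here is a minimal RAM-only sketch in C. It is not HybridStore: it only mirrors the slide's argument lists. The return types, the `toy_` names, the fixed capacities, and the logical clock used to timestamp records are all assumptions added for illustration.

```c
#include <stdint.h>
#include <string.h>

#define TOY_CAPACITY 64   /* hypothetical: max records kept in RAM */
#define TOY_MAX_LEN  16   /* hypothetical: max payload bytes per record */

typedef struct {
    uint32_t timestamp;   /* assigned at insertion time */
    float    key;         /* sensor reading used as the search key */
    uint8_t  length;
    uint8_t  payload[TOY_MAX_LEN];
} toy_record_t;

static toy_record_t toy_store[TOY_CAPACITY];
static uint16_t     toy_count = 0;
static uint32_t     toy_clock = 0;   /* hypothetical logical clock: 0, 1, 2, ... */

/* Same argument list as the slide's insert(); returns 0 on success, -1 if full. */
int toy_insert(float key, const void *record, uint8_t length) {
    if (toy_count >= TOY_CAPACITY || length > TOY_MAX_LEN) {
        return -1;
    }
    toy_record_t *r = &toy_store[toy_count++];
    r->timestamp = toy_clock++;
    r->key = key;
    r->length = length;
    memcpy(r->payload, record, length);
    return 0;
}

/* Same argument list as the slide's select(); counts the records whose
 * timestamp lies in [t1, t2] AND whose key lies in [k1, k2] (a joint query). */
uint16_t toy_select(uint32_t t1, uint32_t t2, float k1, float k2) {
    uint16_t matches = 0;
    for (uint16_t i = 0; i < toy_count; i++) {
        const toy_record_t *r = &toy_store[i];
        if (r->timestamp >= t1 && r->timestamp <= t2 &&
            r->key >= k1 && r->key <= k2) {
            matches++;
        }
    }
    return matches;
}
```

The linear scan is exactly what the index structures described on the following slides are meant to avoid; the sketch only pins down what a joint time-and-value query returns.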
  • 9. HybridStore: Overview
      Partition the data stream into segments
      Create an in-segment index for each segment
      Create an inter-segment index to organize segments
      Benefits: skip unnecessary segments, small index per segment
  • 10. HybridStore: Index Management
      Inter-segment skip list: entries (addr, tmin); locates segments within [t1, t2]
      In-segment β-Tree: locates records within [k1, k2]
      In-segment Bloom filter: checks the existence of key values if k1 = k2
      Figure: skip list diagram (Header to NULL)
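The segment lookup by time range can be sketched as follows. This sketch swaps in a sorted array with binary search as a stand-in for the skip list, which is not the structure HybridStore uses on flash; it only shows how (addr, tmin) entries suffice to find the segments that may hold records in [t1, t2]. The type and function names are hypothetical, and the field widths are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t addr;   /* where the segment starts (assumed 32-bit address) */
    uint32_t tmin;   /* timestamp of the segment's first record */
} seg_entry_t;

/* Index of the last entry with tmin <= t, or 0 if no entry qualifies.
 * Entries must be sorted by ascending tmin. */
size_t seg_first_candidate(const seg_entry_t *segs, size_t n, uint32_t t) {
    size_t lo = 0, hi = n;                 /* search window [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (segs[mid].tmin <= t) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    /* lo is now the number of entries with tmin <= t */
    return lo == 0 ? 0 : lo - 1;
}

/* Stores the first candidate index in *first and returns how many
 * consecutive segments may contain records with timestamps in [t1, t2]. */
size_t seg_locate(const seg_entry_t *segs, size_t n,
                  uint32_t t1, uint32_t t2, size_t *first) {
    if (n == 0 || t1 > t2) {
        return 0;
    }
    size_t start = seg_first_candidate(segs, n, t1);
    size_t count = 0;
    for (size_t i = start; i < n && segs[i].tmin <= t2; i++) {
        count++;
    }
    *first = start;
    return count;
}
```

Because each entry stores only the minimum timestamp, the segment that begins before t1 is still a candidate: its last records may fall inside the query window.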
  • 12. HybridStore: In-segment Index
      In-segment β-Tree: locates records within [k1, k2]
      Binary tree node: lowK, highK, left, right, splitK
      Prediction-based bucket splitting: compute splitK
      Figure (example tree, node key ranges): [-60, 120], [-60, 30], (30, 120], (30, 75], (75, 120], (82, 84], (75, 97.5], (97.5, 120], (75, 86.25], (86.25, 97.5]
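A binary tree over key ranges, using the node fields named on the slide (lowK, highK, left, right, splitK), can be sketched in C as below. The slide does not describe how prediction-based splitting computes splitK, so this sketch substitutes a plain midpoint split as a labeled stand-in. The `bt_` names, the node pool, and the use of array indices as child links are assumptions.

```c
#include <stdint.h>

#define BT_NONE      (-1)
#define BT_MAX_NODES 32      /* hypothetical pool size */

typedef struct {
    float   lowK, highK;     /* key range covered by this node */
    float   splitK;          /* valid only for internal nodes */
    int16_t left, right;     /* child indices, BT_NONE for a leaf bucket */
} bt_node_t;

static bt_node_t bt_nodes[BT_MAX_NODES];
static int16_t   bt_used = 0;

/* Allocates a leaf covering the given range; returns its index or BT_NONE. */
int16_t bt_new(float lowK, float highK) {
    if (bt_used >= BT_MAX_NODES) {
        return BT_NONE;
    }
    bt_node_t *n = &bt_nodes[bt_used];
    n->lowK = lowK;
    n->highK = highK;
    n->splitK = 0.0f;
    n->left = BT_NONE;
    n->right = BT_NONE;
    return bt_used++;
}

/* Splits a leaf into (lowK, splitK] and (splitK, highK].
 * ASSUMPTION: splitK is the midpoint, not the prediction-based value. */
int bt_split(int16_t idx) {
    if (bt_used + 2 > BT_MAX_NODES) {
        return -1;
    }
    float low = bt_nodes[idx].lowK;
    float high = bt_nodes[idx].highK;
    float mid = (low + high) / 2.0f;
    bt_nodes[idx].splitK = mid;
    bt_nodes[idx].left = bt_new(low, mid);
    bt_nodes[idx].right = bt_new(mid, high);
    return 0;
}

/* Counts the leaf buckets whose range can hold keys in [k1, k2]. */
int bt_count_leaves(int16_t idx, float k1, float k2) {
    const bt_node_t *n = &bt_nodes[idx];
    if (n->left == BT_NONE) {
        return 1;                          /* reached a bucket */
    }
    int count = 0;
    if (k1 <= n->splitK) {
        count += bt_count_leaves(n->left, k1, k2);
    }
    if (k2 > n->splitK) {
        count += bt_count_leaves(n->right, k1, k2);
    }
    return count;
}
```

Starting from a root covering [-60, 120] and splitting toward the upper end of the range, midpoint splits give the split keys 30, 75, 97.5 and 86.25, which agree with most of the ranges listed for the slide's example tree. The (82, 84] node does not fit a midpoint rule, which is consistent with the real splitK being predicted rather than fixed.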
  • 13. HybridStore: In-segment Index
      In-segment Bloom filter: checks the existence of key values if k1 = k2
      With v bits and q hash functions representing n items, the false positive rate is $p = \left(1 - \left(1 - \frac{1}{v}\right)^{qn}\right)^{q}$
      Must be maintained in RAM: NOR flash is byte-oriented
      If q = 3, n = 4096, p ≈ 3.06%, then v = 32768 (i.e., 4KB)
      Horizontal partition: fixed small Bloom filter sections (e.g., 256B)
      Vertical partition: group fragments with the same offset in the same NAND page
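The numbers on this slide can be checked with a few lines of C. The function below evaluates the standard Bloom filter false positive rate, p = (1 - (1 - 1/v)^(qn))^q, for v bits, q hash functions and n items. The function name is hypothetical, and the integer power is computed by repeated squaring so that the sketch does not depend on the math library.

```c
#include <assert.h>
#include <stdint.h>

/* False positive rate of a Bloom filter with v bits, q hash functions
 * and n inserted items: p = (1 - (1 - 1/v)^(q*n))^q. */
double bloom_fp_rate(uint32_t v, uint32_t q, uint32_t n) {
    assert(v > 0);
    double   base = 1.0 - 1.0 / (double)v;   /* chance one hash misses a given bit */
    double   still_zero = 1.0;               /* becomes (1 - 1/v)^(q*n) */
    uint64_t e = (uint64_t)q * n;

    while (e > 0) {                          /* exponentiation by squaring */
        if (e & 1u) {
            still_zero *= base;
        }
        base *= base;
        e >>= 1;
    }

    double bit_set = 1.0 - still_zero;       /* chance a given bit is 1 */
    double p = 1.0;
    for (uint32_t i = 0; i < q; i++) {       /* all q probed bits must be 1 */
        p *= bit_set;
    }
    return p;
}
```

Calling `bloom_fp_rate(32768, 3, 4096)` gives a value of about 0.0306, matching the 3.06% quoted for a 4KB filter. Under the slide's example section size, such a filter would be cut into 4096 / 256 = 16 horizontal sections.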
  • 16. HybridStore: Storage Hierarchy
      NOR flash: circular array, fixed segment size
      NAND flash: circular array, logical segment (multiple erase blocks)
      Index structure: updated in a NOR segment, copied to the NAND segment later
      Header: [T1, T2], [K1, K2], dataAddr, idxAddr, bfAddr, skipList
      Figure: (a) Storage Hierarchy (RAM: skip list header, Bloom filter buffer, write buffer, read buffer; NOR: segments holding the adaptive binary tree and Bloom filter; NAND: segments); (b) NAND Segment Structure (readings, Bloom filter, tree, header page)
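The header fields and the circular-array layout can be sketched as a C struct plus a ring allocator. The field names follow the slide; the field widths, the omission of the skipList field, the `RING_SLOTS` constant and the `ring_alloc` function are assumptions made for illustration, and no flash I/O is modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 3          /* hypothetical number of NAND segments */

typedef struct {
    uint32_t T1, T2;          /* time range covered by the segment */
    float    K1, K2;          /* key range covered by the segment */
    uint32_t dataAddr;        /* start of the readings */
    uint32_t idxAddr;         /* start of the in-segment tree */
    uint32_t bfAddr;          /* start of the Bloom filter */
    /* skipList pointers omitted in this sketch */
} seg_header_t;

typedef struct {
    seg_header_t slots[RING_SLOTS];
    uint16_t     oldest;      /* slot of the oldest live segment */
    uint16_t     count;       /* number of live segments */
} seg_ring_t;

/* Returns the slot for a new segment. When the ring is full, the oldest
 * segment is dropped first, which is all that data aging needs here:
 * no other segment is moved and no pointer in another header changes. */
uint16_t ring_alloc(seg_ring_t *r) {
    assert(r != NULL);
    if (r->count == RING_SLOTS) {
        r->oldest = (uint16_t)((r->oldest + 1) % RING_SLOTS);
        r->count--;
    }
    uint16_t slot = (uint16_t)((r->oldest + r->count) % RING_SLOTS);
    memset(&r->slots[slot], 0, sizeof r->slots[slot]);
    r->count++;
    return slot;
}
```

With three slots, five consecutive allocations return slots 0, 1, 2, 0, 1: the fourth call reclaims the oldest segment's slot instead of failing.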
  • 17. HybridStore: Operations
      Insertion
        Update the β-Tree: allocate a new bucket if necessary
        Update the Bloom filter buffer: flush it out to NOR flash if necessary
        NOR segment is full: copy to the NAND segment, update the skip list, start a new segment
      Querying (t1, t2, k1, k2): indexes used
                    t1 = t2      t1 < t2
        k1 = k2     skip list    skip list + Bloom filter + β-Tree
        k1 < k2     skip list    skip list + β-Tree
        Skip a segment if [K1, K2] ∩ [k1, k2] = ∅ (its key range does not overlap the query range)
      Data Aging: delete the oldest NAND segment
        No need to update any pointer
        No need to move any data page
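The choice of indexes per query type can be written as a small C function. The flag names and function names are hypothetical. The segment filter treats a segment as skippable when its key range [K1, K2] does not overlap the query range [k1, k2], which is an assumed reading of the skip condition on the slide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit flags naming the index structures a query consults. */
enum {
    USE_SKIP_LIST = 1u,
    USE_BLOOM     = 2u,
    USE_BTREE     = 4u
};

/* Picks the indexes for select(t1, t2, k1, k2), following the slide's
 * 2x2 table: the skip list is always used; a time range (t1 < t2) adds the
 * in-segment tree, and an equality query on the key (k1 == k2) over a time
 * range also adds the Bloom filter. */
unsigned plan_query(uint32_t t1, uint32_t t2, float k1, float k2) {
    unsigned plan = USE_SKIP_LIST;
    if (t1 < t2) {
        if (k1 == k2) {
            plan |= USE_BLOOM;
        }
        plan |= USE_BTREE;
    }
    return plan;
}

/* A segment needs to be read only if its key range [K1, K2] overlaps
 * the query's key range [k1, k2]; otherwise it can be skipped. */
bool segment_may_match(float K1, float K2, float k1, float k2) {
    return K1 <= k2 && k1 <= K2;
}
```

For example, a query for the single key 25.0 over a month of data would consult all three structures, while the same key at a single timestamp needs only the skip list.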
  • 20. HybridStore: Implementation and Evaluation
      TinyOS implementation: 16.5KB ROM, 3.2KB RAM
      Trace-driven simulation: over 2.6 million weather records in 5 years
      Insertion: 13% to 18% improvement
      Figure: Performance per insertion, β-Tree vs. static tree, for NOR flash segment sizes of 64, 128 and 256 KB: (a) Latency (ms), (b) Energy (µJ), (c) Space Overhead (%)
  • 21. HybridStore: Value-based Equality Query
      Key detection: 26.18ms and 1.5mJ over 0.5 million readings
      Nonexistent keys: more than 3× improvement
      Figure: Impact of Bloom filter for nonexistent keys, over time ranges from 1 day to 1 year: (a) Latency (ms), (b) Energy (mJ). Series: β-Tree (64KB), β-Tree (128KB), β-Tree (256KB), β-Tree (64KB w/o BF), Static (128KB)
  • 22. HybridStore: Full Query
      Retrieve 120K readings in 11.08 seconds from 0.5 million records
      [SenSys ’11]: over 20 seconds to get 50% from 50,000 records
      Figure: HybridStore performance per query of full queries, over time ranges from 1 day to 1 year: (a) Total latency per query (s), (b) Total energy per query (mJ). Series: 1, 3, 5, 7 and 9 degree
  • 23. Conclusion and Future Work
      Conclusion
        HybridStore: efficient, light-weight, and sensor-friendly
        Processes typical joint queries efficiently
        Processes large-scale datasets efficiently
      Future Work [2]
        Failure recovery mechanism
        Distributed database system based on HybridStore
        Testbed experiments
      [2] B. Wang and J. S. Baras. HybridDB: An Efficient Database System Supporting Incremental epsilon-Approximate Querying for Storage-Centric Sensor Networks. Submitted to the ACM Transactions on Sensor Networks, 2013, pp. 1–35