SlideShare une entreprise Scribd logo
1  sur  18
Google Bigtable
Kasim Mothana
Christian D. Ozoria
 Bigtable is a compressed, highly distributed,
high performance data storage system
 Is used by other Google products, including
Google Search, Google Analytics and Google
Earth, and a part of the Google’s Platform as a
Service (PaaS)
What is Bigtable?
 Bigtable is primarily designed to scale petabytes of data (for
commercial databases such scale is difficult and or expensive to
handle) across thousands of commodity machines
 Scalability is accomplished with the flexibility to add more resources
to the system on the fly without the need to reconfigure the system
 Its model and implementation allow to maintain good performance
on such volumes of data
 The model makes it widely applicable
 Cost-effective
 Self-managing
Why Bigtable?
 Bigtable can be loosely compared to a spreadsheet that maintains
versions of cells, each with a timestamp. Often it is referred to as a
distributed multidimensional sorted map – the map where a row key,
a column key, and a timestamp are mapped to a value
 Key model elements:
 Row key (stored as a string)
 Column key
 Timestamp
 Value that is an uninterrupted array of bytes
 Bigtable is a semi-structured store: for the same row key we can
have different columns (similar to the spreadsheet)
Data Model
 Storage is organized by row key (ordered
alphabetically), column key and timestamp
 Columns are grouped into column families to improve
storage characteristics
 Each cell can hold multiple version of data
Data Model (cont.)
Example 1
Below is an example of a social network for United States
presidents. Each president can follow posts from other
presidents. The following shows a Bigtable table that tracks
who each president is following on Prezzy:
 The user name (in this case, the president name) is
used as the row key
 This table has one column family containing multiple
column qualifiers
 Because new column qualifiers can be added
dynamically, it is easy to add new followers
 Illustrates the mapping
(row:string, column:string, time:int64) → string
Example1 -- Explanation
Example 2
 The row key is a reversed URL (explained in the following
slides)
 The Contents column family contains the page contents
 The Anchor column family contains the text of any anchors
that reference the page
 CNN’s home page is referenced by both the Sports Illustrated
and the MY-look home pages, so the row contains columns
named anchor:cnnsi.com and anchor:my.look.ca
 Each anchor cell has one version; the contents column has
several versions
Example 2 – Explanation
 The Bigtable implementation has three major
components:
 A library that is linked into every client
 One master server
 Many tablet servers
Implementation
 Data is dynamically partitioned based on the row key;
each row range creates a tablet, which is the unit of
distribution and load balancing
 Clients can exploit this property by selecting their row
keys so that they get good locality for their data
accesses. In Example 2, pages in the same domain are
grouped together by reversing the hostname
components of the URLs: hadoop.apache.com and
hbase.apache.com are stored next to each other as
org.apache.hadoop and org.apache.hbase
Dynamic Tablet Partitioning
 Google Analytics is a service that helps webmasters
analyze traffic patterns at their web sites
 Tracks website traffic and makes it available to
webmasters
Application: Google Analytics
 Personalized Search is a service that records user
queries and clicks when using Google search
 Personalized Search stores each user’s data in
Bigtable. Each user has a unique user id and is
assigned a row named by that user id
 Enables a more personalized search experience
Application: Personalized Search
 Incredible scalability. The Cloud Big Table is designed
to scale in direct proportion to the machines in the
cluster
 Simple administration. The Cloud Big Table handles
upgrades, restarts, and replication transparently
Benefits
 Bigtable is a distributed system for storing data at
Google
 Bigtable clusters have been in production use since
April 2005
 In August 2006, more than sixty projects were using
Bigtable.
 Users like the performance and high availability
provided by the Bigtable implementation
Conclusion
 Users like that they can scale by simply adding more
machines to the system as their resource demands
change over time
 Several additional Bigtable features, such as support
for secondary indices and infrastructure for building
cross-data-center replicated Bigtables with multiple
master replicas, are in the process of development.
Conclusion cont.
 Bigtable: A Distributed Storage System for Structured Data
http://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf
 Under the covers of the App Engine Database.
 http://snarfed.org/datastore_talk.html
 Wikipedia: BigTable
 https://en.wikipedia.org/wiki/BigTable
 Cloud BigTable Beta
 https://cloud.google.com/bigtable/docs/
 Introducing Google Cloud BigTable
 http://googlecloudplatform.blogspot.com/2015/05/introducing-Google-Cloud-Bigtable.html
 CS262B Advanced Topics in Computer Systems. Spring 2009.
http://www.eecs.berkeley.edu/~culler/cs262b/summary/bigtable.html
 Tudorica, Bogdan George, and Cristian Bucur. "A comparison between several NoSQL databases
with comments and notes." In Roedunet International Conference (RoEduNet), 2011 10th, pp. 1-5. IEEE,
2011.
 Lee, Ken Ka-Yin, Wai-Choi Tang, and Kup-Sze Choi. "Alternatives to relational database: comparison
of NoSQL and XML approaches for clinical data storage." Computer methods and programs in
biomedicine 110, no. 1 (2013): 99-109.
 Nayak, Ameya, Anil Poriya, and Dikshay Poojary. "Type of NOSQL Databases and its Comparison
with Relational Databases." International Journal of Applied Information Systems 5, no. 4 (2013): 16-
19.
Sources
 Vicknair, Chad, Michael Macias, Zhendong Zhao, Xiaofei Nan, Yixin Chen, and Dawn Wilkins. "A comparison of
a graph database and a relational database: a data provenance perspective." In Proceedings of the 48th
annual southeast regional conference, p. 42. ACM, 2010.
 Chang, Fay, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar
Chandra, Andrew Fikes, and Robert E. Gruber. "BigTable: A distributed storage system for structured
data." ACM Transactions on Computer Systems (TOCS) 26, no. 2 (2008): 4.
 Leavitt, Neal. "Will NoSQL databases live up to their promise?." Computer 43, no. 2 (2010): 12-14.
 Stonebraker, Michael. "SQL databases v. NoSQL databases." Communications of the ACM 53, no. 4 (2010): 10-
11.
 Pallis, George. "Cloud computing: the new frontier of internet computing." IEEE Internet Computing 5 (2010):
70-73.
 Vossen, Gottfried. "Big data as the new enabler in business and other intelligence." Vietnam Journal of
Computer Science 1, no. 1 (2014): 3-14.
 Zicari, Roberto V. "Big data: Challenges and opportunities." Big data computing (2014): 103-128.
 Moniruzzaman, A. B. M., and Syed Akhter Hossain. "Nosql database: New era of databases for big data
analytics-classification, characteristics and comparison." arXiv preprint arXiv:1307.0191 (2013).
 Thankachan, Tomcy. "Google Big Table" November 27th 2012. PowerPoint presentation
 Jacotin, Romain. "Lecture: The Google Big Table" October 9th 2014. PowerPoint presentation
Sources cont.

Contenu connexe

Tendances

Cloud architecture
Cloud architectureCloud architecture
Cloud architectureAdeel Javaid
 
Mongo Nosql CRUD Operations
Mongo Nosql CRUD OperationsMongo Nosql CRUD Operations
Mongo Nosql CRUD Operationsanujaggarwal49
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File SystemRutvik Bapat
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databasesJames Serra
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introductionPooyan Mehrparvar
 
Dynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and ComparisonDynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and ComparisonGrisha Weintraub
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architectureBishal Khanal
 
Introduction to distributed file systems
Introduction to distributed file systemsIntroduction to distributed file systems
Introduction to distributed file systemsViet-Trung TRAN
 
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...Simplilearn
 
Big Data Technology Stack : Nutshell
Big Data Technology Stack : NutshellBig Data Technology Stack : Nutshell
Big Data Technology Stack : NutshellKhalid Imran
 
A complete guide to azure storage
A complete guide to azure storageA complete guide to azure storage
A complete guide to azure storageHimanshu Sahu
 
Hive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkHive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkDongwon Kim
 
Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computinghuda2018
 
Cloud Computing: Hadoop
Cloud Computing: HadoopCloud Computing: Hadoop
Cloud Computing: Hadoopdarugar
 
Big Data technology Landscape
Big Data technology LandscapeBig Data technology Landscape
Big Data technology LandscapeShivanandaVSeeri
 
Scalability
ScalabilityScalability
Scalabilityfelho
 

Tendances (20)

Cloud architecture
Cloud architectureCloud architecture
Cloud architecture
 
Mongo Nosql CRUD Operations
Mongo Nosql CRUD OperationsMongo Nosql CRUD Operations
Mongo Nosql CRUD Operations
 
Distributed storage system
Distributed storage systemDistributed storage system
Distributed storage system
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File System
 
Relational databases vs Non-relational databases
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introduction
 
Introduction to Amazon Redshift
Introduction to Amazon RedshiftIntroduction to Amazon Redshift
Introduction to Amazon Redshift
 
Dynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and ComparisonDynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and Comparison
 
Mongodb basics and architecture
Mongodb basics and architectureMongodb basics and architecture
Mongodb basics and architecture
 
Introduction to distributed file systems
Introduction to distributed file systemsIntroduction to distributed file systems
Introduction to distributed file systems
 
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
Hadoop Interview Questions And Answers Part-1 | Big Data Interview Questions ...
 
Virtualization vs. Cloud Computing: What's the Difference?
Virtualization vs. Cloud Computing: What's the Difference?Virtualization vs. Cloud Computing: What's the Difference?
Virtualization vs. Cloud Computing: What's the Difference?
 
Big Data Technology Stack : Nutshell
Big Data Technology Stack : NutshellBig Data Technology Stack : Nutshell
Big Data Technology Stack : Nutshell
 
A complete guide to azure storage
A complete guide to azure storageA complete guide to azure storage
A complete guide to azure storage
 
Hive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkHive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmark
 
Data-Intensive Technologies for Cloud Computing
Data-Intensive Technologies for CloudComputingData-Intensive Technologies for CloudComputing
Data-Intensive Technologies for Cloud Computing
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Cloud Computing: Hadoop
Cloud Computing: HadoopCloud Computing: Hadoop
Cloud Computing: Hadoop
 
Big Data technology Landscape
Big Data technology LandscapeBig Data technology Landscape
Big Data technology Landscape
 
Scalability
ScalabilityScalability
Scalability
 

Similaire à Google BigTable

Bigtable osdi06
Bigtable osdi06Bigtable osdi06
Bigtable osdi06temp2004it
 
Bigtable osdi06
Bigtable osdi06Bigtable osdi06
Bigtable osdi06mrlonganh
 
Bigtable
BigtableBigtable
Bigtableptdorf
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataVipin Batra
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesEditor Jacotech
 
A native enhanced elastic extension tables multi-tenant database
A native enhanced elastic extension tables multi-tenant database A native enhanced elastic extension tables multi-tenant database
A native enhanced elastic extension tables multi-tenant database IJECEIAES
 
Jovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloudJovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloudBharat Rane
 
A New Multi-Dimensional Hyperbolic Structure for Cloud Service Indexing
A New Multi-Dimensional Hyperbolic Structure for Cloud Service IndexingA New Multi-Dimensional Hyperbolic Structure for Cloud Service Indexing
A New Multi-Dimensional Hyperbolic Structure for Cloud Service Indexingijdms
 
Google - Bigtable
Google - BigtableGoogle - Bigtable
Google - Bigtable영원 서
 
JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011Satya Ramachandran
 
Technical Research Document - Anurag
Technical Research Document - AnuragTechnical Research Document - Anurag
Technical Research Document - Anuraganuragrajandekar
 

Similaire à Google BigTable (20)

Bigtable osdi06
Bigtable osdi06Bigtable osdi06
Bigtable osdi06
 
Bigtable osdi06
Bigtable osdi06Bigtable osdi06
Bigtable osdi06
 
Bigtable osdi06
Bigtable osdi06Bigtable osdi06
Bigtable osdi06
 
Bigtable_Paper
Bigtable_PaperBigtable_Paper
Bigtable_Paper
 
Bigtable
BigtableBigtable
Bigtable
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Bigtable osdi06
Bigtable osdi06Bigtable osdi06
Bigtable osdi06
 
Bigtable
Bigtable Bigtable
Bigtable
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunities
 
A native enhanced elastic extension tables multi-tenant database
A native enhanced elastic extension tables multi-tenant database A native enhanced elastic extension tables multi-tenant database
A native enhanced elastic extension tables multi-tenant database
 
Google Bigtable
Google BigtableGoogle Bigtable
Google Bigtable
 
Cassandra data modelling best practices
Cassandra data modelling best practicesCassandra data modelling best practices
Cassandra data modelling best practices
 
Jovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloudJovian DATA: A multidimensional database for the cloud
Jovian DATA: A multidimensional database for the cloud
 
A New Multi-Dimensional Hyperbolic Structure for Cloud Service Indexing
A New Multi-Dimensional Hyperbolic Structure for Cloud Service IndexingA New Multi-Dimensional Hyperbolic Structure for Cloud Service Indexing
A New Multi-Dimensional Hyperbolic Structure for Cloud Service Indexing
 
Tableau training course
Tableau training courseTableau training course
Tableau training course
 
Google - Bigtable
Google - BigtableGoogle - Bigtable
Google - Bigtable
 
Big table
Big tableBig table
Big table
 
JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011JovianDATA MDX Engine Comad oct 22 2011
JovianDATA MDX Engine Comad oct 22 2011
 
Technical Research Document - Anurag
Technical Research Document - AnuragTechnical Research Document - Anurag
Technical Research Document - Anurag
 
NOSQL
NOSQLNOSQL
NOSQL
 

Plus de New York City College of Technology Computer Systems Technology Colloquium

Plus de New York City College of Technology Computer Systems Technology Colloquium (12)

Ontology-based Classification and Faceted Search Interface for APIs
Ontology-based Classification and Faceted Search Interface for APIsOntology-based Classification and Faceted Search Interface for APIs
Ontology-based Classification and Faceted Search Interface for APIs
 
Towards Improving Interface Modularity in Legacy Java Software Through Automa...
Towards Improving Interface Modularity in Legacy Java Software Through Automa...Towards Improving Interface Modularity in Legacy Java Software Through Automa...
Towards Improving Interface Modularity in Legacy Java Software Through Automa...
 
Data-driven, Interactive Scientific Articles in a Collaborative Environment w...
Data-driven, Interactive Scientific Articles in a Collaborative Environment w...Data-driven, Interactive Scientific Articles in a Collaborative Environment w...
Data-driven, Interactive Scientific Articles in a Collaborative Environment w...
 
Cloud Technology: Virtualization
Cloud Technology: VirtualizationCloud Technology: Virtualization
Cloud Technology: Virtualization
 
Pharmacology Powered by Computational Analysis: Predicting Cardiotoxicity of ...
Pharmacology Powered by Computational Analysis: Predicting Cardiotoxicity of ...Pharmacology Powered by Computational Analysis: Predicting Cardiotoxicity of ...
Pharmacology Powered by Computational Analysis: Predicting Cardiotoxicity of ...
 
How We Use Functional Programming to Find the Bad Guys
How We Use Functional Programming to Find the Bad GuysHow We Use Functional Programming to Find the Bad Guys
How We Use Functional Programming to Find the Bad Guys
 
Static Analysis and Verification of C Programs
Static Analysis and Verification of C ProgramsStatic Analysis and Verification of C Programs
Static Analysis and Verification of C Programs
 
Test Dependencies and the Future of Build Acceleration
Test Dependencies and the Future of Build AccelerationTest Dependencies and the Future of Build Acceleration
Test Dependencies and the Future of Build Acceleration
 
Big Data Challenges and Solutions
Big Data Challenges and SolutionsBig Data Challenges and Solutions
Big Data Challenges and Solutions
 
Introduction to new features in java 8
Introduction to new features in java 8Introduction to new features in java 8
Introduction to new features in java 8
 
Android Apps the Right Way
Android Apps the Right WayAndroid Apps the Right Way
Android Apps the Right Way
 
More than Words: Advancing Prosodic Analysis
More than Words: Advancing Prosodic AnalysisMore than Words: Advancing Prosodic Analysis
More than Words: Advancing Prosodic Analysis
 

Dernier

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Dernier (20)

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Google BigTable

  • 2.  Bigtable is a compressed, highly distributed, high performance data storage system  Is used by other Google products, including Google Search, Google Analytics and Google Earth, and a part of the Google’s Platform as a Service (PaaS) What is Bigtable?
  • 3.  Bigtable is primarily designed to scale petabytes of data (for commercial databases such scale is difficult and or expensive to handle) across thousands of commodity machines  Scalability is accomplished with the flexibility to add more resources to the system on the fly without the need to reconfigure the system  Its model and implementation allow to maintain good performance on such volumes of data  The model makes it widely applicable  Cost-effective  Self-managing Why Bigtable?
  • 4.  Bigtable can be loosely compared to a spreadsheet that maintains versions of cells, each with a timestamp. Often it is referred to as a distributed multidimensional sorted map – the map where a row key, a column key, and a timestamp are mapped to a value  Key model elements:  Row key (stored as a string)  Column key  Timestamp  Value that is an uninterrupted array of bytes  Bigtable is a semi-structured store: for the same row key we can have different columns (similar to the spreadsheet) Data Model
  • 5.  Storage is organized by row key (ordered alphabetically), column key and timestamp  Columns are grouped into column families to improve storage characteristics  Each cell can hold multiple version of data Data Model (cont.)
  • 6. Example 1 Below is an example of a social network for United States presidents. Each president can follow posts from other presidents. The following shows a Bigtable table that tracks who each president is following on Prezzy:
  • 7.  The user name (in this case, the president name) is used as the row key  This table has one column family containing multiple column qualifiers  Because new column qualifiers can be added dynamically, it is easy to add new followers  Illustrates the mapping (row:string, column:string, time:int64) → string Example1 -- Explanation
  • 9.  The row key is a reversed URL (explained in the following slides)  The Contents column family contains the page contents  The Anchor column family contains the text of any anchors that reference the page  CNN’s home page is referenced by both the Sports Illustrated and the MY-look home pages, so the row contains columns named anchor:cnnsi.com and anchor:my.look.ca  Each anchor cell has one version; the contents column has several versions Example 2 – Explanation
  • 10.  The Bigtable implementation has three major components:  A library that is linked into every client  One master server  Many tablet servers Implementation
  • 11.  Data is dynamically partitioned based on the row key; each row range creates a tablet, which is the unit of distribution and load balancing  Clients can exploit this property by selecting their row keys so that they get good locality for their data accesses. In Example 2, pages in the same domain are grouped together by reversing the hostname components of the URLs: hadoop.apache.com and hbase.apache.com are stored next to each other as org.apache.hadoop and org.apache.hbase Dynamic Tablet Partitioning
  • 12.  Google Analytics is a service that helps webmasters analyze traffic patterns at their web sites  Tracks website traffic and makes it available to webmasters Application: Google Analytics
  • 13.  Personalized Search is a service that records user queries and clicks when using Google search  Personalized Search stores each user’s data in Bigtable. Each user has a unique user id and is assigned a row named by that user id  Enables a more personalized search experience Application: Personalized Search
  • 14.  Incredible scalability. The Cloud Big Table is designed to scale in direct proportion to the machines in the cluster  Simple administration. The Cloud Big Table handles upgrades, restarts, and replication transparently Benefits
  • 15.  Bigtable is a distributed system for storing data at Google  Bigtable clusters have been in production use since April 2005  In August 2006, more than sixty projects were using Bigtable.  Users like the performance and high availability provided by the Bigtable implementation Conclusion
  • 16.  Users like that they can scale by simply adding more machines to the system as their resource demands change over time  Several additional Bigtable features, such as support for secondary indices and infrastructure for building cross-data-center replicated Bigtables with multiple master replicas, are in the process of development. Conclusion cont.
  • 17.  Bigtable: A Distributed Storage System for Structured Data http://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf  Under the covers of the App Engine Database.  http://snarfed.org/datastore_talk.html  Wikipedia: BigTable  https://en.wikipedia.org/wiki/BigTable  Cloud BigTable Beta  https://cloud.google.com/bigtable/docs/  Introducing Google Cloud BigTable  http://googlecloudplatform.blogspot.com/2015/05/introducing-Google-Cloud-Bigtable.html  CS262B Advanced Topics in Computer Systems. Spring 2009. http://www.eecs.berkeley.edu/~culler/cs262b/summary/bigtable.html  Tudorica, Bogdan George, and Cristian Bucur. "A comparison between several NoSQL databases with comments and notes." In Roedunet International Conference (RoEduNet), 2011 10th, pp. 1-5. IEEE, 2011.  Lee, Ken Ka-Yin, Wai-Choi Tang, and Kup-Sze Choi. "Alternatives to relational database: comparison of NoSQL and XML approaches for clinical data storage." Computer methods and programs in biomedicine 110, no. 1 (2013): 99-109.  Nayak, Ameya, Anil Poriya, and Dikshay Poojary. "Type of NOSQL Databases and its Comparison with Relational Databases." International Journal of Applied Information Systems 5, no. 4 (2013): 16- 19. Sources
  • 18.  Vicknair, Chad, Michael Macias, Zhendong Zhao, Xiaofei Nan, Yixin Chen, and Dawn Wilkins. "A comparison of a graph database and a relational database: a data provenance perspective." In Proceedings of the 48th annual southeast regional conference, p. 42. ACM, 2010.  Chang, Fay, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. "BigTable: A distributed storage system for structured data." ACM Transactions on Computer Systems (TOCS) 26, no. 2 (2008): 4.  Leavitt, Neal. "Will NoSQL databases live up to their promise?." Computer 43, no. 2 (2010): 12-14.  Stonebraker, Michael. "SQL databases v. NoSQL databases." Communications of the ACM 53, no. 4 (2010): 10- 11.  Pallis, George. "Cloud computing: the new frontier of internet computing." IEEE Internet Computing 5 (2010): 70-73.  Vossen, Gottfried. "Big data as the new enabler in business and other intelligence." Vietnam Journal of Computer Science 1, no. 1 (2014): 3-14.  Zicari, Roberto V. "Big data: Challenges and opportunities." Big data computing (2014): 103-128.  Moniruzzaman, A. B. M., and Syed Akhter Hossain. "Nosql database: New era of databases for big data analytics-classification, characteristics and comparison." arXiv preprint arXiv:1307.0191 (2013).  Thankachan, Tomcy. "Google Big Table" November 27th 2012. PowerPoint presentation  Jacotin, Romain. "Lecture: The Google Big Table" October 9th 2014. PowerPoint presentation Sources cont.