NoSQL Night!
Singapore Spring@Pivotal User Group
Clarence J M Tauro
Sr. Instructor
Couchbase
About the Speaker
• Clarence J M Tauro – clarence@couchbase.com
– Senior Instructor, Couchbase
– ~11 Years Professional Teaching and Consulting Experience
– Worked at Pivotal – Instructor/Consultant for Spring/Spring
Security/Spring Web/Enterprise Integration with Spring/Spring
JMS/Spring Batch, Pivotal Hadoop/Cloud Foundry
– PhD in Computer Science from Christ University [thesis
accepted]
– Hard-core Dog lover
Disclaimer
• Disclaimer: The views expressed in this presentation
are our own and do not necessarily reflect the views of
Couchbase
Objectives
• Introduction to NoSQL
• Are ACID Properties always desirable?
• Basically available, Soft state, Eventually consistent
(BASE)
• The CAP Theorem
• Introducing Couchbase
• Couchbase Operations
Introduction
• RDBMS: the predominant technology for storing structured data in
web and business applications
• “One size fits all” thinking concerning data stores has been
questioned
• Apply NoSQL databases for the persistence layer (Polyglot
Programming)
ACID Properties
• ATOMICITY
• CONSISTENCY
• ISOLATION
• DURABILITY
Are ACID Properties always desirable?
• … But what about:
– Latency
– Partition Tolerance
– High Availability
– Scalability
BASE
• Basically available: the system is available, but not
necessarily all items in it at any given point in time
• Soft state: information (state) the user put into the system
will go away if the user doesn't maintain it
• Eventually consistent: after a certain time all nodes are
consistent, but at any given time this might not be the case
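The "eventually consistent" idea can be illustrated with a toy model. This is a minimal sketch, not how any particular database works: the class names, the three-replica default, and the explicit sync() step are all assumptions made for illustration.

```python
class Replica:
    """One copy of the data, held in a plain dict."""
    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Writes land on one replica and reach the others only on sync()."""
    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # (key, value) pairs not yet propagated

    def write(self, key, value):
        # The write is accepted right away by the first replica.
        self.replicas[0].data[key] = value
        self.pending.append((key, value))

    def read(self, key, replica_index):
        # A read may hit a replica that has not seen the write yet.
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        # Propagate every pending write to the remaining replicas.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica.data[key] = value
        self.pending.clear()
```

Between write() and sync(), a read from the second or third replica returns None; after sync() all replicas agree.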
NoSQL Common Traits
• Non-relational
• Schema-free/Schema-on-read
• Eventual consistency
• Open source
• Distributed
• “web-scale”
The CAP Theorem
• Consistency – can all
nodes see identical data,
at all times?
• Availability – can all
nodes be read from and
written to, at all times?
• Partition Tolerance – will
nodes function normally,
even when the cluster
breaks?
[Diagram: Consistency, Availability, Partition Tolerance: CHOOSE ANY TWO]
The CAP Theorem
• CP: Consistency and Partition Tolerance
- Immediately consistent data across a horizontally scaled
cluster, even with network problems
- Couchbase
• AP: Availability and Partition Tolerance
- Always services requests, across multiple data centers,
even with network problems, data eventually consistent
- Apache HBase or Cassandra, Couchbase (XDCR)
• CA: Consistency and Availability
- Always services requests with immediately consistent
data, in a vertically scaled system
- MySQL, Oracle, Microsoft SQL Server
What do you do with the Data?
Operational Use
•Real time intelligence
•Focus on data flows and
processes
•Extremely fast (in-memory)
reads
•Extremely fast (log append)
writes
•Improve the current
outcome
Analytical Use
•Batched workloads
•Vast data aggregations
•Retrospective analyses
•Focus on data pools
•Improve future outcomes
Hadoop vs. NoSQL
• NoSQL (operational velocity): real-time operational database
systems improve current outcomes
• Hadoop (analytical volume): batch-oriented analytical database
systems improve future outcomes
Types of NoSQL
• Key-value stores
• Wide Column stores
• Document stores
• Graph databases
Key-Value Stores
• The most common; not necessarily the most popular
• Key and a simple value
- Speed
- Scale
- Simplicity
• Find simple values by key extremely fast
user::1234  Clarence
user::1235  Melisa
user::1236  Michael
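As a rough illustration of the key-value model, a Python dict can stand in for the store. The get and put helper names are assumptions for this sketch, not the API of any real product.

```python
# A toy in-memory key-value store: opaque values looked up by key only.
store = {
    "user::1234": "Clarence",
    "user::1235": "Melisa",
    "user::1236": "Michael",
}


def get(key):
    """Return the value stored under key, or None if the key is absent."""
    return store.get(key)


def put(key, value):
    """Create or overwrite the value stored under key."""
    store[key] = value
```

The store never looks inside a value; all it can do is fetch or replace the whole value for a given key, which is what keeps lookups simple and fast.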
Document Stores
• Key and a structured value (document)
- Speed
- Scale
- Flexibility
• Read/write ever-changing data about people, places,
and things, at cloud-scale
user::1234 { name: 'Frank', age: 37, kids: ['Sue', 'Ann', 'Bob'] }
user::1235 { name: 'Carolyn', age: 56, kids: ['Tina'] }
user::1236 { name: 'Tessa', age: 24}
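Unlike a plain key-value store, a document store can look inside the value. The sketch below models the three documents above as Python dicts; the find_parents and add_kid helpers are hypothetical names chosen for illustration.

```python
# Each key maps to a structured document; documents need not share a schema.
documents = {
    "user::1234": {"name": "Frank", "age": 37, "kids": ["Sue", "Ann", "Bob"]},
    "user::1235": {"name": "Carolyn", "age": 56, "kids": ["Tina"]},
    "user::1236": {"name": "Tessa", "age": 24},
}


def find_parents(docs):
    """Return the keys of documents that have a non-empty 'kids' field."""
    return sorted(key for key, doc in docs.items() if doc.get("kids"))


def add_kid(docs, key, kid):
    """Append to 'kids', creating the field if the document lacks it."""
    docs[key].setdefault("kids", []).append(kid)
```

Note that user::1236 has no "kids" field at all, and adding one later requires no schema change.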
Wide Column Stores
• Key and nested set of tuples
- Write vast volumes of data, with eventually consistent
read access
user::1234
  name: text    Frank
  age:  number  37
  kid:  text    Sue, Ann, Bob
user::1235
  name: text    Carolyn
  age:  number  56
  kid:  text    Tina
Graph Databases
• Linked list of keyed objects
- Relationships
• Monitor complex, dynamically networked connections
[Diagram: three linked nodes]
user::1234  Frank, 37 (linked to Sue, Ann, Bob)
user::1235  Carolyn, 56 (linked to Tina)
user::1236  Tessa, 24
Polyglot Programming
• Enterprises will have a variety of different data storage
technologies for different kinds of data
• We need to ask how we want to manipulate the data.
This will help us figure out which persistence
technologies are appropriate
- User Sessions: Couchbase (Memcached)/Redis
- Financial Data: RDBMS
- Shopping Cart: Riak/Couchbase (Memcached)
- Recommendation Systems: Neo4J
- Product Catalog: Couchbase/MongoDB
- Reporting: RDBMS/Couchbase Views
- Analytics: Couchbase/Cassandra
History of Couchbase
• NorthScale (later renamed Membase) developed a key-value storage
engine
• CouchOne was the company behind the Apache CouchDB database
project
• Membase and CouchOne joined forces in February 2011 to create
Couchbase, the first and only provider of a comprehensive,
end-to-end family of NoSQL database products
What is Couchbase Server?
• Couchbase Server
• Is a “document” database solution
• Has key/value based orientation
• Is geared for JSON
• Has no tables and no fixed schema
• Runs on a networked cluster of nodes
• Is highly scalable
• Has lightning-fast reads and writes
• Has caching and persistence layers
• Fails over automatically
• Couchbase Server is best suited for fast-changing data
items of relatively small size
JavaScript Object Notation
{
     "firstName": "Clarence",
     "lastName": "Tauro",
     "age": 25,
     "address":
     {
         "streetAddress": "21 2nd Street",
         "city": "Bangalore",
         "state": "KA",
         "postalCode": "560059"
     },
     "phoneNumber":
     [
         {
           "type": "home",
           "number": "988 621-7674"
         }
     ]
}
JSON is a lightweight data-interchange
format that is easy for humans to read and
write
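JSON is just as easy for programs to handle. As a small sketch, Python's standard json module can parse the document above into nested dicts and lists; the variable names are illustrative.

```python
import json

raw = """
{
    "firstName": "Clarence",
    "lastName": "Tauro",
    "age": 25,
    "address": {
        "streetAddress": "21 2nd Street",
        "city": "Bangalore",
        "state": "KA",
        "postalCode": "560059"
    },
    "phoneNumber": [
        {"type": "home", "number": "988 621-7674"}
    ]
}
"""

# JSON objects become dicts and JSON arrays become lists.
person = json.loads(raw)

# Nested fields are reached by chaining keys.
city = person["address"]["city"]

# Arrays of objects can be filtered like any Python list.
home_numbers = [p["number"] for p in person["phoneNumber"] if p["type"] == "home"]
```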
What is a Couchbase Document?
Document Content
(Most recent in RAM
and persisted to disk)
{
  "visibility": "PRIVATE",
  "name": "Eclectic Summer Mix",
  "userName": "suzyqrocks",
  "type": "org.couchmusic.domain.Playlist",
  "created": 1422138028037,
  "updated": 1422138028072,
  "tracks": []
}
Document Metadata
(All keys unique
and kept in RAM)
{
  "id": "playlist:12345",
  "rev": "1-0004ebc0000000000",
  "flags": 0,
  "expiration": 0,
  "type": "json"
}
Couchbase Server Architecture
• Technology Stack for Data Manager:
– Couchbase Client SDK (“Smart Client”)
– Client Query API and Query Engine (Views)
– Cache Layer: RAM Cache
– Persistence Layer: Couchbase
• Technology Stack for Cluster Manager:
– Node Level – multiple vBuckets
• Default: 1024 vBuckets divided by the number of nodes
– Cluster Level – multiple nodes (with 1 .. * buckets) [1]
– Datacenter Level – multiple clusters (optional XDCR) [2]
– Erlang (cluster management and process supervision) [3]
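To make the vBucket idea concrete, here is a simplified sketch of how a key could be mapped to a vBucket and then to a node. It assumes a plain CRC32-modulo hash and a round-robin vBucket assignment; the actual Couchbase client hashing and cluster map differ in detail, and the function and node names are hypothetical.

```python
import zlib

NUM_VBUCKETS = 1024  # default vBucket count mentioned on the slide


def vbucket_for(key, num_vbuckets=NUM_VBUCKETS):
    """Map a document key to a vBucket id (simplified: CRC32 modulo count)."""
    return zlib.crc32(key.encode("utf-8")) % num_vbuckets


def build_vbucket_map(nodes, num_vbuckets=NUM_VBUCKETS):
    """Assign vBuckets to nodes round-robin; the list index is the vBucket id."""
    return [nodes[vb % len(nodes)] for vb in range(num_vbuckets)]


def node_for(key, vbucket_map):
    """Look up the node that owns the vBucket for this key."""
    return vbucket_map[vbucket_for(key, len(vbucket_map))]
```

With four nodes, each node ends up owning 1024 / 4 = 256 vBuckets. Because the key-to-vBucket step never changes, adding a node only requires moving some vBuckets and updating the map.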
Anatomy of a Couchbase Application
[Diagram: Couchbase Client Software talking to three nodes, each
running NS Server and EP Engine]
1. REST request to port 8091
2. HTTP response with the Cluster Map {Server List}
3. Client becomes a Smart Client
4. Connect to CRUD Data Port 11210
5. Create, Read, Update and Delete Documents
Single Node – Couchbase Write Operation
[Diagram: the App Server writes Doc 1 to a Couchbase Server Node.
Doc 1 is held in the Managed Cache, placed on the Replication Queue
(to other node) and on the Disk Queue, and written to Disk.]
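The write path on this slide (the document lands in the managed cache, the write is acknowledged, and replication and disk persistence happen from queues) can be modelled with a toy class. This is a sketch under simplifying assumptions, not Couchbase's actual code; the class and method names are hypothetical.

```python
from collections import deque


class Node:
    """Toy model of a single node's write path."""

    def __init__(self):
        self.cache = {}                   # managed cache (RAM)
        self.disk = {}                    # persisted documents
        self.disk_queue = deque()         # writes waiting to be persisted
        self.replication_queue = deque()  # writes waiting to go to other nodes

    def set(self, key, doc):
        # The write goes to RAM first and is queued for the slower steps.
        self.cache[key] = doc
        self.replication_queue.append((key, doc))
        self.disk_queue.append((key, doc))
        # Acknowledged before disk or replicas have been updated.
        return "stored"

    def flush_disk_queue(self):
        # Persist queued writes to disk in arrival order.
        while self.disk_queue:
            key, doc = self.disk_queue.popleft()
            self.disk[key] = doc

    def drain_replication_queue(self, replica):
        # Copy queued writes into the memory of another node.
        while self.replication_queue:
            key, doc = self.replication_queue.popleft()
            replica.cache[key] = doc
```

An update follows the same path: a second set() for the same key replaces the cached document and queues the new version.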
Single Node – Couchbase Update Operation
[Diagram: the App Server sends Doc 1’ to a Couchbase Server Node.
Doc 1’ replaces Doc 1 in the Managed Cache, goes through the
Replication Queue (to other node) and the Disk Queue, and replaces
Doc 1 on Disk.]
Single Node – Couchbase Read Operation
[Diagram: the App Server issues GET Doc1 to a Couchbase Server Node.
Doc 1 is present in the Managed Cache and on Disk; the Replication
Queue (to other node) and Disk Queue are also shown.]
Single Node – Couchbase Cache Eviction
[Diagram: App Server and Couchbase Server Node with Managed Cache,
Replication Queue (to other node), Disk Queue and Disk; Doc 1
through Doc 6 are shown in the cache and on disk.]
Single Node – Couchbase Cache Miss
[Diagram: the App Server issues GET Doc1 to a Couchbase Server Node
whose Managed Cache holds Doc 2 through Doc 5; Doc 1 is read from
Disk; the Replication Queue (to other node) and Disk Queue are also
shown.]
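The read, eviction and cache-miss slides can be tied together in one toy model: a bounded RAM cache in front of a dict standing in for disk. The least-recently-used eviction policy here is an assumption chosen for simplicity and is not claimed to be Couchbase's policy; the class and attribute names are hypothetical.

```python
from collections import OrderedDict


class CachedNode:
    """Toy read path: bounded RAM cache backed by a dict acting as disk."""

    def __init__(self, cache_capacity):
        self.cache_capacity = cache_capacity
        self.cache = OrderedDict()  # least recently used entry comes first
        self.disk = {}
        self.misses = 0

    def set(self, key, doc):
        self.disk[key] = doc  # persisted immediately here, for brevity
        self._cache_put(key, doc)

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)  # cache hit: mark as recently used
            return self.cache[key]
        self.misses += 1                 # cache miss: fall back to disk
        doc = self.disk[key]
        self._cache_put(key, doc)        # bring the document back into RAM
        return doc

    def _cache_put(self, key, doc):
        self.cache[key] = doc
        self.cache.move_to_end(key)
        while len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)  # evict the least recently used
```

Evicted documents are only dropped from RAM; they remain on disk, so a later GET still succeeds, just more slowly.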
Other Features of Couchbase 4.0
• Multi-dimensional Scaling
• N1QL
• XDCR
Training
Get Started with Couchbase Server 4.0:
www.couchbase.com/beta
Get Trained on Couchbase: http://training.couchbase.com
CD220: Developing Couchbase NoSQL Applications
Oct 20 – Oct 23 2015
CS300: Couchbase NoSQL Server Administration
Nov 17 – Nov 20 2015
Enroll Today!
Questions?


Editor's Notes

  1. Footnotes:
     [1] Most modern operating systems want a few gigabytes of RAM (Windows usually a bit more than Linux), and other processes such as monitoring agents may be running on these nodes. I/O caching is also needed, both for views and for the general functioning of the system. We typically recommend allocating about 60-80% of a system's RAM to Couchbase's quota, leaving the rest as headroom for memory needs outside Couchbase itself.
     [2] Cross Datacenter Replication (XDCR) is covered later in this course.
     [3] See https://blog.couchbase.com/tag/erlang
  2. The Memcached client also uses a server list but, in contrast to the Couchbase client, makes no REST calls; it works only over port 11210 and is very fast. It uses a proprietary Memcached protocol.
  3. Write operation:
     1. A set request comes in from the application.
     2. Couchbase Server responds that the key is written.
     3. Couchbase Server then replicates the data to memory on the other nodes.
     4. At the same time, the data is put into a write queue to be persisted to disk.