Mobile Data with Couchbase Lite
&
Big Data HPCC Systems
By Fujio Turner
What is Couchbase Lite?
NoSQL JSON Document Database for Mobile
+
Your Code
Embedded Database
Couchbase Lite 0.5 MB
Why do I need Couchbase Lite?
Mobile Myths:
The mobile network is:
1. Always Available
2. Always High Performing
How Couchbase Lite tackles the Mobile Myths
Local data is always faster,
but I need to save the data non-locally
and/or I need to send data to other mobile devices
EZ Data Syncing with Couchbase Sync Gateway
https://github.com/couchbase/sync_gateway
How Sync Gateway Works
Channels
{"data":"yes"}
• Authentication & Sessions
• Definable channel rules via JavaScript
• http(s):// REST server
Written in:
Data Flow:
CRUD:
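The CRUD calls against the http(s):// REST server can be sketched as follows. This is a minimal sketch, assuming a Sync Gateway public REST API at a hypothetical `http://localhost:4984` and a hypothetical database named `mobile`; the helper names are illustrative and do not come from the deck. The helpers only build the requests; sending them (for example with `urllib.request.urlopen`) needs a running Sync Gateway.

```python
import json
import urllib.request
from urllib.parse import quote

# Hypothetical endpoint and database name; adjust to your own Sync Gateway.
BASE_URL = "http://localhost:4984"
DB = "mobile"


def doc_url(doc_id, rev=None):
    """Build the REST URL for one document, optionally pinned to a revision."""
    url = f"{BASE_URL}/{DB}/{quote(doc_id, safe='')}"
    if rev is not None:
        url += f"?rev={quote(rev, safe='')}"
    return url


def create_or_update(doc_id, body, rev=None):
    """PUT a JSON document; an update passes the current revision."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        doc_url(doc_id, rev), data=data, method="PUT",
        headers={"Content-Type": "application/json"},
    )


def read(doc_id):
    """GET a document by ID."""
    return urllib.request.Request(doc_url(doc_id), method="GET")


def delete(doc_id, rev):
    """DELETE a document; the current revision is required."""
    return urllib.request.Request(doc_url(doc_id, rev), method="DELETE")


req = create_or_update("doc1", {"data": "yes"})
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the calls easy to inspect before pointing them at a real server.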
Who is using Couchbase Lite?
How
Uses Couchbase Lite
https://youtu.be/tYolHnbCavA
What Big Data solution is ready for the next 20-plus years?
Who is LexisNexis?
LexisNexis is a provider of legal, tax, regulatory, news, business information, and analysis to legal, corporate, government, accounting and academic markets.
LexisNexis has been in business since 1977 with over 30,000 employees worldwide.
What is HPCC Systems?
LexisNexis Risk is the division of LexisNexis which focuses on data, Big Data processing, linking and vertical expertise, and supports HPCC Systems as an open source project under the Apache 2.0 License.
Comparison
JAVA | C++
Petabytes | Exabytes
Since 2005 | Since 2000
Block Based | File Based
? ? ? ? ? ? | Thor, Roxie
1-80,000 Jobs/day
In-Memory: 30 - 40 Jobs/min*
Indexed: 2K-3K Jobs/sec*
Non-Indexed: 4-1,040,000 Jobs/day
*based on job (size / result set / complexity)
Architecture
Thor: "I can query all or part of your data."
Single Threaded
Hard Disk
Index (optional)
Roxie: "I'm sub-second fast."
Multi-Threaded
Hard Disk
Index (optional)
In-memory
SSD
Either/Both
Business, Development, Customers
[Chart: Non-Indexed Full Data Set, scale 1 to 20]
http://hpccsystems.com/why-hpcc/benchmarks
How is Data Stored on HPCC Systems?
Example: Customer Data May 2010 (300GB File)
Name State Age
Kevin CA 45
Mark MI 27
Sara FL 64
Store Data
File Name: ~/customers_2010-05
Thor Master
Thor Slaves: K.. CA 45 | M.. MI 27 | S.. FL 64
Data is distributed evenly in the cluster with replica copies and is seen as one file (~/customers_2010-05).
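The idea of spreading one logical file evenly over the slave nodes with replica copies can be illustrated with a toy example. This is a sketch only: round-robin placement and putting replicas on the next node in a ring are assumptions made for illustration, not a description of how HPCC Systems actually places file parts, and `distribute` is a hypothetical helper.

```python
def distribute(records, num_nodes, replicas=1):
    """Spread records round-robin over nodes and copy each to neighbours.

    Returns one dict per node: {"primary": [...], "replica": [...]}.
    """
    if not 0 <= replicas < num_nodes:
        raise ValueError("replicas must be smaller than the number of nodes")
    nodes = [{"primary": [], "replica": []} for _ in range(num_nodes)]
    for i, record in enumerate(records):
        home = i % num_nodes
        nodes[home]["primary"].append(record)
        # Replica copies go to the following nodes in the ring (assumption).
        for r in range(1, replicas + 1):
            nodes[(home + r) % num_nodes]["replica"].append(record)
    return nodes


customers = [("Kevin", "CA", 45), ("Mark", "MI", 27), ("Sara", "FL", 64)]
for n, node in enumerate(distribute(customers, num_nodes=3)):
    print(n, node)
```

With three records on three nodes, each node holds one primary record and one replica of its neighbour's record, so losing a single node loses no data.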
Dali: File Location & Job Scheduler
File locations are stored on disk.
Example query: What state do most people live in?
ESP
1a. A pre-compiled query is triggered. (Mostly used in Roxie)
1b. Ad-hoc query.
2. Query is sent to Dali (File Location & Job Scheduler) to get file locations.
3. Job is placed in a queue to be sent to the Thor Master. The Thor Master coordinates job execution on the Thor Slave nodes.
Jobs are done locally on the slaves and/or coordinated globally by the master.
4. The job result is returned, optionally grouped and sorted at run time:
MI 500
CA 120
FL 7
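The query "What state do most people live in?" can be sketched as a local step on each slave followed by a global step on the master. This is a minimal illustration in plain Python, not ECL and not HPCC Systems code; the function names and the sample file parts are hypothetical.

```python
from collections import Counter


def count_states_locally(rows):
    """Work done on one slave: count rows per state in its own file part."""
    return Counter(state for _name, state, _age in rows)


def merge_and_sort(partial_counts):
    """Work coordinated by the master: sum partial counts, sort descending."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return sorted(total.items(), key=lambda item: (-item[1], item[0]))


# Three hypothetical file parts, one per slave node.
parts = [
    [("Kevin", "CA", 45), ("Ann", "MI", 31)],
    [("Mark", "MI", 27)],
    [("Sara", "FL", 64), ("Raj", "MI", 52), ("Lee", "CA", 38)],
]
result = merge_and_sort(count_states_locally(p) for p in parts)
print(result)  # [('MI', 3), ('CA', 2), ('FL', 1)]
```

Each slave only touches its own part of the file; only the small per-state counts travel to the master for the final merge and sort.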
Multiple other actions can be done on the data in a single job:
SORT, GROUP, DEDUP, JOIN, MERGE, BETWEEN, LENGTH, REGEX, ROUND, SUM,
COUNT, TRIM, WHEN, AVE, CASE, NORMALIZE, DENORMALIZE, K-MEANS, more ...
Query is Completed in a Single Job, Asynchronously
Inputs: ~/facebook_2013, ~/twitter_2013
Index of ~/facebook_2013 (optional)
Filter: Country = 'US'
Steps: Join, Sort, Count, Group, Classification
(ROXIE) 0.27 seconds to (THOR) a few hours
Speed - Part 1: Indexing
• index per file
• customize by field(s)
File Name: ~/customers_2010-05
Index File Name: ~/customers_2010-05_index
Non-Indexed: 1 40
To
Indexed: 1 200
Example Index
male row #345
female row #4
male row #97
female row #267
Example Index
CA row #3
MI row #17
MI row #4
FL row #5
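The example indexes above map a field value to the row numbers that contain it. A minimal sketch of that idea, assuming an in-memory list of rows and hypothetical helper names (this is not how HPCC Systems stores its index files):

```python
from collections import defaultdict


def build_index(rows, field):
    """Map each value of `field` to the row numbers where it occurs."""
    index = defaultdict(list)
    for row_number, row in enumerate(rows):
        index[row[field]].append(row_number)
    return dict(index)


def lookup(rows, index, value):
    """Fetch matching rows via the index instead of scanning every row."""
    return [rows[n] for n in index.get(value, [])]


rows = [
    {"name": "Kevin", "state": "CA", "age": 45},
    {"name": "Mark", "state": "MI", "age": 27},
    {"name": "Sara", "state": "FL", "age": 64},
    {"name": "Ann", "state": "MI", "age": 31},
]
state_index = build_index(rows, "state")
print(state_index)
print(lookup(rows, state_index, "MI"))
```

A separate index per field (state, gender, and so on) is what "customize by field(s)" amounts to in this toy model: each index answers one kind of lookup without a full scan.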
Speed - Part 2: Roxie
Roxie Master
Roxie Slaves
Index In-Memory
or
Index In-Memory & Part or All Data
Roxie is Multi-Threaded
SSDs are OK - write few / read many
2004
Common Cluster
Thor Master, Thor Slaves, Dali, ESP, Roxie Master, Roxie Slaves
Data is a mix of structured and unstructured. Use Thor to do ETL and send results to Roxie for user queries.
HPCC Systems 5.2
New JSON file support
Flow Data
From: Sync Gateway
To: HPCC Systems
{"data":"yes"}
Sync Gateway's Webhooks API lets you catch every JSON document coming into Sync Gateway.
https://github.com/couchbase/sync_gateway/wiki/Webhooks
Couchbase Lite to HPCC Systems Transport
{"data":"yes"}
A simple Python web server that catches every HTTP POST from Sync Gateway and writes it to a file for HPCC Systems to store.
https://github.com/househippo
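A transport of this kind could look roughly like the sketch below. It is not the author's code from the repository above; it is a minimal illustration using only the Python standard library, and the output file name, host, port and handler name are all assumptions. It accepts JSON bodies sent by HTTP POST and appends them, one document per line, to a file that could later be loaded into HPCC Systems.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical output path; HPCC Systems would later ingest this file.
OUTPUT_FILE = "sync_gateway_events.json"
_write_lock = threading.Lock()


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        try:
            doc = json.loads(raw)
        except ValueError:
            # Reject bodies that are not valid JSON.
            self.send_response(400)
            self.end_headers()
            return
        # One JSON document per line keeps the file easy to load in bulk.
        with _write_lock, open(OUTPUT_FILE, "a", encoding="utf-8") as out:
            out.write(json.dumps(doc) + "\n")
        self.send_response(200)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=8000):
    """Create the server; call serve_forever() on the result to run it."""
    return HTTPServer((host, port), WebhookHandler)
```

To run it, call `make_server().serve_forever()` and point a Sync Gateway webhook event handler at the server's URL.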
Learning More - Couchbase Lite
INSTALL in 5 Minutes
Download: http://couchbase.com/download
Source Code: https://github.com/couchbase
http://developer.couchbase.com/mobile/get-started/get-started-mobile/index.html
Mountain View, CA
San Francisco, CA
Learning More - HPCC Systems
INSTALL in 5 Minutes
Download: http://hpccsystems.com/download/
or
Source Code: https://github.com/hpcc-systems
Atlanta, GA
Mountain View, CA
https://youtu.be/8SV43DCUqJg

NoSQL Couchbase Lite & BigData HPCC Systems

  • 1. Mobile Data with Couchbase Lite ! &! Big Data HPCC Systems By Fujio Turner
  • 3. What is Couchbase Lite? A NoSQL JSON document database for mobile.
  • 5. Why do I need Couchbase Lite?
  • 6. Why do I need Couchbase Lite? Mobile myths: the mobile network is (1) always available and (2) always high performing.
  • 7. How Couchbase Lite tackles the mobile myths: local data is always faster.
  • 8. How Couchbase Lite tackles the mobile myths: local data is always faster, but I need to save the data non-locally.
  • 9. How Couchbase Lite tackles the mobile myths: local data is always faster, but I need to save the data non-locally and/or I need to send data to other mobile devices.
  • 10. EZ data syncing with Couchbase Sync Gateway: https://github.com/couchbase/sync_gateway
  • 11. How Sync Gateway works: an http(s):// REST server with channels, authentication & sessions, and definable channel rules via JavaScript. Example document: {"data":"yes"}. Diagram labels: Written in, Data Flow, CRUD.
  • 12. Who is using Couchbase Lite?
  • 14. What Big Data solution is ready for the next 20-plus years?
  • 15. What is HPCC Systems? Who is LexisNexis? LexisNexis is a provider of legal, tax, regulatory, news, business information, and analysis to legal, corporate, government, accounting and academic markets. LexisNexis has been in business since 1977 with over 30,000 employees worldwide. LexisNexis Risk is the division of LexisNexis which focuses on data, Big Data processing, linking and vertical expertise, and supports HPCC Systems as an open source project under the Apache 2.0 License.
  • 16. Comparison: JAVA; C++; Petabytes; 1-80,000 Jobs/day; Since 2005; Exabytes; Since 2000; Indexed: 2K-3K Jobs/sec*; ? ? ? ? ? ?; Thor; Roxie; Block Based; File Based; In-Memory: 30-40 Jobs/min*; Non-Indexed: 4-1,040,000 Jobs/day. *Based on job (size / result set / complexity).
  • 17. Architecture: “I’m sub-second fast.” “I can query all or part of your data.” Thor: Single Threaded, Hard Disk, Index (optional). Roxie: Multi-Threaded, Hard Disk, Index (optional), In-memory, SSD. Either/Both.
  • 18. Business, Development, Customers. Benchmark chart, non-indexed full data set (chart values: 1, 20): http://hpccsystems.com/why-hpcc/benchmarks
  • 19. How is data stored on HPCC Systems? Example: customer data, May 2010, 300GB file. Columns: Name, State, Age. Rows: Kevin, CA, 45; Mark, MI, 27; Sara, FL, 64.
  • 20. Store Data: Data is distributed evenly in the cluster with replica copies and is seen as a file. File name: ~/customers_2010-05. Diagram: Thor Master and Thor Slaves, with the rows (Kevin CA 45; Mark MI 27; Sara FL 64) shown split across the slaves.
  • 21. Store Data: Dali (File Location & Job Scheduler). File locations are stored on disk. File name: ~/customers_2010-05. Diagram: Thor Master and Thor Slaves, with the rows (Kevin CA 45; Mark MI 27; Sara FL 64) shown split across the slaves.
  • 22. Query: What state do most people live in? Components: ESP, Dali (File Location & Job Scheduler), Thor Master, Thor Slaves. 1a. A pre-compiled query is triggered (mostly used in Roxie). 1b. Ad-hoc query. 2. Query is sent to Dali to get file locations.
  • 23. Query: What state do most people live in? 3. Job is placed in a queue to be sent to the Thor Master. The Thor Master coordinates job execution on the Thor Slave nodes.
  • 24. Query: What state do most people live in? Jobs are done locally on the slaves and/or coordinated globally by the master.
  • 25. Query: What state do most people live in? 4. Job result is returned, optionally grouped and sorted at run time. Result: MI 500; CA 120; FL 7.
  • 26. Query: What state do most people live in? Result: MI 500; CA 120; FL 7. Multiple other actions can be done on the data in a single job: SORT, GROUP, DEDUP, JOIN, MERGE, BETWEEN, LENGTH, REGEX, ROUND, SUM, COUNT, TRIM, WHEN, AVE, CASE, NORMALIZE, DENORMALIZE, K-MEANS, and more.
  • 27. Query is completed in a single job, asynchronously. Inputs: ~/facebook_2013 (Country = ‘US’) and ~/twitter_2013 (Country = ‘US’); index of ~/facebook_2013 (optional). Steps: Join, Sort, Count, Group, Classification. Run time: 0.27 seconds (ROXIE) to a few hours (THOR).
  • 28. Speed - Part 1: Indexing. One index per file, customized by field(s). File name: ~/customers_2010-05. Index file name: ~/customers_2010-05_index. Example index entries: CA row #3; MI row #17; MI row #4; FL row #5.
  • 30. Non-Indexed (chart values: 1, 40) to Indexed (chart values: 1, 200). Example index: male row #345; female row #4; male row #97; female row #267. Example index: CA row #3; MI row #17; MI row #4; FL row #5.
  • 31. Speed - Part 2: Roxie. Roxie Master and Roxie Slaves, each slave with an index. Index in memory.
  • 32. Speed - Part 2: Roxie. Index in memory, or index in memory plus part or all of the data.
  • 33. Speed - Part 2: Roxie. Roxie is multi-threaded. Index in memory, or index in memory plus part or all of the data.
  • 34. Speed - Part 2: Roxie. Roxie is multi-threaded. Index in memory, or index in memory plus part or all of the data. SSDs are OK: write few / read many.
  • 35. Speed - Part 2: Roxie. Roxie is multi-threaded. Index in memory, or index in memory plus part or all of the data. 2004.
  • 36. Common Cluster: Thor Master, Thor Slaves, Dali, ESP, Roxie Master, Roxie Slaves. Data is a mix of structured and unstructured. Use Thor to do ETL and send results to Roxie for user queries.
  • 37. HPCC Systems 5.2: new JSON file support.
  • 39. Sync Gateway’s Webhooks API lets you catch every JSON document coming into Sync Gateway, e.g. {"data":"yes"}.
  • 40. Couchbase Lite to HPCC Systems transport: a simple Python web server that catches every HTTP POST from Sync Gateway and writes it to a file for HPCC Systems to store.
  • 42. Learning More - Couchbase Lite. Install in 5 minutes. Download: http://couchbase.com/download. Source code: https://github.com/couchbase. Getting started: http://developer.couchbase.com/mobile/get-started/get-started-mobile/index.html. Locations: Mountain View, CA; San Francisco, CA.
  • 43. Learning More - HPCC Systems. Install in 5 minutes. Download: http://hpccsystems.com/download/ or source code: https://github.com/hpcc-systems. Video: https://youtu.be/8SV43DCUqJg. Locations: Atlanta, GA; Mountain View, CA.
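
Slide 40 describes the Couchbase Lite to HPCC Systems transport as a simple Python web server that catches HTTP POSTs from Sync Gateway and writes them to a file. The deck does not include that server's code, so the following is only a minimal sketch of the idea using the Python standard library. The function names, the output file name, the port, and the one-JSON-document-per-line layout are all assumptions made for illustration. Sync Gateway would also need its webhook pointed at this server's URL.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def append_document(output_path, body):
    """Append one JSON document per line to output_path.

    Returns False (and writes nothing) if body is not valid JSON.
    The line-per-document layout is an assumption, not from the slides.
    """
    try:
        doc = json.loads(body)
    except ValueError:  # covers JSONDecodeError and bad UTF-8
        return False
    with open(output_path, "a", encoding="utf-8") as out:
        out.write(json.dumps(doc) + "\n")
    return True


def make_handler(output_path):
    """Build a handler class that stores every POSTed JSON body."""

    class WebhookHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read exactly the number of bytes the client announced.
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            ok = append_document(output_path, body)
            self.send_response(200 if ok else 400)
            self.end_headers()

    return WebhookHandler


def serve(output_path="sync_gateway_docs.json", host="0.0.0.0", port=8000):
    """Run the receiver until interrupted (hypothetical defaults)."""
    HTTPServer((host, port), make_handler(output_path)).serve_forever()


if __name__ == "__main__":
    serve()
```

The resulting file could then be loaded onto the HPCC Systems cluster by whatever means the cluster normally uses. The sketch leaves out authentication, file rotation, and concurrent writers.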