SlideShare une entreprise Scribd logo
1  sur  32
Télécharger pour lire hors ligne
Apache Phoenix with 
Actor Model (Akka.io) 
for Real-time Big Data 
Programming Stack 
Why we still need SQL for Big Data ? 
How to make Big Data more responsive and faster ? 
By http://nguyentantrieu.info 
Tech Lead at eClick team - FPT Online
Contents 
1. What is Big data and Why ? 
2. When standard relational database (Oracle,MySQL, ...) is 
not good enough 
3. Common problems in big data system 
4. Introducing open-source tools in Big Data System 
a. Apache Phoenix for ad-hoc query 
b. Actor Model and Akka.io for reactive data processing
What Does Big Data Actually Mean? 
“Big data means data 
that cannot fit easily into 
a standard relational database.” 
Hal Varian- Chief Economist, Google 
http://www.brookings.edu/blogs/techtank/posts/2014/09/11-big-data-definition
When standard relational database 
(Oracle,MySQL, ...) is not good enough 
the “analytic system” MySQL database from a 
startup, tracking all actions in mobile games: 
iOS, Android, ...
Complex analytic system and the “scale” pain
Definition from the crowd 
“Big data is a term describing the storage 
and analysis of large and or complex 
data sets using a series of techniques 
including, but not limited to: NoSQL, 
MapReduce and machine learning.” 
Jonathan Stuart Ward and Adam Barker 
Source: 
http://arxiv.org/abs/1309.5821 
http://www.technologyreview.com/view/519851/the-big-data-conundrum-how-to-define- 
it/
“Chaotic” fact and the demand 
80% of that data is unstructured or “chaotic” 
Photos, videos and social media posts - data that says so much 
about us - but cannot be analyzed via traditional methods 
Demand: 
“Finding order among chaos”
3 common problems in Big Data System 
1. Size: the volume of the datasets is a critical 
factor. 
2. Complexity: the structure, behaviour and 
permutations of the datasets is a critical 
factor. 
3. Technologies: the tools and techniques 
which are used to process a sizable or 
complex dataset is a critical factor.
Introducing open-source tools in Big Data System 
Apache Phoenix 
as SQL ad-hoc query 
engine 
Actor Model as nano-service 
for reactive data 
computation 
in the dawn of “Fast data”
Some innovative tools were born 
in the dawn of Big Data Age
But could an elephant fly without wings ?
But a phoenix can fly !
What is Apache Phoenix ? 
Apache Phoenix is a SQL skin over HBase. 
It means scaling Phoenix just like scale-up and 
scale-out the Hbase
Phoenix 
SQL Engine
Interesting features of Apache Phoenix 
● Embedded JDBC driver implements the majority of java.sql interfaces, 
including the metadata APIs. 
● Allows columns to be modeled as a multi-part row key or key/value cells. 
● Full query support with predicate push down and optimal scan key 
formation. 
● DDL support: CREATE TABLE, DROP TABLE, and ALTER TABLE for 
adding/removing columns. 
● Versioned schema repository. Snapshot queries use the schema that was 
in place when data was written. 
● DML support: UPSERT VALUES for row-by-row insertion, UPSERT 
SELECT for mass data transfer between the same or different tables, and 
DELETE for deleting rows. 
● Limited transaction support through client-side batching. 
● Single table only - no joins yet and secondary indexes are a work in 
progress. 
● Follows ANSI SQL standards whenever possible 
● Requires HBase v 0.94.2 or above 
● 100% Java
the Phoenix table schema
Setting JDBC Phoenix Driver
Phoenix and SQL tool in Eclipse 4
Phoenix vs Hive 
(running over HDFS and HBase) 
http://phoenix.apache.org/performance.html
Actor Model in the dawn of “Fast data”
http://youtu.be/TnLiEWglqHk - Google I/O 2014 - The dawn of "Fast Data"
The paper: MillWheel: Fault-Tolerant Stream 
Processing at Internet Scale
What is actor model ? 
● Carl Hewitt defined the Actor 
Model in 1973 as a mathematical 
theory that treats “Actors” as the 
universal primitives of concurrent 
digital computation. 
● A fitting model for heavily-parallel 
processing in a cloud environment
What actor model ?
is the framework for 
implementing Actor computation
Inspired by MillWheel of Google and Storm of 
Twitter, I have developed my own framework, the 
“Rfx” (Reactive Functor Extension) with Akka as 
core
The pipeline of finding social trends 
in real-time analytics
Facebook Social Trending from a website
Quick demo 
Using Akka (Rfx) and Apache Phoenix 
for Social Media Real-time Analytics
Links for self-study and research 
Actor Model and Programming: 
● http://nguyentantrieu.info/blog/the-architecture-for-real-time-event-processing-with- 
reactive-actor-model 
● http://www.slideshare.net/drorbr/the-actor-model-towards-better-concurrency 
● http://www.infoq.com/articles/reactive-cloud-actors 
● http://www.mc2ads.com/p/rfx-for-big-data-developer.html 
Apache Phoenix 
● http://java.dzone.com/articles/apache-phoenix-sql-driver 
● http://phoenix.apache.org/Phoenix-in-15-minutes-or-less.html 
Big Data and Data Science 
● http://www.mc2ads.com and http://www.mc2ads.org 
● http://datascience101.wordpress.com 
● http://lambda-architecture.net 
● http://www.bigdata-startups.com 
● https://www.coursera.org/course/datasci

Contenu connexe

Tendances

Apache Falcon - Simplifying Managing Data Jobs on Hadoop
Apache Falcon - Simplifying Managing Data Jobs on HadoopApache Falcon - Simplifying Managing Data Jobs on Hadoop
Apache Falcon - Simplifying Managing Data Jobs on Hadoop
DataWorks Summit
 
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Spark Summit
 
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon
 
Hadoop first ETL on Apache Falcon
Hadoop first ETL on Apache FalconHadoop first ETL on Apache Falcon
Hadoop first ETL on Apache Falcon
DataWorks Summit
 

Tendances (20)

Apache Falcon - Simplifying Managing Data Jobs on Hadoop
Apache Falcon - Simplifying Managing Data Jobs on HadoopApache Falcon - Simplifying Managing Data Jobs on Hadoop
Apache Falcon - Simplifying Managing Data Jobs on Hadoop
 
Hive present-and-feature-shanghai
Hive present-and-feature-shanghaiHive present-and-feature-shanghai
Hive present-and-feature-shanghai
 
Practical Kerberos with Apache HBase
Practical Kerberos with Apache HBasePractical Kerberos with Apache HBase
Practical Kerberos with Apache HBase
 
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
 
Running Zeppelin in Enterprise
Running Zeppelin in EnterpriseRunning Zeppelin in Enterprise
Running Zeppelin in Enterprise
 
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ SalesforceHBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
HBaseCon2017 HBase/Phoenix @ Scale @ Salesforce
 
Hadoop first ETL on Apache Falcon
Hadoop first ETL on Apache FalconHadoop first ETL on Apache Falcon
Hadoop first ETL on Apache Falcon
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
Creating the Internet of Your Things
Creating the Internet of Your ThingsCreating the Internet of Your Things
Creating the Internet of Your Things
 
Getting involved with Open Source at the ASF
Getting involved with Open Source at the ASFGetting involved with Open Source at the ASF
Getting involved with Open Source at the ASF
 
Sharing metadata across the data lake and streams
Sharing metadata across the data lake and streamsSharing metadata across the data lake and streams
Sharing metadata across the data lake and streams
 
Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming DataDruid: Sub-Second OLAP queries over Petabytes of Streaming Data
Druid: Sub-Second OLAP queries over Petabytes of Streaming Data
 
Apache Falcon at Hadoop Summit 2013
Apache Falcon at Hadoop Summit 2013Apache Falcon at Hadoop Summit 2013
Apache Falcon at Hadoop Summit 2013
 
Why is my Hadoop* job slow?
Why is my Hadoop* job slow?Why is my Hadoop* job slow?
Why is my Hadoop* job slow?
 
Transactional SQL in Apache Hive
Transactional SQL in Apache HiveTransactional SQL in Apache Hive
Transactional SQL in Apache Hive
 
Apache Atlas: Governance for your Data
Apache Atlas: Governance for your DataApache Atlas: Governance for your Data
Apache Atlas: Governance for your Data
 
HBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBaseHBaseCon2017 Democratizing HBase
HBaseCon2017 Democratizing HBase
 
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and TroubleshootingApache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
 
Data-Center Replication with Apache Accumulo
Data-Center Replication with Apache AccumuloData-Center Replication with Apache Accumulo
Data-Center Replication with Apache Accumulo
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFi
 

En vedette

En vedette (10)

High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...
High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...
High Scale Relational Storage at Salesforce Built with Apache HBase and Apach...
 
HBase Secondary Indexing
HBase Secondary Indexing HBase Secondary Indexing
HBase Secondary Indexing
 
Salesforce External Objects for Big Data
Salesforce External Objects for Big DataSalesforce External Objects for Big Data
Salesforce External Objects for Big Data
 
Time-Series Apache HBase
Time-Series Apache HBaseTime-Series Apache HBase
Time-Series Apache HBase
 
Apache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAseApache phoenix: Past, Present and Future of SQL over HBAse
Apache phoenix: Past, Present and Future of SQL over HBAse
 
Apache Phoenix + Apache HBase
Apache Phoenix + Apache HBaseApache Phoenix + Apache HBase
Apache Phoenix + Apache HBase
 
Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix Hortonworks Technical Workshop: HBase and Apache Phoenix
Hortonworks Technical Workshop: HBase and Apache Phoenix
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
 
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data WarehouseApache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
Apache Phoenix and Apache HBase: An Enterprise Grade Data Warehouse
 
Amazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best Practices
 

Similaire à Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming Stack

Achieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
Achieve Sub-Second Analytics on Apache Kafka with Confluent and ImplyAchieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
Achieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
confluent
 
The Recent Pronouncement Of The World Wide Web (Www) Had
The Recent Pronouncement Of The World Wide Web (Www) HadThe Recent Pronouncement Of The World Wide Web (Www) Had
The Recent Pronouncement Of The World Wide Web (Www) Had
Deborah Gastineau
 

Similaire à Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming Stack (20)

Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summitAnalysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
Analysis-of-Major-Trends-in-big-data-analytics-slim-baltagi-hadoop-summit
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
 
Analysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data AnalyticsAnalysis of Major Trends in Big Data Analytics
Analysis of Major Trends in Big Data Analytics
 
Atlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slidesAtlanta Data Science Meetup | Qubole slides
Atlanta Data Science Meetup | Qubole slides
 
Etl is Dead; Long Live Streams
Etl is Dead; Long Live StreamsEtl is Dead; Long Live Streams
Etl is Dead; Long Live Streams
 
Enabling SQL Access to Data Lakes
Enabling SQL Access to Data LakesEnabling SQL Access to Data Lakes
Enabling SQL Access to Data Lakes
 
Real-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFiReal-Time Data Flows with Apache NiFi
Real-Time Data Flows with Apache NiFi
 
Achieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
Achieve Sub-Second Analytics on Apache Kafka with Confluent and ImplyAchieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
Achieve Sub-Second Analytics on Apache Kafka with Confluent and Imply
 
Why apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics FrameworksWhy apache Flink is the 4G of Big Data Analytics Frameworks
Why apache Flink is the 4G of Big Data Analytics Frameworks
 
Hadoop and IoT Sinergija 2014
Hadoop and IoT Sinergija 2014Hadoop and IoT Sinergija 2014
Hadoop and IoT Sinergija 2014
 
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksOverview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
 
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksOverview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
 
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksOverview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
 
Hadoop and IoT Sinergija 2014
Hadoop and IoT Sinergija 2014Hadoop and IoT Sinergija 2014
Hadoop and IoT Sinergija 2014
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
 
The Recent Pronouncement Of The World Wide Web (Www) Had
The Recent Pronouncement Of The World Wide Web (Www) HadThe Recent Pronouncement Of The World Wide Web (Www) Had
The Recent Pronouncement Of The World Wide Web (Www) Had
 
Big data or big deal
Big data or big dealBig data or big deal
Big data or big deal
 
Big data apache spark + scala
Big data   apache spark + scalaBig data   apache spark + scala
Big data apache spark + scala
 
Symphony Driver Essay
Symphony Driver EssaySymphony Driver Essay
Symphony Driver Essay
 
Building and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache AirflowBuilding and deploying LLM applications with Apache Airflow
Building and deploying LLM applications with Apache Airflow
 

Plus de Trieu Nguyen

[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP
Trieu Nguyen
 

Plus de Trieu Nguyen (20)

Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdfBuilding Your Customer Data Platform with LEO CDP in Travel Industry.pdf
Building Your Customer Data Platform with LEO CDP in Travel Industry.pdf
 
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel BusinessBuilding Your Customer Data Platform with LEO CDP - Spa and Hotel Business
Building Your Customer Data Platform with LEO CDP - Spa and Hotel Business
 
Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP Building Your Customer Data Platform with LEO CDP
Building Your Customer Data Platform with LEO CDP
 
How to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDPHow to track and improve Customer Experience with LEO CDP
How to track and improve Customer Experience with LEO CDP
 
[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP[Notes] Customer 360 Analytics with LEO CDP
[Notes] Customer 360 Analytics with LEO CDP
 
Leo CDP - Pitch Deck
Leo CDP - Pitch DeckLeo CDP - Pitch Deck
Leo CDP - Pitch Deck
 
LEO CDP - What's new in 2022
LEO CDP  - What's new in 2022LEO CDP  - What's new in 2022
LEO CDP - What's new in 2022
 
Lộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sảnLộ trình triển khai LEO CDP cho ngành bất động sản
Lộ trình triển khai LEO CDP cho ngành bất động sản
 
Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?Why is LEO CDP important for digital business ?
Why is LEO CDP important for digital business ?
 
From Dataism to Customer Data Platform
From Dataism to Customer Data PlatformFrom Dataism to Customer Data Platform
From Dataism to Customer Data Platform
 
Data collection, processing & organization with USPA framework
Data collection, processing & organization with USPA frameworkData collection, processing & organization with USPA framework
Data collection, processing & organization with USPA framework
 
Part 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technologyPart 1: Introduction to digital marketing technology
Part 1: Introduction to digital marketing technology
 
Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?Why is Customer Data Platform (CDP) ?
Why is Customer Data Platform (CDP) ?
 
How to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation PlatformHow to build a Personalized News Recommendation Platform
How to build a Personalized News Recommendation Platform
 
How to grow your business in the age of digital marketing 4.0
How to grow your business  in the age of digital marketing 4.0How to grow your business  in the age of digital marketing 4.0
How to grow your business in the age of digital marketing 4.0
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big data
 
Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)Concepts, use cases and principles to build big data systems (1)
Concepts, use cases and principles to build big data systems (1)
 
Open OTT - Video Content Platform
Open OTT - Video Content PlatformOpen OTT - Video Content Platform
Open OTT - Video Content Platform
 
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data AnalysisApache Hadoop and Spark: Introduction and Use Cases for Data Analysis
Apache Hadoop and Spark: Introduction and Use Cases for Data Analysis
 
Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)Introduction to Recommendation Systems (Vietnam Web Submit)
Introduction to Recommendation Systems (Vietnam Web Submit)
 

Dernier

Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
AroojKhan71
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
amitlee9823
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 

Dernier (20)

ELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptxELKO dropshipping via API with DroFx.pptx
ELKO dropshipping via API with DroFx.pptx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 

Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming Stack

  • 1. Apache Phoenix with Actor Model (Akka.io) for Real-time Big Data Programming Stack Why we still need SQL for Big Data ? How to make Big Data more responsive and faster ? By http://nguyentantrieu.info Tech Lead at eClick team - FPT Online
  • 2. Contents 1. What is Big data and Why ? 2. When standard relational database (Oracle,MySQL, ...) is not good enough 3. Common problems in big data system 4. Introducing open-source tools in Big Data System a. Apache Phoenix for ad-hoc query b. Actor Model and Akka.io for reactive data processing
  • 3. What Does Big Data Actually Mean? “Big data means data that cannot fit easily into a standard relational database.” Hal Varian- Chief Economist, Google http://www.brookings.edu/blogs/techtank/posts/2014/09/11-big-data-definition
  • 4. When standard relational database (Oracle,MySQL, ...) is not good enough the “analytic system” MySQL database from a startup, tracking all actions in mobile games: iOS, Android, ...
  • 5. Complex analytic system and the “scale” pain
  • 6. Definition from the crowd “Big data is a term describing the storage and analysis of large and or complex data sets using a series of techniques including, but not limited to: NoSQL, MapReduce and machine learning.” Jonathan Stuart Ward and Adam Barker Source: http://arxiv.org/abs/1309.5821 http://www.technologyreview.com/view/519851/the-big-data-conundrum-how-to-define- it/
  • 7. “Chaotic” fact and the demand 80% of that data is unstructured or “chaotic” Photos, videos and social media posts - data that says so much about us - but cannot be analyzed via traditional methods Demand: “Finding order among chaos”
  • 8. 3 common problems in Big Data System 1. Size: the volume of the datasets is a critical factor. 2. Complexity: the structure, behaviour and permutations of the datasets is a critical factor. 3. Technologies: the tools and techniques which are used to process a sizable or complex dataset is a critical factor.
  • 9. Introducing open-source tools in Big Data System Apache Phoenix as SQL ad-hoc query engine Actor Model as nano-service for reactive data computation in the dawn of “Fast data”
  • 10. Some innovative tools were born in the dawn of Big Data Age
  • 11. But could an elephant fly without wings ?
  • 12.
  • 13. But a phoenix can fly !
  • 14. What is Apache Phoenix ? Apache Phoenix is a SQL skin over HBase. It means scaling Phoenix just like scale-up and scale-out the Hbase
  • 16. Interesting features of Apache Phoenix ● Embedded JDBC driver implements the majority of java.sql interfaces, including the metadata APIs. ● Allows columns to be modeled as a multi-part row key or key/value cells. ● Full query support with predicate push down and optimal scan key formation. ● DDL support: CREATE TABLE, DROP TABLE, and ALTER TABLE for adding/removing columns. ● Versioned schema repository. Snapshot queries use the schema that was in place when data was written. ● DML support: UPSERT VALUES for row-by-row insertion, UPSERT SELECT for mass data transfer between the same or different tables, and DELETE for deleting rows. ● Limited transaction support through client-side batching. ● Single table only - no joins yet and secondary indexes are a work in progress. ● Follows ANSI SQL standards whenever possible ● Requires HBase v 0.94.2 or above ● 100% Java
  • 17.
  • 20. Phoenix and SQL tool in Eclipse 4
  • 21. Phoenix vs Hive (running over HDFS and HBase) http://phoenix.apache.org/performance.html
  • 22. Actor Model in the dawn of “Fast data”
  • 23. http://youtu.be/TnLiEWglqHk - Google I/O 2014 - The dawn of "Fast Data"
  • 24. The paper: MillWheel: Fault-Tolerant Stream Processing at Internet Scale
  • 25. What is actor model ? ● Carl Hewitt defined the Actor Model in 1973 as a mathematical theory that treats “Actors” as the universal primitives of concurrent digital computation. ● A fitting model for heavily-parallel processing in a cloud environment
  • 27. is the framework for implementing Actor computation
  • 28. Inspired by MillWheel of Google and Storm of Twitter, I have developed my own framework, the “Rfx” (Reactive Functor Extension) with Akka as core
  • 29. The pipeline of finding social trends in real-time analytics
  • 30. Facebook Social Trending from a website
  • 31. Quick demo Using Akka (Rfx) and Apache Phoenix for Social Media Real-time Analytics
  • 32. Links for self-study and research Actor Model and Programming: ● http://nguyentantrieu.info/blog/the-architecture-for-real-time-event-processing-with- reactive-actor-model ● http://www.slideshare.net/drorbr/the-actor-model-towards-better-concurrency ● http://www.infoq.com/articles/reactive-cloud-actors ● http://www.mc2ads.com/p/rfx-for-big-data-developer.html Apache Phoenix ● http://java.dzone.com/articles/apache-phoenix-sql-driver ● http://phoenix.apache.org/Phoenix-in-15-minutes-or-less.html Big Data and Data Science ● http://www.mc2ads.com and http://www.mc2ads.org ● http://datascience101.wordpress.com ● http://lambda-architecture.net ● http://www.bigdata-startups.com ● https://www.coursera.org/course/datasci