SlideShare une entreprise Scribd logo
1  sur  62
Télécharger pour lire hors ligne
User-Defined Functions & Materialized Views
DuyHai DOAN
User Define Functions
•  Why ?
•  Detailed Impl
•  UDAs
•  Gotchas
© 2015 DataStax, All Rights Reserved.
2
Rationale
•  Push computation server-side
•  save network bandwidth (1000 nodes!)
•  simplify client-side code
•  provide standard & useful function (sum, avg …)
•  accelerate analytics use-case (pre-aggregation for Spark)
© 2015 DataStax, All Rights Reserved.
 3
How to create an UDF ?
© 2015 DataStax, All Rights Reserved.
 4
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALL ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURN returnType
LANGUAGE language
AS ‘
// source code here
‘;
How to create an UDF ?
© 2015 DataStax, All Rights Reserved.
 5
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURN returnType
LANGUAGE language
AS ‘
// source code here
‘;
CREATE OR REPLACE FUNCTION IF NOT EXISTS
How to create an UDF ?
© 2015 DataStax, All Rights Reserved.
 6
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURN returnType
LANGUAGE language
AS ‘
// source code here
‘;
An UDF is keyspace-wide
How to create an UDF ?
© 2015 DataStax, All Rights Reserved.
 7
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURN returnType
LANGUAGE language
AS ‘
// source code here
‘;
Param name to refer to in the code
Type = CQL3 type
How to create an UDF ?
© 2015 DataStax, All Rights Reserved.
 8
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURN returnType
LANGUAGE language // j
AS ‘
// source code here
‘;
Always called
Null-check mandatory in code
How to create an UDF ?
© 2015 DataStax, All Rights Reserved.
 9
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURN returnType
LANGUAGE language // jav
AS ‘
// source code here
‘;
If any input is null, code block is
skipped and return null
How to create an UDF ?
© 2015 DataStax, All Rights Reserved.
 10
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURN returnType
LANGUAGE language
AS ‘
// source code here
‘;
CQL types
•  primitives (boolean, int, …)
•  collections (list, set, map)
•  tuples
•  UDT
How to create an UDF ?
© 2015 DataStax, All Rights Reserved.
 11
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURN returnType
LANGUAGE language
AS ‘
// source code here
‘; JVM supported languages
•  Java, Scala
•  Javascript (slow)
•  Groovy, Jython, JRuby
•  Clojure ( JSR 223 impl issue)
How to create an UDF ?
© 2015 DataStax, All Rights Reserved.
 12
CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS]
[keyspace.]functionName (param1 type1, param2 type2, …)
CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT
RETURN returnType
LANGUAGE language
AS ‘
// source code here
‘;
Avoid simple quotes ( ‘ ) or escape them
© 2015 DataStax, All Rights Reserved.
 13
Simple UDF Demo
max()
UDF creation for C* 2.2 and 3.0
•  UDF definition sent to coordinator
•  Check for user permissions (ALTER, DROP, EXECUTE)
•  Coordinator compiles it using ECJ compiler (not Javac because of licence
issue).
•  If language != Java, use other compilers (you should ship the jars on C*
servers)
•  Announces SchemaChange = CREATED, Target = FUNCTION to all nodes
•  Code loaded into child classloader of current classloader
© 2015 DataStax, All Rights Reserved.
 14
UDF creation in C* 3.0
•  Compiled byte code is verified
•  static and instance code block
•  declaring methods/constructor à no anonymous class creation
•  declaring inner class
•  use of synchronized
•  accessing forbidden classes, ex: java.net.NetworkInterface
•  accessing forbidden methods, ex: Class.forName()
•  …
•  Code loaded into seperated classloader
•  verify loaded classes against whitelist/backlist
© 2015 DataStax, All Rights Reserved.
 15
UDF execution in C* 2.2
•  No safeguard, arbitrary code execution server side …
•  Flag “enable_user_defined_functions” in cassandra.yaml turned OFF by
default
•  Keep your UDFs pure (stateless)
•  Avoid expensive computation (for loop …)
© 2015 DataStax, All Rights Reserved.
 16
UDF execution in C* 3.0
•  If "enable_user_defined_functions_threads" turned ON (default)
•  execute UDF in separated ThreadPool
•  if duration longer than "user_defined_function_warn_timeout", retry with new
timeout = fail_timeout – warn_timeout
•  if duration longer than "user_defined_function_fail_timeout", exception
•  Else
•  execute in same thread
© 2015 DataStax, All Rights Reserved.
 17
UDF creation best practices
© 2015 DataStax, All Rights Reserved.
 18
•  Keep your UDFs pure (stateless)
•  No expensive computation (huge for loop)
•  Prefer Java language for perf
UDA
© 2015 DataStax, All Rights Reserved.
 19
•  Real use-case for UDF
•  Aggregation server-side à huge network bandwidth saving
•  Provide similar behavior for Group By, Sum, Avg etc …
How to create an UDA ?
© 2015 DataStax, All Rights Reserved.
 20
CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS]
[keyspace.]aggregateName(type1, type2, …)
SFUNC accumulatorFunction
STYPE stateType
[FINALFUNC finalFunction]
INITCOND initCond;
Only type, no param name
State type
Initial state type
How to create an UDA ?
© 2015 DataStax, All Rights Reserved.
 21
CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS]
[keyspace.]aggregateName(type1, type2, …)
SFUNC accumulatorFunction
STYPE stateType
[FINALFUNC finalFunction]
INITCOND initCond;
Accumulator function. Signature:
accumulatorFunction(stateType, type1, type2, …)
RETURNS stateType
How to create an UDA ?
© 2015 DataStax, All Rights Reserved.
 22
CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS]
[keyspace.]aggregateName(type1, type2, …)
SFUNC accumulatorFunction
STYPE stateType
[FINALFUNC finalFunction]
INITCOND initCond;
Optional final function. Signature:
finalFunction(stateType)
How to create an UDA ?
© 2015 DataStax, All Rights Reserved.
 23
CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS]
[keyspace.]aggregateName(type1, type2, …)
SFUNC accumulatorFunction
STYPE stateType
[FINALFUNC finalFunction]
INITCOND initCond;
UDA return type ?
If finalFunction
•  return type of finalFunction
Else
•  return stateType
How to create an UDA ?
© 2015 DataStax, All Rights Reserved.
 24
CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS]
[keyspace.]aggregateName(type1, type2, …)
SFUNC accumulatorFunction
STYPE stateType
[FINALFUNC finalFunction]
INITCOND initCond;
If accumulatorFunction RETURNS NULL ON NULL INPUT
•  initCond should be defined
Else
•  initCond can be null (default value)
UDA execution
© 2015 DataStax, All Rights Reserved.
 25
•  Start with state = initCond
•  For each row
•  call accumulator function with previous state + params
•  update state with accumulator function output
•  If no finalFunction
•  return last state
•  Else
•  apply finalFunction and returns its output
© 2015 DataStax, All Rights Reserved.
 26
UDA Demo
Group By
Gotchas
© 2015 DataStax, All Rights Reserved.
 27
C* C*
C*
C*
UDF
①
②
③
④ UDF execution⑤
Gotchas
© 2015 DataStax, All Rights Reserved.
 28
C* C*
C*
C*
UDF
①
②
③
④ UDF execution⑤
Why do not apply UDF on replica node ?
Gotchas
© 2015 DataStax, All Rights Reserved.
 29
C* C*
C*
C*
UDA
①
② & ③
④
•  apply accumulatorFunction
•  apply finalFunction
⑤
② & ③
② & ③
1.  Cannot apply
accumulatorFunction
on replicas
2.  UDA applied AFTER
read-repair
Gotchas
© 2015 DataStax, All Rights Reserved.
 30
C* C*
C*
C*
UDA
①
② & ③
④
•  apply accumulatorFunction
•  apply finalFunction
⑤
② & ③
② & ③
Avoid UDA on large number of rows
•  WHERE partitionKey IN (…)
•  WHERE clustering >= AND clusterting <=
Future applications of UDFs & UDAs
•  Functional index (CASSANDRA-7458)
•  Partial index (CASSANDRA-7391)
•  WHERE clause filtering with UDF
•  …
© 2015 DataStax, All Rights Reserved.
 31
© 2015 DataStax, All Rights Reserved.
 32
Q & A
! "
Materialized Views
•  Why ?
•  Detailed Impl
•  Gotchas
© 2015 DataStax, All Rights Reserved.
33
Why Materialized Views ?
•  Relieve the pain of manual denormalization
© 2015 DataStax, All Rights Reserved.
 34
CREATE TABLE user(
id int PRIMARY KEY,
country text,
…
);
CREATE TABLE user_by_country(
country text,
id int,
…,
PRIMARY KEY(country, id)
);
Materialzed View In Action
© 2015 DataStax, All Rights Reserved.
 35
CREATE MATERIALIZED VIEW user_by_country
AS SELECT country, id, firstname, lastname
FROM user
WHERE country IS NOT NULL AND id IS NOT NULL
PRIMARY KEY(country, id)
CREATE TABLE user_by_country (
country text,
id int,
firstname text,
lastname text,
PRIMARY KEY(country, id));
Materialzed View Syntax
© 2015 DataStax, All Rights Reserved.
 36
CREATE MATERIALIZED VIEW [IF NOT EXISTS]
keyspace_name.view_name
AS SELECT column1, column2, ...
FROM keyspace_name.table_name
WHERE column1 IS NOT NULL AND column2 IS NOT NULL ...
PRIMARY KEY(column1, column2, ...)
Must select all primary key columns of base table
•  IS NOT NULL condition for now
•  more complex conditions in future
•  at least all primary key columns of base table
(ordering can be different)
•  maximum 1 column NOT pk from base table
•  1:1 relationship between base row & MV row
© 2015 DataStax, All Rights Reserved.
 37
Materialzed View Demo
user_by_country
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 38
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
①
create BatchLog
for update
OPTIONAL STEP
activated by
–Dcassandra.mv_enable_coordinator_batchlog=true
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 39
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
②
•  send mutation to all replicas
•  waiting for ack(s) with CL
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 40
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
③
Acquire local lock on
base table partition
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 41
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
④
Local read to fetch current values
SELECT * FROM user WHERE id=1
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 42
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
⑤
Create BatchLog with
•  DELETE FROM user_by_country
WHERE country = ‘old_value’
•  INSERT INTO
user_by_country(country, id, …)
VALUES(‘FR’, 1, ...)
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 43
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
⑥
Execute async BatchLog
to paired view replica
with CL = ONE
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 44
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
⑦
Apply base table updade locally
SET COUNTRY=‘FR’
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 45
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
⑧
Release local lock
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 46
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
⑨
Return ack to
coordinator
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 47
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
①⓪
If CL ack(s)
received, ack client
Materialized View Impl
© 2015 DataStax, All Rights Reserved.
 48
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
①①
If QUORUM ack(s)
received, remove
BatchLog
OPTIONAL STEP
activated by
–Dcassandra.mv_enable_coordinator_batchlog=true
Materialzed Views impl explained
What is paired replica ?
•  Base primary replica for id=1 is paired with MV primary replica for
country=‘FR’
•  Base secondary replica for id=1 is paired with MV secondary replica for
country=‘FR’
•  etc …
© 2015 DataStax, All Rights Reserved.
 49
Materialzed Views impl explained
Why local lock on base replica ?
•  to provide serializability on concurrent updates
Why BatchLog on base replica ?
•  necessary for eventual durability
•  replay the MV delete + insert until successful
Why BatchLog on base replica uses CL = ONE ?
•  each base replica is responsible for update of its paired MV replica
•  CL > ONE will create un-necessary duplicated mutations
© 2015 DataStax, All Rights Reserved.
 50
Materialzed Views impl explained
Why BatchLog on coordinator with QUORUM ?
•  necessary to cover some edge-cases
© 2015 DataStax, All Rights Reserved.
 51
MV Failure Cases: concurrent updates
© 2015 DataStax, All Rights Reserved.
 52
Read base row (country=‘UK’)
•  DELETE FROM mv WHERE
country=‘UK’
•  INSERT INTO mv …(country)
VALUES(‘US’)
•  Send async BatchLog
•  Apply update country=‘US’
1) UPDATE … SET country=‘US’ 2) UPDATE … SET country=‘FR’
Read base row (country=‘UK’)
•  DELETE FROM mv WHERE
country=‘UK’
•  INSERT INTO mv …(country)
VALUES(‘FR’)
•  Send async BatchLog
•  Apply update country=‘FR’
t0
t1
t2
Without local lock
MV Failure Cases: concurrent updates
© 2015 DataStax, All Rights Reserved.
 53
Read base row (country=‘UK’)
•  DELETE FROM mv WHERE
country=‘UK’
•  INSERT INTO mv …(country)
VALUES(‘US’)
•  Send async BatchLog
•  Apply update country=‘US’
1) UPDATE … SET country=‘US’ 2) UPDATE … SET country=‘FR’
Read base row (country=‘US’)
•  DELETE FROM mv WHERE
country=‘US’
•  INSERT INTO mv …(country)
VALUES(‘FR’)
•  Send async BatchLog
•  Apply update country=‘FR’
With local lock
🔒
🔓 🔒
🔓
MV Failure Cases: failed updates to MV
© 2015 DataStax, All Rights Reserved.
 54
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
⑥
Execute async BatchLog
to paired view replica
with CL = ONE
✘
MV replica down
MV Failure Cases: failed updates to MV
© 2015 DataStax, All Rights Reserved.
 55
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
BatchLog
replay
MV replica up
MV Failure Cases: base replica(s) down
© 2015 DataStax, All Rights Reserved.
 56
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
②
•  send mutation to all replicas
•  waiting for ack(s) with CL
✘
✘
⁇⁇
✔︎
Base replicas down
MV Failure Cases: base replica(s) down
© 2015 DataStax, All Rights Reserved.
 57
C*
C*
C*
C*
C* C*
C* C*
UPDATE user
SET country=‘FR’
WHERE id=1
✔✔
✔︎
Base replicas up
BatchLog
replay
Materialized Views Gotchas
•  Write performance
•  local lock
•  local read-before-write for MV (most of perf hits here)
•  local batchlog for MV
•  coordinator batchlog (OPTIONAL)
•  ☞ you only pay this price once whatever number of MV
•  Consistency level
•  CL honoured for base table, ONE for MV + local batchlog
•  QUORUM for coordinator batchlog (OPTIONAL)
© 2015 DataStax, All Rights Reserved.
 58
Materialized Views Gotchas
•  Beware of hot spots !!!
•  MV user_by_gender 😱
•  Repair, read-repair, index rebuild, decomission …
•  repair on base replica à update on MV paired replica
•  repair on MV replica possible
•  read-repair on MV behaves as normal read-repair
© 2015 DataStax, All Rights Reserved.
 59
Materialized Views Gotchas
•  Schema migration
•  cannot drop column from base table used by MV
•  can add column to base table, initial value = null from MV
•  cannot drop base table, drop all MVs first
© 2015 DataStax, All Rights Reserved.
 60
© 2015 DataStax, All Rights Reserved.
 61
Q & A
! "
© 2015 DataStax, All Rights Reserved.
 62
@doanduyhai
duy_hai.doan@datastax.com
https://academy.datastax.com/
Thank You

Contenu connexe

Tendances

A Brief Intro to Scala
A Brief Intro to ScalaA Brief Intro to Scala
A Brief Intro to ScalaTim Underwood
 
Spark real world use cases and optimizations
Spark real world use cases and optimizationsSpark real world use cases and optimizations
Spark real world use cases and optimizationsGal Marder
 
Spark Programming
Spark ProgrammingSpark Programming
Spark ProgrammingTaewook Eom
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized ViewsCarl Yeksigian
 
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...Modern Data Stack France
 
Introduction to Ecmascript - ES6
Introduction to Ecmascript - ES6Introduction to Ecmascript - ES6
Introduction to Ecmascript - ES6Nilesh Jayanandana
 
Node Boot Camp
Node Boot CampNode Boot Camp
Node Boot CampTroy Miles
 
Terraform introduction
Terraform introductionTerraform introduction
Terraform introductionJason Vance
 
2014 holden - databricks umd scala crash course
2014   holden - databricks umd scala crash course2014   holden - databricks umd scala crash course
2014 holden - databricks umd scala crash courseHolden Karau
 
Webエンジニアから見たiOS5
Webエンジニアから見たiOS5Webエンジニアから見たiOS5
Webエンジニアから見たiOS5Satoshi Asano
 
Introduction to Spark with Scala
Introduction to Spark with ScalaIntroduction to Spark with Scala
Introduction to Spark with ScalaHimanshu Gupta
 
Building Distributed Systems in Scala
Building Distributed Systems in ScalaBuilding Distributed Systems in Scala
Building Distributed Systems in ScalaAlex Payne
 
Alternatives of JPA/Hibernate
Alternatives of JPA/HibernateAlternatives of JPA/Hibernate
Alternatives of JPA/HibernateSunghyouk Bae
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介
Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介
Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介scalaconfjp
 
Redis in Practice: Scenarios, Performance and Practice with PHP
Redis in Practice: Scenarios, Performance and Practice with PHPRedis in Practice: Scenarios, Performance and Practice with PHP
Redis in Practice: Scenarios, Performance and Practice with PHPChen Huang
 

Tendances (20)

Scala coated JVM
Scala coated JVMScala coated JVM
Scala coated JVM
 
Spark workshop
Spark workshopSpark workshop
Spark workshop
 
A Brief Intro to Scala
A Brief Intro to ScalaA Brief Intro to Scala
A Brief Intro to Scala
 
Spark real world use cases and optimizations
Spark real world use cases and optimizationsSpark real world use cases and optimizations
Spark real world use cases and optimizations
 
Spark Programming
Spark ProgrammingSpark Programming
Spark Programming
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized Views
 
Hadoop on osx
Hadoop on osxHadoop on osx
Hadoop on osx
 
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
Hadoop meetup : HUGFR Construire le cluster le plus rapide pour l'analyse des...
 
Introduction to Ecmascript - ES6
Introduction to Ecmascript - ES6Introduction to Ecmascript - ES6
Introduction to Ecmascript - ES6
 
Node Boot Camp
Node Boot CampNode Boot Camp
Node Boot Camp
 
Terraform introduction
Terraform introductionTerraform introduction
Terraform introduction
 
2014 holden - databricks umd scala crash course
2014   holden - databricks umd scala crash course2014   holden - databricks umd scala crash course
2014 holden - databricks umd scala crash course
 
Webエンジニアから見たiOS5
Webエンジニアから見たiOS5Webエンジニアから見たiOS5
Webエンジニアから見たiOS5
 
Introduction to Spark with Scala
Introduction to Spark with ScalaIntroduction to Spark with Scala
Introduction to Spark with Scala
 
Building Distributed Systems in Scala
Building Distributed Systems in ScalaBuilding Distributed Systems in Scala
Building Distributed Systems in Scala
 
Alternatives of JPA/Hibernate
Alternatives of JPA/HibernateAlternatives of JPA/Hibernate
Alternatives of JPA/Hibernate
 
XQuery in the Cloud
XQuery in the CloudXQuery in the Cloud
XQuery in the Cloud
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介
Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介
Introduction to Spark SQL and Catalyst / Spark SQLおよびCalalystの紹介
 
Redis in Practice: Scenarios, Performance and Practice with PHP
Redis in Practice: Scenarios, Performance and Practice with PHPRedis in Practice: Scenarios, Performance and Practice with PHP
Redis in Practice: Scenarios, Performance and Practice with PHP
 

En vedette

Jeremy Stanley, EVP/Data Scientist, Sailthru at MLconf NYC
Jeremy Stanley, EVP/Data Scientist, Sailthru at MLconf NYCJeremy Stanley, EVP/Data Scientist, Sailthru at MLconf NYC
Jeremy Stanley, EVP/Data Scientist, Sailthru at MLconf NYCMLconf
 
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...MongoDB
 
Cassandra introduction at FinishJUG
Cassandra introduction at FinishJUGCassandra introduction at FinishJUG
Cassandra introduction at FinishJUGDuyhai Doan
 
Big Data Analytics 1: Driving Personalized Experiences Using Customer Profiles
Big Data Analytics 1: Driving Personalized Experiences Using Customer ProfilesBig Data Analytics 1: Driving Personalized Experiences Using Customer Profiles
Big Data Analytics 1: Driving Personalized Experiences Using Customer ProfilesMongoDB
 
Cassandra introduction 2016
Cassandra introduction 2016Cassandra introduction 2016
Cassandra introduction 2016Duyhai Doan
 
Libon cassandra summiteu2014
Libon cassandra summiteu2014Libon cassandra summiteu2014
Libon cassandra summiteu2014Duyhai Doan
 
Cassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelonaCassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelonaDuyhai Doan
 
Cassandra 3 new features @ Geecon Krakow 2016
Cassandra 3 new features  @ Geecon Krakow 2016Cassandra 3 new features  @ Geecon Krakow 2016
Cassandra 3 new features @ Geecon Krakow 2016Duyhai Doan
 
Apache cassandra in 2016
Apache cassandra in 2016Apache cassandra in 2016
Apache cassandra in 2016Duyhai Doan
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data modelDuyhai Doan
 
20140908 spark sql & catalyst
20140908 spark sql & catalyst20140908 spark sql & catalyst
20140908 spark sql & catalystTakuya UESHIN
 
11 Shocking Stats That Will Transform Your Marketing Strategy
11 Shocking Stats That Will Transform Your Marketing Strategy 11 Shocking Stats That Will Transform Your Marketing Strategy
11 Shocking Stats That Will Transform Your Marketing Strategy Sailthru
 
Acquire, Grow & Retain Customers, Fast
Acquire, Grow & Retain Customers, FastAcquire, Grow & Retain Customers, Fast
Acquire, Grow & Retain Customers, FastSailthru
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinDataStax Academy
 
Cassandra introduction mars jug
Cassandra introduction mars jugCassandra introduction mars jug
Cassandra introduction mars jugDuyhai Doan
 
Introduction to KillrChat
Introduction to KillrChatIntroduction to KillrChat
Introduction to KillrChatDuyhai Doan
 
KillrChat Data Modeling
KillrChat Data ModelingKillrChat Data Modeling
KillrChat Data ModelingDuyhai Doan
 
Apache Zeppelin @DevoxxFR 2016
Apache Zeppelin @DevoxxFR 2016Apache Zeppelin @DevoxxFR 2016
Apache Zeppelin @DevoxxFR 2016Duyhai Doan
 
Spark Cassandra 2016
Spark Cassandra 2016Spark Cassandra 2016
Spark Cassandra 2016Duyhai Doan
 

En vedette (20)

Jeremy Stanley, EVP/Data Scientist, Sailthru at MLconf NYC
Jeremy Stanley, EVP/Data Scientist, Sailthru at MLconf NYCJeremy Stanley, EVP/Data Scientist, Sailthru at MLconf NYC
Jeremy Stanley, EVP/Data Scientist, Sailthru at MLconf NYC
 
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
Big Data Analytics 2: Leveraging Customer Behavior to Enhance Relevancy in Pe...
 
What's next for Big Data? -- Apache Spark
What's next for Big Data? -- Apache SparkWhat's next for Big Data? -- Apache Spark
What's next for Big Data? -- Apache Spark
 
Cassandra introduction at FinishJUG
Cassandra introduction at FinishJUGCassandra introduction at FinishJUG
Cassandra introduction at FinishJUG
 
Big Data Analytics 1: Driving Personalized Experiences Using Customer Profiles
Big Data Analytics 1: Driving Personalized Experiences Using Customer ProfilesBig Data Analytics 1: Driving Personalized Experiences Using Customer Profiles
Big Data Analytics 1: Driving Personalized Experiences Using Customer Profiles
 
Cassandra introduction 2016
Cassandra introduction 2016Cassandra introduction 2016
Cassandra introduction 2016
 
Libon cassandra summiteu2014
Libon cassandra summiteu2014Libon cassandra summiteu2014
Libon cassandra summiteu2014
 
Cassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelonaCassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelona
 
Cassandra 3 new features @ Geecon Krakow 2016
Cassandra 3 new features  @ Geecon Krakow 2016Cassandra 3 new features  @ Geecon Krakow 2016
Cassandra 3 new features @ Geecon Krakow 2016
 
Apache cassandra in 2016
Apache cassandra in 2016Apache cassandra in 2016
Apache cassandra in 2016
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data model
 
20140908 spark sql & catalyst
20140908 spark sql & catalyst20140908 spark sql & catalyst
20140908 spark sql & catalyst
 
11 Shocking Stats That Will Transform Your Marketing Strategy
11 Shocking Stats That Will Transform Your Marketing Strategy 11 Shocking Stats That Will Transform Your Marketing Strategy
11 Shocking Stats That Will Transform Your Marketing Strategy
 
Acquire, Grow & Retain Customers, Fast
Acquire, Grow & Retain Customers, FastAcquire, Grow & Retain Customers, Fast
Acquire, Grow & Retain Customers, Fast
 
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadinC* Summit 2013: The World's Next Top Data Model by Patrick McFadin
C* Summit 2013: The World's Next Top Data Model by Patrick McFadin
 
Cassandra introduction mars jug
Cassandra introduction mars jugCassandra introduction mars jug
Cassandra introduction mars jug
 
Introduction to KillrChat
Introduction to KillrChatIntroduction to KillrChat
Introduction to KillrChat
 
KillrChat Data Modeling
KillrChat Data ModelingKillrChat Data Modeling
KillrChat Data Modeling
 
Apache Zeppelin @DevoxxFR 2016
Apache Zeppelin @DevoxxFR 2016Apache Zeppelin @DevoxxFR 2016
Apache Zeppelin @DevoxxFR 2016
 
Spark Cassandra 2016
Spark Cassandra 2016Spark Cassandra 2016
Spark Cassandra 2016
 

Similaire à Cassandra UDF and Materialized Views

Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015
Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015
Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015Windows Developer
 
CDK Meetup: Rule the World through IaC
CDK Meetup: Rule the World through IaCCDK Meetup: Rule the World through IaC
CDK Meetup: Rule the World through IaCsmalltown
 
Integration-Monday-Stateful-Programming-Models-Serverless-Functions
Integration-Monday-Stateful-Programming-Models-Serverless-FunctionsIntegration-Monday-Stateful-Programming-Models-Serverless-Functions
Integration-Monday-Stateful-Programming-Models-Serverless-FunctionsBizTalk360
 
TypeScript for Java Developers
TypeScript for Java DevelopersTypeScript for Java Developers
TypeScript for Java DevelopersYakov Fain
 
U-SQL Query Execution and Performance Tuning
U-SQL Query Execution and Performance TuningU-SQL Query Execution and Performance Tuning
U-SQL Query Execution and Performance TuningMichael Rys
 
Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData
Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData
Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData InfluxData
 
Hack your db before the hackers do
Hack your db before the hackers doHack your db before the hackers do
Hack your db before the hackers dofangjiafu
 
Cassandra 3 new features 2016
Cassandra 3 new features 2016Cassandra 3 new features 2016
Cassandra 3 new features 2016Duyhai Doan
 
2019 indit blackhat_honeypot your database server
2019 indit blackhat_honeypot your database server2019 indit blackhat_honeypot your database server
2019 indit blackhat_honeypot your database serverGeorgi Kodinov
 
MySQL-Performance Schema- What's new in MySQL-5.7 DMRs
MySQL-Performance Schema- What's new in MySQL-5.7 DMRsMySQL-Performance Schema- What's new in MySQL-5.7 DMRs
MySQL-Performance Schema- What's new in MySQL-5.7 DMRsMayank Prasad
 
CSW2017 Henry li how to find the vulnerability to bypass the control flow gua...
CSW2017 Henry li how to find the vulnerability to bypass the control flow gua...CSW2017 Henry li how to find the vulnerability to bypass the control flow gua...
CSW2017 Henry li how to find the vulnerability to bypass the control flow gua...CanSecWest
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVMSylvain Wallez
 
[Td 2015] what is new in visual c++ 2015 and future directions(ulzii luvsanba...
[Td 2015] what is new in visual c++ 2015 and future directions(ulzii luvsanba...[Td 2015] what is new in visual c++ 2015 and future directions(ulzii luvsanba...
[Td 2015] what is new in visual c++ 2015 and future directions(ulzii luvsanba...Sang Don Kim
 
Hand-On Lab: CA Release Automation Rapid Development Kit and SDK
Hand-On Lab: CA Release Automation Rapid Development Kit and SDKHand-On Lab: CA Release Automation Rapid Development Kit and SDK
Hand-On Lab: CA Release Automation Rapid Development Kit and SDKCA Technologies
 
Graal Tutorial at CGO 2015 by Christian Wimmer
Graal Tutorial at CGO 2015 by Christian WimmerGraal Tutorial at CGO 2015 by Christian Wimmer
Graal Tutorial at CGO 2015 by Christian WimmerThomas Wuerthinger
 
Inno db 5_7_features
Inno db 5_7_featuresInno db 5_7_features
Inno db 5_7_featuresTinku Ajit
 
Rider - Taking ReSharper out of Process
Rider - Taking ReSharper out of ProcessRider - Taking ReSharper out of Process
Rider - Taking ReSharper out of Processcitizenmatt
 
Ice mini guide
Ice mini guideIce mini guide
Ice mini guideAdy Liu
 
InvokeDynamic for Mere Mortals [JavaOne 2015 CON7682]
InvokeDynamic for Mere Mortals [JavaOne 2015 CON7682]InvokeDynamic for Mere Mortals [JavaOne 2015 CON7682]
InvokeDynamic for Mere Mortals [JavaOne 2015 CON7682]David Buck
 

Similaire à Cassandra UDF and Materialized Views (20)

Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015
Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015
Build 2016 - B880 - Top 6 Reasons to Move Your C++ Code to Visual Studio 2015
 
CDK Meetup: Rule the World through IaC
CDK Meetup: Rule the World through IaCCDK Meetup: Rule the World through IaC
CDK Meetup: Rule the World through IaC
 
Integration-Monday-Stateful-Programming-Models-Serverless-Functions
Integration-Monday-Stateful-Programming-Models-Serverless-FunctionsIntegration-Monday-Stateful-Programming-Models-Serverless-Functions
Integration-Monday-Stateful-Programming-Models-Serverless-Functions
 
TypeScript for Java Developers
TypeScript for Java DevelopersTypeScript for Java Developers
TypeScript for Java Developers
 
U-SQL Query Execution and Performance Tuning
U-SQL Query Execution and Performance TuningU-SQL Query Execution and Performance Tuning
U-SQL Query Execution and Performance Tuning
 
Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData
Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData
Creating and Using the Flux SQL Datasource | Katy Farmer | InfluxData
 
Hack your db before the hackers do
Hack your db before the hackers doHack your db before the hackers do
Hack your db before the hackers do
 
Cassandra 3 new features 2016
Cassandra 3 new features 2016Cassandra 3 new features 2016
Cassandra 3 new features 2016
 
2019 indit blackhat_honeypot your database server
2019 indit blackhat_honeypot your database server2019 indit blackhat_honeypot your database server
2019 indit blackhat_honeypot your database server
 
MySQL-Performance Schema- What's new in MySQL-5.7 DMRs
MySQL-Performance Schema- What's new in MySQL-5.7 DMRsMySQL-Performance Schema- What's new in MySQL-5.7 DMRs
MySQL-Performance Schema- What's new in MySQL-5.7 DMRs
 
CSW2017 Henry li how to find the vulnerability to bypass the control flow gua...
CSW2017 Henry li how to find the vulnerability to bypass the control flow gua...CSW2017 Henry li how to find the vulnerability to bypass the control flow gua...
CSW2017 Henry li how to find the vulnerability to bypass the control flow gua...
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVM
 
Writing MySQL UDFs
Writing MySQL UDFsWriting MySQL UDFs
Writing MySQL UDFs
 
[Td 2015] what is new in visual c++ 2015 and future directions(ulzii luvsanba...
[Td 2015] what is new in visual c++ 2015 and future directions(ulzii luvsanba...[Td 2015] what is new in visual c++ 2015 and future directions(ulzii luvsanba...
[Td 2015] what is new in visual c++ 2015 and future directions(ulzii luvsanba...
 
Hand-On Lab: CA Release Automation Rapid Development Kit and SDK
Hand-On Lab: CA Release Automation Rapid Development Kit and SDKHand-On Lab: CA Release Automation Rapid Development Kit and SDK
Hand-On Lab: CA Release Automation Rapid Development Kit and SDK
 
Graal Tutorial at CGO 2015 by Christian Wimmer
Graal Tutorial at CGO 2015 by Christian WimmerGraal Tutorial at CGO 2015 by Christian Wimmer
Graal Tutorial at CGO 2015 by Christian Wimmer
 
Inno db 5_7_features
Inno db 5_7_featuresInno db 5_7_features
Inno db 5_7_features
 
Rider - Taking ReSharper out of Process
Rider - Taking ReSharper out of ProcessRider - Taking ReSharper out of Process
Rider - Taking ReSharper out of Process
 
Ice mini guide
Ice mini guideIce mini guide
Ice mini guide
 
InvokeDynamic for Mere Mortals [JavaOne 2015 CON7682]
InvokeDynamic for Mere Mortals [JavaOne 2015 CON7682]InvokeDynamic for Mere Mortals [JavaOne 2015 CON7682]
InvokeDynamic for Mere Mortals [JavaOne 2015 CON7682]
 

Plus de Duyhai Doan

Pourquoi Terraform n'est pas le bon outil pour les déploiements automatisés d...
Pourquoi Terraform n'est pas le bon outil pour les déploiements automatisés d...Pourquoi Terraform n'est pas le bon outil pour les déploiements automatisés d...
Pourquoi Terraform n'est pas le bon outil pour les déploiements automatisés d...Duyhai Doan
 
Le futur d'apache cassandra
Le futur d'apache cassandraLe futur d'apache cassandra
Le futur d'apache cassandraDuyhai Doan
 
Big data 101 for beginners devoxxpl
Big data 101 for beginners devoxxplBig data 101 for beginners devoxxpl
Big data 101 for beginners devoxxplDuyhai Doan
 
Big data 101 for beginners riga dev days
Big data 101 for beginners riga dev daysBig data 101 for beginners riga dev days
Big data 101 for beginners riga dev daysDuyhai Doan
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentationDuyhai Doan
 
Datastax day 2016 introduction to apache cassandra
Datastax day 2016   introduction to apache cassandraDatastax day 2016   introduction to apache cassandra
Datastax day 2016 introduction to apache cassandraDuyhai Doan
 
Datastax day 2016 : Cassandra data modeling basics
Datastax day 2016 : Cassandra data modeling basicsDatastax day 2016 : Cassandra data modeling basics
Datastax day 2016 : Cassandra data modeling basicsDuyhai Doan
 
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016Duyhai Doan
 
Sasi, cassandra on full text search ride
Sasi, cassandra on full text search rideSasi, cassandra on full text search ride
Sasi, cassandra on full text search rideDuyhai Doan
 
Algorithme distribués pour big data saison 2 @DevoxxFR 2016
Algorithme distribués pour big data saison 2 @DevoxxFR 2016Algorithme distribués pour big data saison 2 @DevoxxFR 2016
Algorithme distribués pour big data saison 2 @DevoxxFR 2016Duyhai Doan
 
Cassandra introduction 2016
Cassandra introduction 2016Cassandra introduction 2016
Cassandra introduction 2016Duyhai Doan
 
Spark cassandra integration 2016
Spark cassandra integration 2016Spark cassandra integration 2016
Spark cassandra integration 2016Duyhai Doan
 
Apache zeppelin the missing component for the big data ecosystem
Apache zeppelin the missing component for the big data ecosystemApache zeppelin the missing component for the big data ecosystem
Apache zeppelin the missing component for the big data ecosystemDuyhai Doan
 
Cassandra and Spark, closing the gap between no sql and analytics codemotio...
Cassandra and Spark, closing the gap between no sql and analytics   codemotio...Cassandra and Spark, closing the gap between no sql and analytics   codemotio...
Cassandra and Spark, closing the gap between no sql and analytics codemotio...Duyhai Doan
 
Fast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ INGFast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ INGDuyhai Doan
 
Distributed algorithms for big data @ GeeCon
Distributed algorithms for big data @ GeeConDistributed algorithms for big data @ GeeCon
Distributed algorithms for big data @ GeeConDuyhai Doan
 
Spark cassandra integration, theory and practice
Spark cassandra integration, theory and practiceSpark cassandra integration, theory and practice
Spark cassandra integration, theory and practiceDuyhai Doan
 
Spark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-CasesSpark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-CasesDuyhai Doan
 
Algorithmes distribues pour le big data @ DevoxxFR 2015
Algorithmes distribues pour le big data @ DevoxxFR 2015Algorithmes distribues pour le big data @ DevoxxFR 2015
Algorithmes distribues pour le big data @ DevoxxFR 2015Duyhai Doan
 
Real time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 ParisReal time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 ParisDuyhai Doan
 

Plus de Duyhai Doan (20)

Pourquoi Terraform n'est pas le bon outil pour les déploiements automatisés d...
Pourquoi Terraform n'est pas le bon outil pour les déploiements automatisés d...Pourquoi Terraform n'est pas le bon outil pour les déploiements automatisés d...
Pourquoi Terraform n'est pas le bon outil pour les déploiements automatisés d...
 
Le futur d'apache cassandra
Le futur d'apache cassandraLe futur d'apache cassandra
Le futur d'apache cassandra
 
Big data 101 for beginners devoxxpl
Big data 101 for beginners devoxxplBig data 101 for beginners devoxxpl
Big data 101 for beginners devoxxpl
 
Big data 101 for beginners riga dev days
Big data 101 for beginners riga dev daysBig data 101 for beginners riga dev days
Big data 101 for beginners riga dev days
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentation
 
Datastax day 2016 introduction to apache cassandra
Datastax day 2016   introduction to apache cassandraDatastax day 2016   introduction to apache cassandra
Datastax day 2016 introduction to apache cassandra
 
Datastax day 2016 : Cassandra data modeling basics
Datastax day 2016 : Cassandra data modeling basicsDatastax day 2016 : Cassandra data modeling basics
Datastax day 2016 : Cassandra data modeling basics
 
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
 
Sasi, cassandra on full text search ride
Sasi, cassandra on full text search rideSasi, cassandra on full text search ride
Sasi, cassandra on full text search ride
 
Algorithme distribués pour big data saison 2 @DevoxxFR 2016
Algorithme distribués pour big data saison 2 @DevoxxFR 2016Algorithme distribués pour big data saison 2 @DevoxxFR 2016
Algorithme distribués pour big data saison 2 @DevoxxFR 2016
 
Cassandra introduction 2016
Cassandra introduction 2016Cassandra introduction 2016
Cassandra introduction 2016
 
Spark cassandra integration 2016
Spark cassandra integration 2016Spark cassandra integration 2016
Spark cassandra integration 2016
 
Apache zeppelin the missing component for the big data ecosystem
Apache zeppelin the missing component for the big data ecosystemApache zeppelin the missing component for the big data ecosystem
Apache zeppelin the missing component for the big data ecosystem
 
Cassandra and Spark, closing the gap between no sql and analytics codemotio...
Cassandra and Spark, closing the gap between no sql and analytics   codemotio...Cassandra and Spark, closing the gap between no sql and analytics   codemotio...
Cassandra and Spark, closing the gap between no sql and analytics codemotio...
 
Fast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ INGFast track to getting started with DSE Max @ ING
Fast track to getting started with DSE Max @ ING
 
Distributed algorithms for big data @ GeeCon
Distributed algorithms for big data @ GeeConDistributed algorithms for big data @ GeeCon
Distributed algorithms for big data @ GeeCon
 
Spark cassandra integration, theory and practice
Spark cassandra integration, theory and practiceSpark cassandra integration, theory and practice
Spark cassandra integration, theory and practice
 
Spark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-CasesSpark cassandra connector.API, Best Practices and Use-Cases
Spark cassandra connector.API, Best Practices and Use-Cases
 
Algorithmes distribues pour le big data @ DevoxxFR 2015
Algorithmes distribues pour le big data @ DevoxxFR 2015Algorithmes distribues pour le big data @ DevoxxFR 2015
Algorithmes distribues pour le big data @ DevoxxFR 2015
 
Real time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 ParisReal time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
 

Dernier

Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 

Dernier (20)

Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Cassandra UDF and Materialized Views

  • 1. User-Defined Functions & Materialized Views DuyHai DOAN
  • 2. User Define Functions •  Why ? •  Detailed Impl •  UDAs •  Gotchas © 2015 DataStax, All Rights Reserved. 2
  • 3. Rationale •  Push computation server-side •  save network bandwidth (1000 nodes!) •  simplify client-side code •  provide standard & useful function (sum, avg …) •  accelerate analytics use-case (pre-aggregation for Spark) © 2015 DataStax, All Rights Reserved. 3
  • 4. How to create an UDF ? © 2015 DataStax, All Rights Reserved. 4 CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]functionName (param1 type1, param2 type2, …) CALL ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURN returnType LANGUAGE language AS ‘ // source code here ‘;
  • 5. How to create an UDF ? © 2015 DataStax, All Rights Reserved. 5 CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]functionName (param1 type1, param2 type2, …) CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURN returnType LANGUAGE language AS ‘ // source code here ‘; CREATE OR REPLACE FUNCTION IF NOT EXISTS
  • 6. How to create an UDF ? © 2015 DataStax, All Rights Reserved. 6 CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]functionName (param1 type1, param2 type2, …) CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURN returnType LANGUAGE language AS ‘ // source code here ‘; An UDF is keyspace-wide
  • 7. How to create an UDF ? © 2015 DataStax, All Rights Reserved. 7 CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]functionName (param1 type1, param2 type2, …) CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURN returnType LANGUAGE language AS ‘ // source code here ‘; Param name to refer to in the code Type = CQL3 type
  • 8. How to create an UDF ? © 2015 DataStax, All Rights Reserved. 8 CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]functionName (param1 type1, param2 type2, …) CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURN returnType LANGUAGE language // j AS ‘ // source code here ‘; Always called Null-check mandatory in code
  • 9. How to create an UDF ? © 2015 DataStax, All Rights Reserved. 9 CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]functionName (param1 type1, param2 type2, …) CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURN returnType LANGUAGE language // jav AS ‘ // source code here ‘; If any input is null, code block is skipped and return null
  • 10. How to create an UDF ? © 2015 DataStax, All Rights Reserved. 10 CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]functionName (param1 type1, param2 type2, …) CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURN returnType LANGUAGE language AS ‘ // source code here ‘; CQL types •  primitives (boolean, int, …) •  collections (list, set, map) •  tuples •  UDT
  • 11. How to create an UDF ? © 2015 DataStax, All Rights Reserved. 11 CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]functionName (param1 type1, param2 type2, …) CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURN returnType LANGUAGE language AS ‘ // source code here ‘; JVM supported languages •  Java, Scala •  Javascript (slow) •  Groovy, Jython, JRuby •  Clojure ( JSR 223 impl issue)
  • 12. How to create an UDF ? © 2015 DataStax, All Rights Reserved. 12 CREATE [OR REPLACE] FUNCTION [IF NOT EXISTS] [keyspace.]functionName (param1 type1, param2 type2, …) CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT RETURN returnType LANGUAGE language AS ‘ // source code here ‘; Avoid simple quotes ( ‘ ) or escape them
  • 13. © 2015 DataStax, All Rights Reserved. 13 Simple UDF Demo max()
  • 14. UDF creation for C* 2.2 and 3.0 •  UDF definition sent to coordinator •  Check for user permissions (ALTER, DROP, EXECUTE) •  Coordinator compiles it using ECJ compiler (not Javac because of licence issue). •  If language != Java, use other compilers (you should ship the jars on C* servers) •  Announces SchemaChange = CREATED, Target = FUNCTION to all nodes •  Code loaded into child classloader of current classloader © 2015 DataStax, All Rights Reserved. 14
  • 15. UDF creation in C* 3.0 •  Compiled byte code is verified •  static and instance code block •  declaring methods/constructor à no anonymous class creation •  declaring inner class •  use of synchronized •  accessing forbidden classes, ex: java.net.NetworkInterface •  accessing forbidden methods, ex: Class.forName() •  … •  Code loaded into seperated classloader •  verify loaded classes against whitelist/backlist © 2015 DataStax, All Rights Reserved. 15
  • 16. UDF execution in C* 2.2 •  No safeguard, arbitrary code execution server side … •  Flag “enable_user_defined_functions” in cassandra.yaml turned OFF by default •  Keep your UDFs pure (stateless) •  Avoid expensive computation (for loop …) © 2015 DataStax, All Rights Reserved. 16
  • 17. UDF execution in C* 3.0 •  If "enable_user_defined_functions_threads" turned ON (default) •  execute UDF in separated ThreadPool •  if duration longer than "user_defined_function_warn_timeout", retry with new timeout = fail_timeout – warn_timeout •  if duration longer than "user_defined_function_fail_timeout", exception •  Else •  execute in same thread © 2015 DataStax, All Rights Reserved. 17
  • 18. UDF creation best practices © 2015 DataStax, All Rights Reserved. 18 •  Keep your UDFs pure (stateless) •  No expensive computation (huge for loop) •  Prefer Java language for perf
  • 19. UDA © 2015 DataStax, All Rights Reserved. 19 •  Real use-case for UDF •  Aggregation server-side à huge network bandwidth saving •  Provide similar behavior for Group By, Sum, Avg etc …
  • 20. How to create an UDA ? © 2015 DataStax, All Rights Reserved. 20 CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS] [keyspace.]aggregateName(type1, type2, …) SFUNC accumulatorFunction STYPE stateType [FINALFUNC finalFunction] INITCOND initCond; Only type, no param name State type Initial state type
  • 21. How to create an UDA ? © 2015 DataStax, All Rights Reserved. 21 CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS] [keyspace.]aggregateName(type1, type2, …) SFUNC accumulatorFunction STYPE stateType [FINALFUNC finalFunction] INITCOND initCond; Accumulator function. Signature: accumulatorFunction(stateType, type1, type2, …) RETURNS stateType
  • 22. How to create an UDA ? © 2015 DataStax, All Rights Reserved. 22 CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS] [keyspace.]aggregateName(type1, type2, …) SFUNC accumulatorFunction STYPE stateType [FINALFUNC finalFunction] INITCOND initCond; Optional final function. Signature: finalFunction(stateType)
  • 23. How to create an UDA ? © 2015 DataStax, All Rights Reserved. 23 CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS] [keyspace.]aggregateName(type1, type2, …) SFUNC accumulatorFunction STYPE stateType [FINALFUNC finalFunction] INITCOND initCond; UDA return type ? If finalFunction •  return type of finalFunction Else •  return stateType
  • 24. How to create an UDA ? © 2015 DataStax, All Rights Reserved. 24 CREATE [OR REPLACE] AGGREGATE [IF NOT EXISTS] [keyspace.]aggregateName(type1, type2, …) SFUNC accumulatorFunction STYPE stateType [FINALFUNC finalFunction] INITCOND initCond; If accumulatorFunction RETURNS NULL ON NULL INPUT •  initCond should be defined Else •  initCond can be null (default value)
  • 25. UDA execution © 2015 DataStax, All Rights Reserved. 25 •  Start with state = initCond •  For each row •  call accumulator function with previous state + params •  update state with accumulator function output •  If no finalFunction •  return last state •  Else •  apply finalFunction and returns its output
  • 26. © 2015 DataStax, All Rights Reserved. 26 UDA Demo Group By
  • 27. Gotchas © 2015 DataStax, All Rights Reserved. 27 C* C* C* C* UDF ① ② ③ ④ UDF execution⑤
  • 28. Gotchas © 2015 DataStax, All Rights Reserved. 28 C* C* C* C* UDF ① ② ③ ④ UDF execution⑤ Why do not apply UDF on replica node ?
  • 29. Gotchas © 2015 DataStax, All Rights Reserved. 29 C* C* C* C* UDA ① ② & ③ ④ •  apply accumulatorFunction •  apply finalFunction ⑤ ② & ③ ② & ③ 1.  Cannot apply accumulatorFunction on replicas 2.  UDA applied AFTER read-repair
  • 30. Gotchas © 2015 DataStax, All Rights Reserved. 30 C* C* C* C* UDA ① ② & ③ ④ •  apply accumulatorFunction •  apply finalFunction ⑤ ② & ③ ② & ③ Avoid UDA on large number of rows •  WHERE partitionKey IN (…) •  WHERE clustering >= AND clusterting <=
  • 31. Future applications of UDFs & UDAs •  Functional index (CASSANDRA-7458) •  Partial index (CASSANDRA-7391) •  WHERE clause filtering with UDF •  … © 2015 DataStax, All Rights Reserved. 31
  • 32. © 2015 DataStax, All Rights Reserved. 32 Q & A ! "
  • 33. Materialized Views •  Why ? •  Detailed Impl •  Gotchas © 2015 DataStax, All Rights Reserved. 33
  • 34. Why Materialized Views ? •  Relieve the pain of manual denormalization © 2015 DataStax, All Rights Reserved. 34 CREATE TABLE user( id int PRIMARY KEY, country text, … ); CREATE TABLE user_by_country( country text, id int, …, PRIMARY KEY(country, id) );
  • 35. Materialzed View In Action © 2015 DataStax, All Rights Reserved. 35 CREATE MATERIALIZED VIEW user_by_country AS SELECT country, id, firstname, lastname FROM user WHERE country IS NOT NULL AND id IS NOT NULL PRIMARY KEY(country, id) CREATE TABLE user_by_country ( country text, id int, firstname text, lastname text, PRIMARY KEY(country, id));
  • 36. Materialzed View Syntax © 2015 DataStax, All Rights Reserved. 36 CREATE MATERIALIZED VIEW [IF NOT EXISTS] keyspace_name.view_name AS SELECT column1, column2, ... FROM keyspace_name.table_name WHERE column1 IS NOT NULL AND column2 IS NOT NULL ... PRIMARY KEY(column1, column2, ...) Must select all primary key columns of base table •  IS NOT NULL condition for now •  more complex conditions in future •  at least all primary key columns of base table (ordering can be different) •  maximum 1 column NOT pk from base table •  1:1 relationship between base row & MV row
  • 37. © 2015 DataStax, All Rights Reserved. 37 Materialzed View Demo user_by_country
  • 38. Materialized View Impl © 2015 DataStax, All Rights Reserved. 38 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ① create BatchLog for update OPTIONAL STEP activated by –Dcassandra.mv_enable_coordinator_batchlog=true
  • 39. Materialized View Impl © 2015 DataStax, All Rights Reserved. 39 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ② •  send mutation to all replicas •  waiting for ack(s) with CL
  • 40. Materialized View Impl © 2015 DataStax, All Rights Reserved. 40 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ③ Acquire local lock on base table partition
  • 41. Materialized View Impl © 2015 DataStax, All Rights Reserved. 41 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ④ Local read to fetch current values SELECT * FROM user WHERE id=1
  • 42. Materialized View Impl © 2015 DataStax, All Rights Reserved. 42 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ⑤ Create BatchLog with •  DELETE FROM user_by_country WHERE country = ‘old_value’ •  INSERT INTO user_by_country(country, id, …) VALUES(‘FR’, 1, ...)
  • 43. Materialized View Impl © 2015 DataStax, All Rights Reserved. 43 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ⑥ Execute async BatchLog to paired view replica with CL = ONE
  • 44. Materialized View Impl © 2015 DataStax, All Rights Reserved. 44 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ⑦ Apply base table updade locally SET COUNTRY=‘FR’
  • 45. Materialized View Impl © 2015 DataStax, All Rights Reserved. 45 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ⑧ Release local lock
  • 46. Materialized View Impl © 2015 DataStax, All Rights Reserved. 46 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ⑨ Return ack to coordinator
  • 47. Materialized View Impl © 2015 DataStax, All Rights Reserved. 47 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ①⓪ If CL ack(s) received, ack client
  • 48. Materialized View Impl © 2015 DataStax, All Rights Reserved. 48 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ①① If QUORUM ack(s) received, remove BatchLog OPTIONAL STEP activated by –Dcassandra.mv_enable_coordinator_batchlog=true
  • 49. Materialzed Views impl explained What is paired replica ? •  Base primary replica for id=1 is paired with MV primary replica for country=‘FR’ •  Base secondary replica for id=1 is paired with MV secondary replica for country=‘FR’ •  etc … © 2015 DataStax, All Rights Reserved. 49
  • 50. Materialzed Views impl explained Why local lock on base replica ? •  to provide serializability on concurrent updates Why BatchLog on base replica ? •  necessary for eventual durability •  replay the MV delete + insert until successful Why BatchLog on base replica uses CL = ONE ? •  each base replica is responsible for update of its paired MV replica •  CL > ONE will create un-necessary duplicated mutations © 2015 DataStax, All Rights Reserved. 50
  • 51. Materialzed Views impl explained Why BatchLog on coordinator with QUORUM ? •  necessary to cover some edge-cases © 2015 DataStax, All Rights Reserved. 51
  • 52. MV Failure Cases: concurrent updates © 2015 DataStax, All Rights Reserved. 52 Read base row (country=‘UK’) •  DELETE FROM mv WHERE country=‘UK’ •  INSERT INTO mv …(country) VALUES(‘US’) •  Send async BatchLog •  Apply update country=‘US’ 1) UPDATE … SET country=‘US’ 2) UPDATE … SET country=‘FR’ Read base row (country=‘UK’) •  DELETE FROM mv WHERE country=‘UK’ •  INSERT INTO mv …(country) VALUES(‘FR’) •  Send async BatchLog •  Apply update country=‘FR’ t0 t1 t2 Without local lock
  • 53. MV Failure Cases: concurrent updates © 2015 DataStax, All Rights Reserved. 53 Read base row (country=‘UK’) •  DELETE FROM mv WHERE country=‘UK’ •  INSERT INTO mv …(country) VALUES(‘US’) •  Send async BatchLog •  Apply update country=‘US’ 1) UPDATE … SET country=‘US’ 2) UPDATE … SET country=‘FR’ Read base row (country=‘US’) •  DELETE FROM mv WHERE country=‘US’ •  INSERT INTO mv …(country) VALUES(‘FR’) •  Send async BatchLog •  Apply update country=‘FR’ With local lock 🔒 🔓 🔒 🔓
  • 54. MV Failure Cases: failed updates to MV © 2015 DataStax, All Rights Reserved. 54 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ⑥ Execute async BatchLog to paired view replica with CL = ONE ✘ MV replica down
  • 55. MV Failure Cases: failed updates to MV © 2015 DataStax, All Rights Reserved. 55 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 BatchLog replay MV replica up
  • 56. MV Failure Cases: base replica(s) down © 2015 DataStax, All Rights Reserved. 56 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ② •  send mutation to all replicas •  waiting for ack(s) with CL ✘ ✘ ⁇⁇ ✔︎ Base replicas down
  • 57. MV Failure Cases: base replica(s) down © 2015 DataStax, All Rights Reserved. 57 C* C* C* C* C* C* C* C* UPDATE user SET country=‘FR’ WHERE id=1 ✔✔ ✔︎ Base replicas up BatchLog replay
  • 58. Materialized Views Gotchas •  Write performance •  local lock •  local read-before-write for MV (most of perf hits here) •  local batchlog for MV •  coordinator batchlog (OPTIONAL) •  ☞ you only pay this price once whatever number of MV •  Consistency level •  CL honoured for base table, ONE for MV + local batchlog •  QUORUM for coordinator batchlog (OPTIONAL) © 2015 DataStax, All Rights Reserved. 58
  • 59. Materialized Views Gotchas •  Beware of hot spots !!! •  MV user_by_gender 😱 •  Repair, read-repair, index rebuild, decomission … •  repair on base replica à update on MV paired replica •  repair on MV replica possible •  read-repair on MV behaves as normal read-repair © 2015 DataStax, All Rights Reserved. 59
  • 60. Materialized Views Gotchas •  Schema migration •  cannot drop column from base table used by MV •  can add column to base table, initial value = null from MV •  cannot drop base table, drop all MVs first © 2015 DataStax, All Rights Reserved. 60
  • 61. © 2015 DataStax, All Rights Reserved. 61 Q & A ! "
  • 62. © 2015 DataStax, All Rights Reserved. 62 @doanduyhai duy_hai.doan@datastax.com https://academy.datastax.com/ Thank You