SlideShare une entreprise Scribd logo
1  sur  14
Télécharger pour lire hors ligne
Aerospike Modeling
User Segmentation with Maps and Bitfields
2 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
3 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
▪ Cassandra databases, including derivatives such as ScyllaDB, have a
needle in a haystack problem
▪ In C* each user ID – segment ID pair is in its own row
▪ This affects performance when you need low latency key-value operations
▪ In Aerospike we keep all the segments together in a single record
tl;dr
4 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
▪ In digital advertising user profiles stores assist with audience segmentation
▪ The goal is to pull user segments for a specific user as fast as possible
▪ Modeling this use case is generally applicable to other forms of online
personalization
User Profile Stores
5 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
CREATE TABLE userspace.user_segments (
user_id uuid,
segment_id int,
attr smallint,
attr2 smallint,
PRIMARY KEY ((user_id, segment_id), user_id)
)
▪ On average 1000 segments per profile
▪ 50 billion cookies means 50 trillion rows
▪ Large latency to find 1000 segments of a user from a huge number of rows
Modeling in Cassandra
6 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
{segmentID: [segment-TTL, {attr1, attr2}]}
{ 8457: [8889*, {}],
12845: [8889, {}],
42199: [8889, {}],
43696: [8889, {}],
}
▪ * Segment TTL uses local epoch (hours since epoch)
▪ The map ordering options are UNORDERED, K-ORDERED and KV-ORDERED
▪ Choosing K-ORDERED gives the best performance for data on SSD
Modeling in Aerospike
7 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
▪ We can easily upsert into the map new user segments as they are
processed (https://github.com/aerospike-examples/modeling-user-segmentation)
Advantages
8 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
▪ We can use get_by_value_interval to filter segments that have a
specific ‘freshness’
Advantages
9 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
▪ We can use the map remove_by_value_interval operation to trim
expired segments
▪ Mainly, this allows for orders of magnitude faster retrieval of a user’s
segments from the user profile store. Just get the record.
Advantages
10 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
▪ We can use the map remove_by_value_interval operation to trim
expired segments, called as a background scan operation (>= 4.7)
▪ Mainly, this allows for orders of magnitude faster retrieval of a user’s
segments from the user profile store
Advantages
11 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
List operations supported by the server. Method names in the clients might be different.
• General Write Flags: (create_only, update_only, no_fail, partial)
• resize()
• insert(), remove(), set()
• or(), and(), xor(), not()
• lshift(), rshift()
• add(), subtract(), set-integer()
• get(), count()
• lscan(), rscan()
• get-integer()
Bitwise Operations
12 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
▪ Represent the segments as a continuous bitfield
▪ Each integer is a bit position. Set the bit for a segment the user is in
▪ Bitwise operations to check server-side if user is in multiple segments
▪ Compresses extremely well in Enterprise Edition
▪ Caveat: can't apply a TTL to the segments
Modeling with Bitfields
13 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
List & Map API
▪ https://www.aerospike.com/docs/guide/cdt-list.html
▪ https://www.aerospike.com/docs/guide/cdt-map.html
▪ https://www.aerospike.com/docs/guide/cdt-context.html
▪ https://www.aerospike.com/docs/guide/cdt-ordering.html
▪ https://aerospike-python-client.readthedocs.io/en/latest/aerospike_helpers.operations.html
▪ https://www.aerospike.com/apidocs/java/com/aerospike/client/cdt/ListOperation.html
Code Samples
▪ https://github.com/aerospike-examples/modeling-user-segmentation
Aerospike Training
▪ https://www.aerospike.com/training/
More material you can explore:
Thank You!
Any questions?
ronen@aerospike.com

Contenu connexe

Similaire à Aerospike Data Modeling - Meetup Dec 2019

Externalized Distributed Configuration Management with Spring Cloud Config-Se...
Externalized Distributed Configuration Management with Spring Cloud Config-Se...Externalized Distributed Configuration Management with Spring Cloud Config-Se...
Externalized Distributed Configuration Management with Spring Cloud Config-Se...Nikhil Hiremath
 
Application Modernization using the Strangler Pattern
Application Modernization using the Strangler PatternApplication Modernization using the Strangler Pattern
Application Modernization using the Strangler PatternTom Laszewski
 
IDERA Slides: Managing Complex Data Environments
IDERA Slides: Managing Complex Data EnvironmentsIDERA Slides: Managing Complex Data Environments
IDERA Slides: Managing Complex Data EnvironmentsDATAVERSITY
 
Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...
Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...
Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...DataStax Academy
 
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...Trivadis
 
Breaking the Monolith road to containers.pdf
Breaking the Monolith road to containers.pdfBreaking the Monolith road to containers.pdf
Breaking the Monolith road to containers.pdfAmazon Web Services
 
Breaking the Monolith road to containers.pdf
Breaking the Monolith road to containers.pdfBreaking the Monolith road to containers.pdf
Breaking the Monolith road to containers.pdfAmazon Web Services
 
Less13 performance
Less13 performanceLess13 performance
Less13 performanceAmit Bhalla
 
Oracle GoldenGate Performance Tuning
Oracle GoldenGate Performance TuningOracle GoldenGate Performance Tuning
Oracle GoldenGate Performance TuningBobby Curtis
 
Performance Schema and Sys Schema in MySQL 5.7
Performance Schema and Sys Schema in MySQL 5.7Performance Schema and Sys Schema in MySQL 5.7
Performance Schema and Sys Schema in MySQL 5.7Mark Leith
 
MySQL Cluster Asynchronous replication (2014)
MySQL Cluster Asynchronous replication (2014) MySQL Cluster Asynchronous replication (2014)
MySQL Cluster Asynchronous replication (2014) Frazer Clement
 
Aerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data DemystifiedAerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data DemystifiedOmid Vahdaty
 
In Mind Cloud - Product Release - 1904
In Mind Cloud - Product Release - 1904In Mind Cloud - Product Release - 1904
In Mind Cloud - Product Release - 1904In Mind Cloud
 
MXNet Paris Workshop - Intro To MXNet
MXNet Paris Workshop - Intro To MXNetMXNet Paris Workshop - Intro To MXNet
MXNet Paris Workshop - Intro To MXNetApache MXNet
 
Oracle 12.2 - My Favorite Top 5 New or Improved Features
Oracle 12.2 - My Favorite Top 5 New or Improved FeaturesOracle 12.2 - My Favorite Top 5 New or Improved Features
Oracle 12.2 - My Favorite Top 5 New or Improved FeaturesSolarWinds
 
MySQL Performance Tuning: The Perfect Scalability (OOW2019)
MySQL Performance Tuning: The Perfect Scalability (OOW2019)MySQL Performance Tuning: The Perfect Scalability (OOW2019)
MySQL Performance Tuning: The Perfect Scalability (OOW2019)Mirko Ortensi
 
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18Pawel Maczka
 
MySQL 8.0 - Security Features
MySQL 8.0 - Security FeaturesMySQL 8.0 - Security Features
MySQL 8.0 - Security FeaturesHarin Vadodaria
 

Similaire à Aerospike Data Modeling - Meetup Dec 2019 (20)

Externalized Distributed Configuration Management with Spring Cloud Config-Se...
Externalized Distributed Configuration Management with Spring Cloud Config-Se...Externalized Distributed Configuration Management with Spring Cloud Config-Se...
Externalized Distributed Configuration Management with Spring Cloud Config-Se...
 
Application Modernization using the Strangler Pattern
Application Modernization using the Strangler PatternApplication Modernization using the Strangler Pattern
Application Modernization using the Strangler Pattern
 
IDERA Slides: Managing Complex Data Environments
IDERA Slides: Managing Complex Data EnvironmentsIDERA Slides: Managing Complex Data Environments
IDERA Slides: Managing Complex Data Environments
 
Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...
Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...
Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...
 
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
TechEvent 2019: Create a Private Database Cloud in the Public Cloud using the...
 
Amazon Aurora
Amazon AuroraAmazon Aurora
Amazon Aurora
 
Breaking the Monolith road to containers.pdf
Breaking the Monolith road to containers.pdfBreaking the Monolith road to containers.pdf
Breaking the Monolith road to containers.pdf
 
Breaking the Monolith road to containers.pdf
Breaking the Monolith road to containers.pdfBreaking the Monolith road to containers.pdf
Breaking the Monolith road to containers.pdf
 
Less13 performance
Less13 performanceLess13 performance
Less13 performance
 
Oracle GoldenGate Performance Tuning
Oracle GoldenGate Performance TuningOracle GoldenGate Performance Tuning
Oracle GoldenGate Performance Tuning
 
Performance Schema and Sys Schema in MySQL 5.7
Performance Schema and Sys Schema in MySQL 5.7Performance Schema and Sys Schema in MySQL 5.7
Performance Schema and Sys Schema in MySQL 5.7
 
MySQL Cluster Asynchronous replication (2014)
MySQL Cluster Asynchronous replication (2014) MySQL Cluster Asynchronous replication (2014)
MySQL Cluster Asynchronous replication (2014)
 
Aerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data DemystifiedAerospike meetup july 2019 | Big Data Demystified
Aerospike meetup july 2019 | Big Data Demystified
 
In Mind Cloud - Product Release - 1904
In Mind Cloud - Product Release - 1904In Mind Cloud - Product Release - 1904
In Mind Cloud - Product Release - 1904
 
Self Driving Storage
Self Driving StorageSelf Driving Storage
Self Driving Storage
 
MXNet Paris Workshop - Intro To MXNet
MXNet Paris Workshop - Intro To MXNetMXNet Paris Workshop - Intro To MXNet
MXNet Paris Workshop - Intro To MXNet
 
Oracle 12.2 - My Favorite Top 5 New or Improved Features
Oracle 12.2 - My Favorite Top 5 New or Improved FeaturesOracle 12.2 - My Favorite Top 5 New or Improved Features
Oracle 12.2 - My Favorite Top 5 New or Improved Features
 
MySQL Performance Tuning: The Perfect Scalability (OOW2019)
MySQL Performance Tuning: The Perfect Scalability (OOW2019)MySQL Performance Tuning: The Perfect Scalability (OOW2019)
MySQL Performance Tuning: The Perfect Scalability (OOW2019)
 
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
IBM Spectrum Protect and IBM Spectrum Protect Plus - What's new! June '18
 
MySQL 8.0 - Security Features
MySQL 8.0 - Security FeaturesMySQL 8.0 - Security Features
MySQL 8.0 - Security Features
 

Plus de Aerospike

Aerospike-AppsFlyer COVID-19 Crisis Growth Elad Leev
Aerospike-AppsFlyer COVID-19 Crisis Growth Elad LeevAerospike-AppsFlyer COVID-19 Crisis Growth Elad Leev
Aerospike-AppsFlyer COVID-19 Crisis Growth Elad LeevAerospike
 
Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database - ...
Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database - ...Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database - ...
Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database - ...Aerospike
 
Contentsquare Aerospike Usage and COVID-19 Impact - Doron Hoffman
Contentsquare Aerospike Usage and COVID-19 Impact - Doron HoffmanContentsquare Aerospike Usage and COVID-19 Impact - Doron Hoffman
Contentsquare Aerospike Usage and COVID-19 Impact - Doron HoffmanAerospike
 
Handling Increasing Load and Reducing Costs During COVID-19 Crisis - Oshrat &...
Handling Increasing Load and Reducing Costs During COVID-19 Crisis - Oshrat &...Handling Increasing Load and Reducing Costs During COVID-19 Crisis - Oshrat &...
Handling Increasing Load and Reducing Costs During COVID-19 Crisis - Oshrat &...Aerospike
 
Aerospike Meetup - Introduction - Ami - 04 March 2020
Aerospike Meetup - Introduction - Ami - 04 March 2020Aerospike Meetup - Introduction - Ami - 04 March 2020
Aerospike Meetup - Introduction - Ami - 04 March 2020Aerospike
 
Aerospike Meetup - Real Time Insights using Spark with Aerospike - Zohar - 04...
Aerospike Meetup - Real Time Insights using Spark with Aerospike - Zohar - 04...Aerospike Meetup - Real Time Insights using Spark with Aerospike - Zohar - 04...
Aerospike Meetup - Real Time Insights using Spark with Aerospike - Zohar - 04...Aerospike
 
Aerospike Meetup - Nielsen Customer Story - Alex - 04 March 2020
Aerospike Meetup - Nielsen Customer Story - Alex - 04 March 2020Aerospike Meetup - Nielsen Customer Story - Alex - 04 March 2020
Aerospike Meetup - Nielsen Customer Story - Alex - 04 March 2020Aerospike
 
Aerospike Roadmap Overview - Meetup Dec 2019
Aerospike Roadmap Overview - Meetup Dec 2019Aerospike Roadmap Overview - Meetup Dec 2019
Aerospike Roadmap Overview - Meetup Dec 2019Aerospike
 
Aerospike Nested CDTs - Meetup Dec 2019
Aerospike Nested CDTs - Meetup Dec 2019Aerospike Nested CDTs - Meetup Dec 2019
Aerospike Nested CDTs - Meetup Dec 2019Aerospike
 
JDBC Driver for Aerospike - Meetup Dec 2019
JDBC Driver for Aerospike - Meetup Dec 2019JDBC Driver for Aerospike - Meetup Dec 2019
JDBC Driver for Aerospike - Meetup Dec 2019Aerospike
 

Plus de Aerospike (10)

Aerospike-AppsFlyer COVID-19 Crisis Growth Elad Leev
Aerospike-AppsFlyer COVID-19 Crisis Growth Elad LeevAerospike-AppsFlyer COVID-19 Crisis Growth Elad Leev
Aerospike-AppsFlyer COVID-19 Crisis Growth Elad Leev
 
Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database - ...
Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database - ...Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database - ...
Handling Increasing Load and Reducing Costs Using Aerospike NoSQL Database - ...
 
Contentsquare Aerospike Usage and COVID-19 Impact - Doron Hoffman
Contentsquare Aerospike Usage and COVID-19 Impact - Doron HoffmanContentsquare Aerospike Usage and COVID-19 Impact - Doron Hoffman
Contentsquare Aerospike Usage and COVID-19 Impact - Doron Hoffman
 
Handling Increasing Load and Reducing Costs During COVID-19 Crisis - Oshrat &...
Handling Increasing Load and Reducing Costs During COVID-19 Crisis - Oshrat &...Handling Increasing Load and Reducing Costs During COVID-19 Crisis - Oshrat &...
Handling Increasing Load and Reducing Costs During COVID-19 Crisis - Oshrat &...
 
Aerospike Meetup - Introduction - Ami - 04 March 2020
Aerospike Meetup - Introduction - Ami - 04 March 2020Aerospike Meetup - Introduction - Ami - 04 March 2020
Aerospike Meetup - Introduction - Ami - 04 March 2020
 
Aerospike Meetup - Real Time Insights using Spark with Aerospike - Zohar - 04...
Aerospike Meetup - Real Time Insights using Spark with Aerospike - Zohar - 04...Aerospike Meetup - Real Time Insights using Spark with Aerospike - Zohar - 04...
Aerospike Meetup - Real Time Insights using Spark with Aerospike - Zohar - 04...
 
Aerospike Meetup - Nielsen Customer Story - Alex - 04 March 2020
Aerospike Meetup - Nielsen Customer Story - Alex - 04 March 2020Aerospike Meetup - Nielsen Customer Story - Alex - 04 March 2020
Aerospike Meetup - Nielsen Customer Story - Alex - 04 March 2020
 
Aerospike Roadmap Overview - Meetup Dec 2019
Aerospike Roadmap Overview - Meetup Dec 2019Aerospike Roadmap Overview - Meetup Dec 2019
Aerospike Roadmap Overview - Meetup Dec 2019
 
Aerospike Nested CDTs - Meetup Dec 2019
Aerospike Nested CDTs - Meetup Dec 2019Aerospike Nested CDTs - Meetup Dec 2019
Aerospike Nested CDTs - Meetup Dec 2019
 
JDBC Driver for Aerospike - Meetup Dec 2019
JDBC Driver for Aerospike - Meetup Dec 2019JDBC Driver for Aerospike - Meetup Dec 2019
JDBC Driver for Aerospike - Meetup Dec 2019
 

Dernier

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Dernier (20)

Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Aerospike Data Modeling - Meetup Dec 2019

  • 1. Aerospike Modeling User Segmentation with Maps and Bitfields
  • 2. 2 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
  • 3. 3 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. ▪ Cassandra databases, including derivatives such as ScyllaDB, have a needle in a haystack problem ▪ In C* each user ID – segment ID pair is in its own row ▪ This affects performance when you need low latency key-value operations ▪ In Aerospike we keep all the segments together in a single record tl;dr
  • 4. 4 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. ▪ In digital advertising user profiles stores assist with audience segmentation ▪ The goal is to pull user segments for a specific user as fast as possible ▪ Modeling this use case is generally applicable to other forms of online personalization User Profile Stores
  • 5. 5 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. CREATE TABLE userspace.user_segments ( user_id uuid, segment_id int, attr smallint, attr2 smallint, PRIMARY KEY ((user_id, segment_id), user_id) ) ▪ On average 1000 segments per profile ▪ 50 billion cookies means 50 trillion rows ▪ Large latency to find 1000 segments of a user from a huge number of rows Modeling in Cassandra
  • 6. 6 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. {segmentID: [segment-TTL, {attr1, attr2}]} { 8457: [8889*, {}], 12845: [8889, {}], 42199: [8889, {}], 43696: [8889, {}], } ▪ * Segment TTL uses local epoch (hours since epoch) ▪ The map ordering options are UNORDERED, K-ORDERED and KV-ORDERED ▪ Choosing K-ORDERED gives the best performance for data on SSD Modeling in Aerospike
  • 7. 7 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. ▪ We can easily upsert into the map new user segments as they are processed (https://github.com/aerospike-examples/modeling-user-segmentation) Advantages
  • 8. 8 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. ▪ We can use get_by_value_interval to filter segments that have a specific ‘freshness’ Advantages
  • 9. 9 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. ▪ We can use the map remove_by_value_interval operation to trim expired segments ▪ Mainly, this allows for orders of magnitude faster retrieval of a user’s segments from the user profile store. Just get the record. Advantages
  • 10. 10 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. ▪ We can use the map remove_by_value_interval operation to trim expired segments, called as a background scan operation (>= 4.7) ▪ Mainly, this allows for orders of magnitude faster retrieval of a user’s segments from the user profile store Advantages
  • 11. 11 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. List operations supported by the server. Method names in the clients might be different. • General Write Flags: (create_only, update_only, no_fail, partial) • resize() • insert(), remove(), set() • or(), and(), xor(), not() • lshift(), rshift() • add(), subtract(), set-integer() • get(), count() • lscan(), rscan() • get-integer() Bitwise Operations
  • 12. 12 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. ▪ Represent the segments as a continuous bitfield ▪ Each integer is a bit position. Set the bit for a segment the user is in ▪ Bitwise operations to check server-side if user is in multiple segments ▪ Compresses extremely well in Enterprise Edition ▪ Caveat: can't apply a TTL to the segments Modeling with Bitfields
  • 13. 13 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc. List & Map API ▪ https://www.aerospike.com/docs/guide/cdt-list.html ▪ https://www.aerospike.com/docs/guide/cdt-map.html ▪ https://www.aerospike.com/docs/guide/cdt-context.html ▪ https://www.aerospike.com/docs/guide/cdt-ordering.html ▪ https://aerospike-python-client.readthedocs.io/en/latest/aerospike_helpers.operations.html ▪ https://www.aerospike.com/apidocs/java/com/aerospike/client/cdt/ListOperation.html Code Samples ▪ https://github.com/aerospike-examples/modeling-user-segmentation Aerospike Training ▪ https://www.aerospike.com/training/ More material you can explore: