SlideShare une entreprise Scribd logo
1  sur  32
Télécharger pour lire hors ligne
© 2014 Aerospike. All rights reserved. Confidential 1
What Enterprises Can Learn
from Real-time Bidding
How, and why, to achieve
Operational Big Data
Brian Bulkowski
CTO and co-founder
Aerospike
© 2014 Aerospike. All rights reserved. Confidential 2
REQUIREMENTS FOR INTERNET ENTERPRISES
© 2014 Aerospike. All rights reserved. Confidential 3
Introduction to Advertising: Real-time Bidding
© 2014 Aerospike. All rights reserved. Confidential 4
North American RTB speeds & feeds
■  1 to 6 billion cookies tracked
■  Some companies track 200M, some track 20B
■  Each bidder has their own data pool
■  Data is your weapon
■  Recent searches, behavior, IP addresses
■  Audience clusters (K-cluster, K-means) from offline Hadoop
■  “Remnant” from Google, Yahoo is about 0.6 million / sec
■  Facebook exchange: about 0.6 million / sec
■  “other” is 0.5 million / sec
Currently about 3.0M / sec in North American
© 2014 Aerospike. All rights reserved. Confidential 5
Advertising requirements
■  100 millisecond or 150 millisecond ad delivery
■  De-facto standard set in 2004 by Washington Post and others
■  North America is 70 to 90 milliseconds wide
■  Two or three data centers
■  Auction is limited to 30 milliseconds
■  Typically closes in 5 milliseconds
■  Winners have more data, better models – in 5 milliseconds
© 2014 Aerospike. All rights reserved. Confidential 6
MILLIONS OF CONSUMERS
BILLIONS OF DEVICES
APP SERVERS
DATA
WAREHOUSEINSIGHTS
Advertising Technology Stack
WRITE CONTEXT
OPERATIONAL DB
WRITE REAL-TIME CONTEXT
READ RECENT CONTENT
PROFILE STORE
Cookies, email, deviceID, IP address, location,
segments, clicks, likes, tweets, search terms...
REAL-TIME ANALYTICS
Best sellers, top scores, trending tweets
BATCH ANALYTICS
Discover patterns,
segment data:
location patterns,
audience affinity
© 2014 Aerospike. All rights reserved. Confidential 7
Financial Services – Intraday Positions
LEGACY DATABASE
(MAINFRAME)
Read/Write
Start of Day
Data Loading
End of Day
Reconciliation
Query
REAL-TIME
DATA FEED
ACCOUNT
POSITIONS
XDR
10M+ user records
Primary key access
1M+ TPS planned
Finance App
Records App
RT Reporting App
© 2014 Aerospike. All rights reserved. Confidential 8
Social Media
MYSQL or POSTGRES
(ROTATIONAL DISK)
Recent user
generated content
Java application tier
Data abstraction
and sharding
MODIFIED REDIS
(SSD ENABLED)
Content and
Historical data
© 2014 Aerospike. All rights reserved. Confidential 9
Travel Portal
PRICING DATABASE
(RATE LIMITED)
Poll for
Pricing
Changes
PRICING
DATA
Store
Latest
Price
SESSION
MANAGEMENT
Session
Data
Read
Price
XDR
Airlines forced interstate
banking
Legacy mainframe
technology
Multi-company
reservation and pricing
Requirement: 1M TPS
allowing overhead
Travel App
© 2014 Aerospike. All rights reserved. Confidential 10
SOURCE
DEVICE/ USER
QOS & Real-Time Billing for Telcos
■  In-switch Per HTTP request Billing
■ US Telcos: 200M subscribers, 50 metros
■  In-memory use case
Hot Standby
Execute Request
Real-time
Checks
DESTINATION
Update
Device
User
Settings
Request
XDR
Real-time Auth. QoS Billing
Config Module App
© 2014 Aerospike. All rights reserved. Confidential 11
MILLIONS OF CONSUMERS
BILLIONS OF DEVICES
APP SERVERS
BATCH
ANALYTICSINSIGHTS
BATCH ANALYTICS
The New Architecture
WRITE CONTEXT
TRANSACTIONS &
HOT ANALYTICS
© 2014 Aerospike. All rights reserved. Confidential 12
Old Architecture ( scale out in 2000 )
Request routing and sharding
APP SERVERS
CACHE
DATABASE
STORAGE
CONTENT
DELIVERY NETWORK
LOAD BALANCER
© 2014 Aerospike. All rights reserved. Confidential 13
Modern Scale Out Architecture
Load balancer
Simple stateless
APP SERVERS
IN-MEMORY NoSQL
RESEARCH
WAREHOUSE
CONTENT
DELIVERY NETWORK
LOAD BALANCER
Long term cold storageFast stateless
© 2014 Aerospike. All rights reserved. Confidential 14
Modern Scale Out Architecture
Load balancer
Simple stateless
APP SERVERS
IN-MEMORY NoSQL
RESEARCH
WAREHOUSE
CONTENT
DELIVERY NETWORK
LOAD BALANCER
Long term cold storageFast stateless
HDFS BASED
© 2014 Aerospike. All rights reserved. Confidential 15
Build a data layer
Use open source
Focus on Key Value
Use In-memory NoSQL
Use Flash
© 2014 Aerospike. All rights reserved. Confidential 16
How Fast You Can Go
And
How To Do It
© 2014 Aerospike. All rights reserved. Confidential 17
SHARED-NOTHING SYSTEM:100% DATA
AVAILABILITY
■  Every node in a cluster is identical,
handles both transactions and long
running tasks
■  Data is replicated synchronously with
immediate consistency within the cluster
■  Data is replicated asynchronously
across data centers
OHIO Data Center
© 2014 Aerospike. All rights reserved. Confidential 18
LESSONS
1.  Optimize key-value code paths
■  No hot spots (e.g., robust DHT)
■  Scales up easily (e.g., easy to size)
■  Avoids points of failure (e.g., single node type)
■  Binary protocol
2.  Code in C
■  Read() / Write() / Linux AIO ( don’t trust a library )
■  Multithread !
■  Direct device
■  C++ could work, leads to structural complexity
3.  Memory allocation matters
■  Stack based allocators
■  Own stack allocator
■  JEMalloc for pools
© 2014 Aerospike. All rights reserved. Confidential 19
LESSONS (cont’d)
4.  Innovation: masters in a shared nothing system
■  Fast cluster organization
■  Fast transaction capabilities
■  Can be CP or AP - and resolve data accurately
5.  Clients are hard
■  Fast stable connection pools are hard
■  API design matters
■  Slow languages need Aerospike more
6.  Aggregations / queries required
■  Row oriented
■  Secondary indexes as filters, and MR style
■  Data transformation hurts
© 2014 Aerospike. All rights reserved. Confidential 20
LESSONS (cont’d)
7.  Network interrupts are painful
■  TCP is still better than UDP
■  Some great hardware out there (solarflare)
■  Network queues, RRD
■  New PCI-e, other interfaces are not there yet
■  Larger clusters solve interface issues
8.  “Large data types” (better documents)
■  Ordered List, Map, Set, Stack
■  Time series and documents
■  Beyond per-row storage layout, more optimal than document
9.  UDFs for flexibility
© 2014 Aerospike. All rights reserved. Confidential 21
WRITING RELIABILY WITH HIGH PERFORMANCE
1.  Write sent to row master
2.  Latch against simultaneous writes
3.  Apply write to master memory and replica
memory synchronously
4.  Queue operations to disk
5.  Signal completed transaction (optional
storage commit wait)
6.  Master applies conflict resolution policy
(rollback/ rollforward)
master replica
1.  Cluster discovers new node via gossip
protocol
2.  Paxos vote determines new data
organization
3.  Partition migrations scheduled
4.  When a partition migration starts, write
journal starts on destination
5.  Partition moves atomically
6.  Journal is applied and source data deleted
transactions
continue
Writing with Immediate Consistency Adding a Node
© 2014 Aerospike. All rights reserved. Confidential 22
Key Value Store + Lists, Maps
■  Namespaces (policy containers)
■  Determine storage - DRAM or Flash
■  Determine replication factor
■  Contain records and sets
■  Sets (tables) of records
■  Arbitrary grouping
■  Records (rows) of key/bins
■  Block size (128k – 2MB)
■  Bin with same name can contain values
of different types
■  String, integer, bytes (raw, blob, etc)
■  list ( an ordered collection of values )
■  map ( a collection of keys and values )
■  Bins can be added anytime
■  Meta data
■  Generation counter so apps can ensure that a
record was not modified since last read
■  Time-to-live value for auto expiration, keeping most
recent context or "hot" data, aging out historical
context
© 2014 Aerospike. All rights reserved. Confidential 23
KVS + Lists, Maps + Queries + UDFs
STREAM
AGGREGATIONS
(INDEXED MAP-REDUCE)
Pipe Query results
through UDFs
■  Filter, Transform,
Aggregate.. Map,
Reduce
■  Enforce security
■  UDFs in Lua to
■ CRUD a record
■ Calculation based
on data within a
record
■ Iterate through a
set / namespace of
records
■  UDFs for real-time
analytics and
aggregations
© 2014 Aerospike. All rights reserved. Confidential 24
SQL & NoSQL
➤ Secondary index
§  Equality, Range, IN (,,,), Compound
§  e.g. WHERE group_id = 1234,
WHERE last_activity > 1349293398, WHERE
branch_id IN (5,6,7,8)
➤ Filters
§  SQL: Where clause with non-indexed “AND”s
(e.g. “AND gender=‘M’ ”)
§  NOSQL: Map step
➤ Aggregation
§  SQL: GROUP BY, ORDER BY, LIMIT, OFFSET
§  NOSQL: Reduce step
Secondary Key
Primary Key
Record
Filter Map
Aggregate
DRAM
SSD
Aggregate
Client
Client
Server
Reduce
Aggregate
Query
© 2014 Aerospike. All rights reserved. Confidential 25
Flash storage
The Power of Flash Storage
© 2014 Aerospike. All rights reserved. Confidential 26
DATABASE
OS FILE SYSTEM
PAGE CACHE
BLOCK INTERFACE
SSD HDD
BLOCK INTERFACE
SSD SSD
OPEN NVM
SSD
Ask me and I’ll tell you the answer.Ask me. I’ll look up the answer and then tell it to
you.
DATABASE
HYBRID MEMORY SYSTEM™
•  Direct device access
•  Large Block Writes
•  Indexes in DRAM
•  Highly Parallelized
•  Log-structured FS “copy-on-write”
•  Fast restart with shared memory
FLASH OPTIMIZED HIGH
PERFORMANCE
© 2014 Aerospike. All rights reserved. Confidential 27© 2012 Aerospike. All rights reserved. Pg. 27
Measure your drives!
Aerospike Certification Tool (ACT)
http://github.com/aerospike/act
Transactional database workload
Reads: 1.5KB
(can’t batch / cache reads, random)
Writes: 128K blocks
(log based layout)
(plus defragmentation)
Turn up the load until
latency is over required SLA
© 2014 Aerospike. All rights reserved. Confidential 28
Aerospike’s Flash Experience
■  Know your Flash
■ ACT benchmark http://github.com/aerospike/act
■ Read-write benchmark results back to 2011
■  All clouds support flash now
■ New EC2 instances
■ Google Compute
■ Internap, Softlayer, GoGrid…
■  Write durability usually not a problem with modern flash
■ Durability is high (5 “drive writes per day” for 5 years, etc)
■ Read performance suffers under write load anyway
© 2014 Aerospike. All rights reserved. Confidential 29
Aerospike’s Flash Experience
■  Densities increasing
■ 100G 2 years ago à 800G today
■ SATA vs PCI-E
■ Appliances: 50T per 1U this year
■  Prices still dropping: perhaps $1/G next year
■  Intel P3700 results
■ 250K per device @ $2.5 / G
■ Old standard: Micron P320h 500K @ $8 / G
■  “Wide SATA”
■ 20 SATA drives
■ LSI “pass through mode”
■ 250K+ per server
© 2014 Aerospike. All rights reserved. Confidential 30
Use Open Source
© 2014 Aerospike. All rights reserved. Confidential 31
Aerospike: the trusted In-Memory NoSQL
Performance
• Over ten trillion transactions per month
• 99% of transactions < 2 ms
• 150K TPS per server
Scalability
• Billions of Internet users
• Clustered Software
• Maintenance without downtime
• Scale up & scale out
Reliability
• 50 customers; zero down-time
• Immediate Consistency
• Rapid Failover; Data Center Replication
Price/Performance
• Makes impossible projects affordable
• Flash-optimized
• 1/10 the servers required
© 2014 Aerospike. All rights reserved. Confidential 32

Contenu connexe

Tendances

Democratizing Memory Storage
Democratizing Memory StorageDemocratizing Memory Storage
Democratizing Memory StorageDataWorks Summit
 
Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance DataWorks Summit/Hadoop Summit
 
Less is More: 2X Storage Efficiency with HDFS Erasure Coding
Less is More: 2X Storage Efficiency with HDFS Erasure CodingLess is More: 2X Storage Efficiency with HDFS Erasure Coding
Less is More: 2X Storage Efficiency with HDFS Erasure CodingZhe Zhang
 
Hadoop Meetup Jan 2019 - Hadoop Encryption
Hadoop Meetup Jan 2019 - Hadoop EncryptionHadoop Meetup Jan 2019 - Hadoop Encryption
Hadoop Meetup Jan 2019 - Hadoop EncryptionErik Krogen
 
Tachyon meetup slides.
Tachyon meetup slides.Tachyon meetup slides.
Tachyon meetup slides.David Groozman
 
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio, Inc.
 
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature DataWorks Summit
 
Hadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraHadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraDataWorks Summit
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive
 
Ceph and RocksDB
Ceph and RocksDBCeph and RocksDB
Ceph and RocksDBSage Weil
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteDataWorks Summit
 
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFSHadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFSErik Krogen
 
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringHadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringErik Krogen
 
Interactive Hadoop via Flash and Memory
Interactive Hadoop via Flash and MemoryInteractive Hadoop via Flash and Memory
Interactive Hadoop via Flash and MemoryChris Nauroth
 

Tendances (20)

Democratizing Memory Storage
Democratizing Memory StorageDemocratizing Memory Storage
Democratizing Memory Storage
 
Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance Debunking the Myths of HDFS Erasure Coding Performance
Debunking the Myths of HDFS Erasure Coding Performance
 
HDFS Tiered Storage
HDFS Tiered StorageHDFS Tiered Storage
HDFS Tiered Storage
 
Less is More: 2X Storage Efficiency with HDFS Erasure Coding
Less is More: 2X Storage Efficiency with HDFS Erasure CodingLess is More: 2X Storage Efficiency with HDFS Erasure Coding
Less is More: 2X Storage Efficiency with HDFS Erasure Coding
 
Hadoop Meetup Jan 2019 - Hadoop Encryption
Hadoop Meetup Jan 2019 - Hadoop EncryptionHadoop Meetup Jan 2019 - Hadoop Encryption
Hadoop Meetup Jan 2019 - Hadoop Encryption
 
Tachyon meetup slides.
Tachyon meetup slides.Tachyon meetup slides.
Tachyon meetup slides.
 
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the CloudAlluxio+Presto: An Architecture for Fast SQL in the Cloud
Alluxio+Presto: An Architecture for Fast SQL in the Cloud
 
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
Reduce Storage Costs by 5x Using The New HDFS Tiered Storage Feature
 
Hadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraHadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native Era
 
The Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDBThe Hive Think Tank: Rocking the Database World with RocksDB
The Hive Think Tank: Rocking the Database World with RocksDB
 
HDF - Current status and Future Directions
HDF - Current status and Future Directions HDF - Current status and Future Directions
HDF - Current status and Future Directions
 
Ceph and RocksDB
Ceph and RocksDBCeph and RocksDB
Ceph and RocksDB
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
 
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFSHadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
Hadoop Meetup Jan 2019 - Mounting Remote Stores in HDFS
 
Cross-DC Fault-Tolerant ViewFileSystem @ Twitter
Cross-DC Fault-Tolerant ViewFileSystem @ TwitterCross-DC Fault-Tolerant ViewFileSystem @ Twitter
Cross-DC Fault-Tolerant ViewFileSystem @ Twitter
 
HDF5 <-> Zarr
HDF5 <-> ZarrHDF5 <-> Zarr
HDF5 <-> Zarr
 
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage TieringHadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
Hadoop Meetup Jan 2019 - Router-Based Federation and Storage Tiering
 
HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Interactive Hadoop via Flash and Memory
Interactive Hadoop via Flash and MemoryInteractive Hadoop via Flash and Memory
Interactive Hadoop via Flash and Memory
 

En vedette

Tadas Pivorius. Married to Cassandra
Tadas Pivorius. Married to CassandraTadas Pivorius. Married to Cassandra
Tadas Pivorius. Married to CassandraVolha Banadyseva
 
The role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial InformaticsThe role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial InformaticsAerospike, Inc.
 
Optimizing your job apply pages with the LinkedIn profile API
Optimizing your job apply pages with the LinkedIn profile APIOptimizing your job apply pages with the LinkedIn profile API
Optimizing your job apply pages with the LinkedIn profile APIIvo Brett
 
Building your first app with mongo db
Building your first app with mongo dbBuilding your first app with mongo db
Building your first app with mongo dbMongoDB
 
Rapid Application Design in Financial Services
Rapid Application Design in Financial ServicesRapid Application Design in Financial Services
Rapid Application Design in Financial ServicesAerospike
 
Building Your First Application with MongoDB
Building Your First Application with MongoDBBuilding Your First Application with MongoDB
Building Your First Application with MongoDBMongoDB
 
Agile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDBAgile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDBStennie Steneker
 
Mongo db multidc_webinar
Mongo db multidc_webinarMongo db multidc_webinar
Mongo db multidc_webinarMongoDB
 
Creating a Single View Part 1: Overview and Data Analysis
Creating a Single View Part 1: Overview and Data AnalysisCreating a Single View Part 1: Overview and Data Analysis
Creating a Single View Part 1: Overview and Data AnalysisMongoDB
 
MongoDB- Crud Operation
MongoDB- Crud OperationMongoDB- Crud Operation
MongoDB- Crud OperationEdureka!
 
Real World MongoDB: Use Cases from Financial Services by Daniel Roberts
Real World MongoDB: Use Cases from Financial Services by Daniel RobertsReal World MongoDB: Use Cases from Financial Services by Daniel Roberts
Real World MongoDB: Use Cases from Financial Services by Daniel RobertsMongoDB
 
High Performance Applications with MongoDB
High Performance Applications with MongoDBHigh Performance Applications with MongoDB
High Performance Applications with MongoDBMongoDB
 
Oracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingOracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingGuido Schmutz
 
How Financial Services Organizations Use MongoDB
How Financial Services Organizations Use MongoDBHow Financial Services Organizations Use MongoDB
How Financial Services Organizations Use MongoDBMongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBEdureka!
 
Extended 360 degree view of customer
Extended 360 degree view of customerExtended 360 degree view of customer
Extended 360 degree view of customerTrisha Dutta
 

En vedette (20)

Redis vs Aerospike
Redis vs AerospikeRedis vs Aerospike
Redis vs Aerospike
 
Tadas Pivorius. Married to Cassandra
Tadas Pivorius. Married to CassandraTadas Pivorius. Married to Cassandra
Tadas Pivorius. Married to Cassandra
 
The role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial InformaticsThe role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial Informatics
 
Optimizing your job apply pages with the LinkedIn profile API
Optimizing your job apply pages with the LinkedIn profile APIOptimizing your job apply pages with the LinkedIn profile API
Optimizing your job apply pages with the LinkedIn profile API
 
Building your first app with mongo db
Building your first app with mongo dbBuilding your first app with mongo db
Building your first app with mongo db
 
Rapid Application Design in Financial Services
Rapid Application Design in Financial ServicesRapid Application Design in Financial Services
Rapid Application Design in Financial Services
 
Introduction to mongoDB
Introduction to mongoDBIntroduction to mongoDB
Introduction to mongoDB
 
Building Your First Application with MongoDB
Building Your First Application with MongoDBBuilding Your First Application with MongoDB
Building Your First Application with MongoDB
 
Agile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDBAgile Schema Design: An introduction to MongoDB
Agile Schema Design: An introduction to MongoDB
 
Mongo db multidc_webinar
Mongo db multidc_webinarMongo db multidc_webinar
Mongo db multidc_webinar
 
Creating a Single View Part 1: Overview and Data Analysis
Creating a Single View Part 1: Overview and Data AnalysisCreating a Single View Part 1: Overview and Data Analysis
Creating a Single View Part 1: Overview and Data Analysis
 
MongoDB- Crud Operation
MongoDB- Crud OperationMongoDB- Crud Operation
MongoDB- Crud Operation
 
Real World MongoDB: Use Cases from Financial Services by Daniel Roberts
Real World MongoDB: Use Cases from Financial Services by Daniel RobertsReal World MongoDB: Use Cases from Financial Services by Daniel Roberts
Real World MongoDB: Use Cases from Financial Services by Daniel Roberts
 
High Performance Applications with MongoDB
High Performance Applications with MongoDBHigh Performance Applications with MongoDB
High Performance Applications with MongoDB
 
Oracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream ProcessingOracle Stream Analytics - Simplifying Stream Processing
Oracle Stream Analytics - Simplifying Stream Processing
 
How Financial Services Organizations Use MongoDB
How Financial Services Organizations Use MongoDBHow Financial Services Organizations Use MongoDB
How Financial Services Organizations Use MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Extended 360 degree view of customer
Extended 360 degree view of customerExtended 360 degree view of customer
Extended 360 degree view of customer
 
Introduction to Aerospike
Introduction to AerospikeIntroduction to Aerospike
Introduction to Aerospike
 
Customer Event Hub - the modern Customer 360° view
Customer Event Hub - the modern Customer 360° viewCustomer Event Hub - the modern Customer 360° view
Customer Event Hub - the modern Customer 360° view
 

Similaire à What enterprises can learn from Real Time Bidding

Brian Bulkowski : what startups can learn from real-time bidding
Brian Bulkowski : what startups can learn from real-time biddingBrian Bulkowski : what startups can learn from real-time bidding
Brian Bulkowski : what startups can learn from real-time biddingAerospike
 
Aerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike
 
You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?Aerospike, Inc.
 
Flash Economics and Lessons learned from operating low latency platforms at h...
Flash Economics and Lessons learned from operating low latency platforms at h...Flash Economics and Lessons learned from operating low latency platforms at h...
Flash Economics and Lessons learned from operating low latency platforms at h...Aerospike, Inc.
 
ACID & CAP: Clearing CAP Confusion and Why C In CAP ≠ C in ACID
ACID & CAP:  Clearing CAP Confusion and Why C In CAP ≠ C in ACIDACID & CAP:  Clearing CAP Confusion and Why C In CAP ≠ C in ACID
ACID & CAP: Clearing CAP Confusion and Why C In CAP ≠ C in ACIDAerospike, Inc.
 
Big Data Learnings from a Vendor's Perspective
Big Data Learnings from a Vendor's PerspectiveBig Data Learnings from a Vendor's Perspective
Big Data Learnings from a Vendor's PerspectiveAerospike, Inc.
 
Real-Time Analytics in Transactional Applications by Brian Bulkowski
Real-Time Analytics in Transactional Applications by Brian BulkowskiReal-Time Analytics in Transactional Applications by Brian Bulkowski
Real-Time Analytics in Transactional Applications by Brian BulkowskiData Con LA
 
Aerospike Architecture
Aerospike ArchitectureAerospike Architecture
Aerospike ArchitecturePeter Milne
 
NoSQL in Real-time Architectures
NoSQL in Real-time ArchitecturesNoSQL in Real-time Architectures
NoSQL in Real-time ArchitecturesRonen Botzer
 
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...Aerospike, Inc.
 
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...In-Memory Computing Summit
 
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyPilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyStuart Pook
 
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsDesign Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsAshish Mrig
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...Alluxio, Inc.
 
Data core overview - haluk-final
Data core overview - haluk-finalData core overview - haluk-final
Data core overview - haluk-finalHaluk Ulubay
 
HPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataHPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataLviv Startup Club
 
Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)Lviv Startup Club
 
Kudu: Fast Analytics on Fast Data
Kudu: Fast Analytics on Fast DataKudu: Fast Analytics on Fast Data
Kudu: Fast Analytics on Fast Datamichaelguia
 
Big data at United Airlines
Big data at United AirlinesBig data at United Airlines
Big data at United AirlinesDataWorks Summit
 

Similaire à What enterprises can learn from Real Time Bidding (20)

Brian Bulkowski : what startups can learn from real-time bidding
Brian Bulkowski : what startups can learn from real-time biddingBrian Bulkowski : what startups can learn from real-time bidding
Brian Bulkowski : what startups can learn from real-time bidding
 
Aerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower Manhattan
 
You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?
 
Flash Economics and Lessons learned from operating low latency platforms at h...
Flash Economics and Lessons learned from operating low latency platforms at h...Flash Economics and Lessons learned from operating low latency platforms at h...
Flash Economics and Lessons learned from operating low latency platforms at h...
 
ACID & CAP: Clearing CAP Confusion and Why C In CAP ≠ C in ACID
ACID & CAP:  Clearing CAP Confusion and Why C In CAP ≠ C in ACIDACID & CAP:  Clearing CAP Confusion and Why C In CAP ≠ C in ACID
ACID & CAP: Clearing CAP Confusion and Why C In CAP ≠ C in ACID
 
Big Data Learnings from a Vendor's Perspective
Big Data Learnings from a Vendor's PerspectiveBig Data Learnings from a Vendor's Perspective
Big Data Learnings from a Vendor's Perspective
 
Real-Time Analytics in Transactional Applications by Brian Bulkowski
Real-Time Analytics in Transactional Applications by Brian BulkowskiReal-Time Analytics in Transactional Applications by Brian Bulkowski
Real-Time Analytics in Transactional Applications by Brian Bulkowski
 
Aerospike Architecture
Aerospike ArchitectureAerospike Architecture
Aerospike Architecture
 
NoSQL in Real-time Architectures
NoSQL in Real-time ArchitecturesNoSQL in Real-time Architectures
NoSQL in Real-time Architectures
 
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
 
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
IMC Summit 2016 Breakout - Brian Bulkowski - NVMe, Storage Class Memory and O...
 
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyPilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
 
Design Choices for Cloud Data Platforms
Design Choices for Cloud Data PlatformsDesign Choices for Cloud Data Platforms
Design Choices for Cloud Data Platforms
 
How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...How the Development Bank of Singapore solves on-prem compute capacity challen...
How the Development Bank of Singapore solves on-prem compute capacity challen...
 
Data core overview - haluk-final
Data core overview - haluk-finalData core overview - haluk-final
Data core overview - haluk-final
 
HPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataHPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big Data
 
Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)Saviak lviv ai-2019-e-mail (1)
Saviak lviv ai-2019-e-mail (1)
 
Kudu: Fast Analytics on Fast Data
Kudu: Fast Analytics on Fast DataKudu: Fast Analytics on Fast Data
Kudu: Fast Analytics on Fast Data
 
Big data at United Airlines
Big data at United AirlinesBig data at United Airlines
Big data at United Airlines
 
Kudu austin oct 2015.pptx
Kudu austin oct 2015.pptxKudu austin oct 2015.pptx
Kudu austin oct 2015.pptx
 

Dernier

Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 

Dernier (20)

Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 

What enterprises can learn from Real Time Bidding

  • 1. © 2014 Aerospike. All rights reserved. Confidential 1 What Enterprises Can Learn from Real-time Bidding How, and why, to achieve Operational Big Data Brian Bulkowski CTO and co-founder Aerospike
  • 2. © 2014 Aerospike. All rights reserved. Confidential 2 REQUIREMENTS FOR INTERNET ENTERPRISES
  • 3. © 2014 Aerospike. All rights reserved. Confidential 3 Introduction to Advertising: Real-time Bidding
  • 4. © 2014 Aerospike. All rights reserved. Confidential 4 North American RTB speeds & feeds ■  1 to 6 billion cookies tracked ■  Some companies track 200M, some track 20B ■  Each bidder has their own data pool ■  Data is your weapon ■  Recent searches, behavior, IP addresses ■  Audience clusters (K-cluster, K-means) from offline Hadoop ■  “Remnant” from Google, Yahoo is about 0.6 million / sec ■  Facebook exchange: about 0.6 million / sec ■  “other” is 0.5 million / sec Currently about 3.0M / sec in North American
  • 5. © 2014 Aerospike. All rights reserved. Confidential 5 Advertising requirements ■  100 millisecond or 150 millisecond ad delivery ■  De-facto standard set in 2004 by Washington Post and others ■  North America is 70 to 90 milliseconds wide ■  Two or three data centers ■  Auction is limited to 30 milliseconds ■  Typically closes in 5 milliseconds ■  Winners have more data, better models – in 5 milliseconds
  • 6. © 2014 Aerospike. All rights reserved. Confidential 6 MILLIONS OF CONSUMERS BILLIONS OF DEVICES APP SERVERS DATA WAREHOUSEINSIGHTS Advertising Technology Stack WRITE CONTEXT OPERATIONAL DB WRITE REAL-TIME CONTEXT READ RECENT CONTENT PROFILE STORE Cookies, email, deviceID, IP address, location, segments, clicks, likes, tweets, search terms... REAL-TIME ANALYTICS Best sellers, top scores, trending tweets BATCH ANALYTICS Discover patterns, segment data: location patterns, audience affinity
  • 7. © 2014 Aerospike. All rights reserved. Confidential 7 Financial Services – Intraday Positions LEGACY DATABASE (MAINFRAME) Read/Write Start of Day Data Loading End of Day Reconciliation Query REAL-TIME DATA FEED ACCOUNT POSITIONS XDR 10M+ user records Primary key access 1M+ TPS planned Finance App Records App RT Reporting App
  • 8. © 2014 Aerospike. All rights reserved. Confidential 8 Social Media MYSQL or POSTGRES (ROTATIONAL DISK) Recent user generated content Java application tier Data abstraction and sharding MODIFIED REDIS (SSD ENABLED) Content and Historical data
  • 9. © 2014 Aerospike. All rights reserved. Confidential 9 Travel Portal PRICING DATABASE (RATE LIMITED) Poll for Pricing Changes PRICING DATA Store Latest Price SESSION MANAGEMENT Session Data Read Price XDR Airlines forced interstate banking Legacy mainframe technology Multi-company reservation and pricing Requirement: 1M TPS allowing overhead Travel App
  • 10. © 2014 Aerospike. All rights reserved. Confidential 10 SOURCE DEVICE/ USER QOS & Real-Time Billing for Telcos ■  In-switch Per HTTP request Billing ■ US Telcos: 200M subscribers, 50 metros ■  In-memory use case Hot Standby Execute Request Real-time Checks DESTINATION Update Device User Settings Request XDR Real-time Auth. QoS Billing Config Module App
  • 11. © 2014 Aerospike. All rights reserved. Confidential 11 MILLIONS OF CONSUMERS BILLIONS OF DEVICES APP SERVERS BATCH ANALYTICSINSIGHTS BATCH ANALYTICS The New Architecture WRITE CONTEXT TRANSACTIONS & HOT ANALYTICS
  • 12. © 2014 Aerospike. All rights reserved. Confidential 12 Old Architecture ( scale out in 2000 ) Request routing and sharding APP SERVERS CACHE DATABASE STORAGE CONTENT DELIVERY NETWORK LOAD BALANCER
  • 13. © 2014 Aerospike. All rights reserved. Confidential 13 Modern Scale Out Architecture Load balancer Simple stateless APP SERVERS IN-MEMORY NoSQL RESEARCH WAREHOUSE CONTENT DELIVERY NETWORK LOAD BALANCER Long term cold storageFast stateless
  • 14. © 2014 Aerospike. All rights reserved. Confidential 14 Modern Scale Out Architecture Load balancer Simple stateless APP SERVERS IN-MEMORY NoSQL RESEARCH WAREHOUSE CONTENT DELIVERY NETWORK LOAD BALANCER Long term cold storageFast stateless HDFS BASED
  • 15. © 2014 Aerospike. All rights reserved. Confidential 15 Build a data layer Use open source Focus on Key Value Use In-memory NoSQL Use Flash
  • 16. © 2014 Aerospike. All rights reserved. Confidential 16 How Fast You Can Go And How To Do It
  • 17. © 2014 Aerospike. All rights reserved. Confidential 17 SHARED-NOTHING SYSTEM:100% DATA AVAILABILITY ■  Every node in a cluster is identical, handles both transactions and long running tasks ■  Data is replicated synchronously with immediate consistency within the cluster ■  Data is replicated asynchronously across data centers OHIO Data Center
  • 18. © 2014 Aerospike. All rights reserved. Confidential 18 LESSONS 1.  Optimize key-value code paths ■  No hot spots (e.g., robust DHT) ■  Scales up easily (e.g., easy to size) ■  Avoids points of failure (e.g., single node type) ■  Binary protocol 2.  Code in C ■  Read() / Write() / Linux AIO ( don’t trust a library ) ■  Multithread ! ■  Direct device ■  C++ could work, leads to structural complexity 3.  Memory allocation matters ■  Stack based allocators ■  Own stack allocator ■  JEMalloc for pools
  • 19. © 2014 Aerospike. All rights reserved. Confidential 19 LESSONS (cont’d) 4.  Innovation: masters in a shared nothing system ■  Fast cluster organization ■  Fast transaction capabilities ■  Can be CP or AP - and resolve data accurately 5.  Clients are hard ■  Fast stable connection pools are hard ■  API design matters ■  Slow languages need Aerospike more 6.  Aggregations / queries required ■  Row oriented ■  Secondary indexes as filters, and MR style ■  Data transformation hurts
  • 20. © 2014 Aerospike. All rights reserved. Confidential 20 LESSONS (cont’d) 7.  Network interrupts are painful ■  TCP is still better than UDP ■  Some great hardware out there (solarflare) ■  Network queues, RRD ■  New PCI-e, other interfaces are not there yet ■  Larger clusters solve interface issues 8.  “Large data types” (better documents) ■  Ordered List, Map, Set, Stack ■  Time series and documents ■  Beyond per-row storage layout, more optimal than document 9.  UDFs for flexibility
  • 21. © 2014 Aerospike. All rights reserved. Confidential 21 WRITING RELIABILY WITH HIGH PERFORMANCE 1.  Write sent to row master 2.  Latch against simultaneous writes 3.  Apply write to master memory and replica memory synchronously 4.  Queue operations to disk 5.  Signal completed transaction (optional storage commit wait) 6.  Master applies conflict resolution policy (rollback/ rollforward) master replica 1.  Cluster discovers new node via gossip protocol 2.  Paxos vote determines new data organization 3.  Partition migrations scheduled 4.  When a partition migration starts, write journal starts on destination 5.  Partition moves atomically 6.  Journal is applied and source data deleted transactions continue Writing with Immediate Consistency Adding a Node
  • 22. © 2014 Aerospike. All rights reserved. Confidential 22 Key Value Store + Lists, Maps ■  Namespaces (policy containers) ■  Determine storage - DRAM or Flash ■  Determine replication factor ■  Contain records and sets ■  Sets (tables) of records ■  Arbitrary grouping ■  Records (rows) of key/bins ■  Block size (128k – 2MB) ■  Bin with same name can contain values of different types ■  String, integer, bytes (raw, blob, etc) ■  list ( an ordered collection of values ) ■  map ( a collection of keys and values ) ■  Bins can be added anytime ■  Meta data ■  Generation counter so apps can ensure that a record was not modified since last read ■  Time-to-live value for auto expiration, keeping most recent context or "hot" data, aging out historical context
  • 23. © 2014 Aerospike. All rights reserved. Confidential 23 KVS + Lists, Maps + Queries + UDFs STREAM AGGREGATIONS (INDEXED MAP-REDUCE) Pipe Query results through UDFs ■  Filter, Transform, Aggregate.. Map, Reduce ■  Enforce security ■  UDFs in Lua to ■ CRUD a record ■ Calculation based on data within a record ■ Iterate through a set / namespace of records ■  UDFs for real-time analytics and aggregations
  • 24. © 2014 Aerospike. All rights reserved. Confidential 24 SQL & NoSQL ➤ Secondary index §  Equality, Range, IN (,,,), Compound §  e.g. WHERE group_id = 1234, WHERE last_activity > 1349293398, WHERE branch_id IN (5,6,7,8) ➤ Filters §  SQL: Where clause with non-indexed “AND”s (e.g. “AND gender=‘M’ ”) §  NOSQL: Map step ➤ Aggregation §  SQL: GROUP BY, ORDER BY, LIMIT, OFFSET §  NOSQL: Reduce step Secondary Key Primary Key Record Filter Map Aggregate DRAM SSD Aggregate Client Client Server Reduce Aggregate Query
  • 25. © 2014 Aerospike. All rights reserved. Confidential 25 Flash storage The Power of Flash Storage
  • 26. © 2014 Aerospike. All rights reserved. Confidential 26 DATABASE OS FILE SYSTEM PAGE CACHE BLOCK INTERFACE SSD HDD BLOCK INTERFACE SSD SSD OPEN NVM SSD Ask me and I’ll tell you the answer.Ask me. I’ll look up the answer and then tell it to you. DATABASE HYBRID MEMORY SYSTEM™ •  Direct device access •  Large Block Writes •  Indexes in DRAM •  Highly Parallelized •  Log-structured FS “copy-on-write” •  Fast restart with shared memory FLASH OPTIMIZED HIGH PERFORMANCE
  • 27. © 2014 Aerospike. All rights reserved. Confidential 27© 2012 Aerospike. All rights reserved. Pg. 27 Measure your drives! Aerospike Certification Tool (ACT) http://github.com/aerospike/act Transactional database workload Reads: 1.5KB (can’t batch / cache reads, random) Writes: 128K blocks (log based layout) (plus defragmentation) Turn up the load until latency is over required SLA
  • 28. © 2014 Aerospike. All rights reserved. Confidential 28 Aerospike’s Flash Experience ■  Know your Flash ■ ACT benchmark http://github.com/aerospike/act ■ Read-write benchmark results back to 2011 ■  All clouds support flash now ■ New EC2 instances ■ Google Compute ■ Internap, Softlayer, GoGrid… ■  Write durability usually not a problem with modern flash ■ Durability is high (5 “drive writes per day” for 5 years, etc) ■ Read performance suffers under write load anyway
  • 29. © 2014 Aerospike. All rights reserved. Confidential 29 Aerospike’s Flash Experience ■  Densities increasing ■ 100G 2 years ago à 800G today ■ SATA vs PCI-E ■ Appliances: 50T per 1U this year ■  Prices still dropping: perhaps $1/G next year ■  Intel P3700 results ■ 250K per device @ $2.5 / G ■ Old standard: Micron P320h 500K @ $8 / G ■  “Wide SATA” ■ 20 SATA drives ■ LSI “pass through mode” ■ 250K+ per server
  • 30. © 2014 Aerospike. All rights reserved. Confidential 30 Use Open Source
  • 31. © 2014 Aerospike. All rights reserved. Confidential 31 Aerospike: the trusted In-Memory NoSQL Performance • Over ten trillion transactions per month • 99% of transactions < 2 ms • 150K TPS per server Scalability • Billions of Internet users • Clustered Software • Maintenance without downtime • Scale up & scale out Reliability • 50 customers; zero down-time • Immediate Consistency • Rapid Failover; Data Center Replication Price/Performance • Makes impossible projects affordable • Flash-optimized • 1/10 the servers required
  • 32. © 2014 Aerospike. All rights reserved. Confidential 32