01/16/2014

CASSANDRA AND RIAK AT BESTBUY.COM
WHO WE ARE
• Best Buy is the world’s largest multi-channel consumer electronics retailer, with stores in the United States, Canada, China and Mexico.
• We are the 10th largest online retailer in the United States.
• More than 1.6 billion visitors come to our stores and BestBuy.com each year.
• Our My Best Buy loyalty program is among the largest loyalty programs of its kind, with more than 40 million active members.
• We provide customers with outstanding choice, unbiased advice and unmatched support for their tech needs.
January 16, 2014
@Copyright 2014 – Best Buy, Inc. All rights reserved.
A UNIQUE CUSTOMER PROMISE
• THE LATEST DEVICES AND SERVICES, ALL IN ONE
PLACE

• IMPARTIAL & KNOWLEDGEABLE ADVICE
• COMPETITIVE PRICES
• THE ABILITY TO SHOP WHEN AND WHERE YOU
WANT
• SUPPORT FOR THE LIFE OF YOUR PRODUCTS

JOEL CRABB
Chief Architect, BestBuy.com

KANNAN SWAMINATHAN
Director, Web Architecture

NOSQL AT BESTBUY.COM
• Schema-less design
—Key–value systems
—Sparse column systems
—Non-relational data

• Distributed nature
—Eventual consistency
—Active-Active across clouds and datacenters
—High reliability
—Horizontal scaling
BESTBUY.COM

SCALABILITY GOALS
• Near-Infinite
• Bursts

• 7X traffic spikes

• Bursts > 50,000 rps
• #4 in eCommerce traffic during 2013 Holiday
[Chart: visible labels are Walmart and Best Buy]

FLEXIBILITY GOALS
Low cost of change
Fast concepts to site
Daily releases
Multiple versions

One day of work vs. two months
RELIABILITY GOALS
• 100% availability
• Zero defects

• Achieved 100% cloud uptime during Holiday

• ~ 2s response times

CLOUD ARCHITECTURE CONCEPTS
• Clouds fail; plan for it
—Multiple availability zones
—Multiple regions
—Multiple vendors

• Datacenter connections fail; plan for it
—Serve pages completely from cloud
—Browse-only fallback mode

CLOUD ARCHITECTURE CONCEPTS
[Diagram: a CDN / Global Traffic Manager in front of two Browse Clouds (Vendor 1 and Vendor 2) and the Best Buy Datacenter]
RIAK

PRODUCT CATALOG - REQUIREMENTS
• Business Requirements
—Easily add new attributes to products
—Use existing product content feeding systems
—Provide enhanced product data to all of Best Buy

• Technical Requirements
—Scale to BestBuy.com’s needs
—One-way replication from DC to cloud

—Provide a Product Catalog API usable by Best Buy
applications internally (DC) and externally (cloud)

BROWSE CLOUD - ARCHITECTURE

[Diagram labels: Persistent Cache; Load Balancer; Web App (x2); Service Aggregator; Product Data (x2); Datacenter: Legacy Services and Product Data]

WHY RIAK FOR THE PRODUCT CATALOG
• Key-value Store
—Easily add new attributes
—Different attribute sets for different product categories

• Hub and Spoke Replication
—Multiple cloud instances can be out of sync for seconds or minutes; this is OK
—Connection to Datacenter can be lost

• Resilience within all tiers
—Riak’s ring architecture allows single instances to fail with
little impact to the system
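
The key-value point can be illustrated with a small sketch. It uses a plain Python dict as a stand-in for a key-value bucket, not a Riak client; the helper names and the product attributes are hypothetical.

```python
import json

# Hypothetical bucket: a plain dict standing in for a key-value store.
catalog = {}


def put_product(sku, attributes):
    """Store the whole product as an opaque JSON value under its SKU key."""
    catalog[sku] = json.dumps(attributes)


def get_product(sku):
    """Fetch and decode the JSON value stored under the SKU key."""
    return json.loads(catalog[sku])


# Different product categories carry different attribute sets.
put_product("tv-001", {"name": "Example TV", "screen_inches": 55, "hdmi_ports": 3})
put_product("cam-001", {"name": "Example Camera", "megapixels": 24})

# Adding a new attribute later needs no schema migration.
tv = get_product("tv-001")
tv["energy_rating"] = "A"
put_product("tv-001", tv)
```

Because the store treats each value as opaque, the camera record is unaffected when the TV record gains an attribute.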
WHY RIAK FOR THE PRODUCT CATALOG
• Legacy Bridge
—Extract data from legacy system
—Expose to anyone who needs the data
—Defers the decision to retire the legacy system

• Input into Product Evolution
—Best Buy is stretching Riak in multiple areas
—Feedback on Search and Replication
—Direct connection to Riak engineers

PRODUCT CATALOG DEPLOYMENT
[Diagram: two Cloud Replicants (V1.2.1) and a Datacenter Master (V1.4.2)]
Each Cloud Replicant is a self-contained instance which can intermittently lose connectivity to the Datacenter.

PRODUCT CATALOG - DATA FLOW
[Diagram labels: Cloud: Web Application, Granular Service, Aggregator Service, NoSQL Product Data. Datacenter: Legacy Product Data, NoSQL Product Data. Flows: Extraction, Replication]
RIAK LESSONS LEARNED
• Search eats up the CPU (Pre-Yokozuna)

[Chart annotation: Search Endpoint Removed]

RIAK LESSONS LEARNED
• Multi-datacenter replication is hard at scale
• Object replication fails silently (v1.4.2)

[Chart annotations: 1.5 hours for partial update; 4000 objects/sec]
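
One way to catch replication that fails silently is to compare the master and a replica after each update. The sketch below is an assumption about how such a check could look, using plain dicts in place of the datacenter master and a cloud replicant; it is not the verification Best Buy ran, and `find_replication_gaps` is a hypothetical helper.

```python
import hashlib


def digest(value: bytes) -> str:
    """Content hash used to compare object versions without shipping full values."""
    return hashlib.sha256(value).hexdigest()


def find_replication_gaps(master: dict, replica: dict) -> dict:
    """Return keys that are missing or stale on the replica."""
    missing = sorted(k for k in master if k not in replica)
    stale = sorted(
        k for k in master
        if k in replica and digest(master[k]) != digest(replica[k])
    )
    return {"missing": missing, "stale": stale}
```

A non-empty result means some objects never arrived or arrived in an older version, which is the signal a silent failure otherwise hides.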

DID THE PRODUCT CATALOG SUCCEED?
• Scale – generally < 5 ms response times in cloud
• Scope – provides APIs in cloud and DC

[Chart annotations: Single zone failure, response times jump to 8-10 ms; 4 ms @ 95%; 1.5 hours for partial update; 4000 objects/sec]

CASSANDRA

CUSTOMER GRAPH - REQUIREMENTS
• Business Requirements
—Create a single identity provider
—Build an adaptive model to support a 360 degree view of the
customer, customer segmentation and multi-channel
personalization
—Extensible framework to support federated identity interoperability
with external providers

• Technical Requirements
—Scale to BestBuy.com’s needs with a distributed service oriented

architecture
—Support risk based authentication and secure service interfaces for
consumers
CUSTOMER GRAPH - ARCHITECTURE
[Diagram: CDN + GTM + WAF in front of Data Center 1 / Region 1 through Data Center n / Region n; each has a Load Balancer and API nodes; Cassandra Ring]
Topology: NTS
RF: 3
Partitioner: Random
CUSTOMER GRAPH - CONCEPTUAL

Create Connection:
POST /customers/Kannan/following/customers/Joel
Get all Kannan’s connections:
GET /customers/Kannan/connections
Get Joel’s “followers”:
GET /customers/Joel/followers
Get customers connecting to a specific store:
GET /stores/{store id}/connecting/customer
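
The customer endpoints above can be read as operations on a directed graph. The following in-memory sketch is only an illustration: the class `CustomerGraph` is hypothetical, and it assumes that "connections" means the customers someone follows, which the slide does not spell out.

```python
from collections import defaultdict


class CustomerGraph:
    """Directed edges between customers, indexed in both directions."""

    def __init__(self):
        self._following = defaultdict(set)  # source -> targets
        self._followers = defaultdict(set)  # target -> sources

    def follow(self, source, target):
        """POST /customers/{source}/following/customers/{target}"""
        self._following[source].add(target)
        self._followers[target].add(source)

    def connections(self, customer):
        """GET /customers/{customer}/connections (assumed: outgoing edges)"""
        return sorted(self._following[customer])

    def followers(self, customer):
        """GET /customers/{customer}/followers"""
        return sorted(self._followers[customer])
```

Writing each edge into both indexes makes the reverse lookup ("followers") a direct read instead of a scan, the same trade-off the data modeling slides describe for Cassandra.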
CUSTOMER GRAPH – CONCEPTUAL

[Diagram: graph nodes include Topics, Products and Blogs]

WHY CASSANDRA?
• No single point of failure

• Delivers on Atomicity, Isolation & Durability
• Eventual Consistency
• Tunable Consistency for Reads vs. Writes
• Linear Scalability
• Querying a column slice or a range of row keys

• Data can have expiration set
• Reliable multiple data center replication
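
Tunable consistency comes down to simple arithmetic: a read is guaranteed to overlap the latest acknowledged write when the number of replicas read plus the number written exceeds the replication factor. The sketch below works this through for RF 3, the value on the architecture slide; the function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must intersect every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


RF = 3  # replication factor shown on the architecture slide

quorum_both = reads_see_latest_write(quorum(RF), quorum(RF), RF)  # 2 + 2 > 3
one_and_one = reads_see_latest_write(1, 1, RF)                    # 1 + 1 <= 3
write_all_read_one = reads_see_latest_write(1, RF, RF)            # 1 + 3 > 3
```

With RF 3, quorum reads and quorum writes overlap, while reading and writing a single replica trades that guarantee for lower latency and falls back on eventual consistency.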
LESSONS LEARNED – DATA MODELING
Use composite column names; static composite types

Column names are stored physically sorted and
indexed
Store “values” as column names

Wide rows in conjunction with composite columns can
be used to build indices, but…
For larger data sets, distribute the columns among rows
A secondary index is best modeled as a separate column
family
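
A rough sketch of these modeling points, simulated in plain Python rather than against Cassandra. `WideRow` keeps composite column names sorted so a prefix can be read as one contiguous slice, and `bucketed_row_key` spreads a large index over several rows. Both names, and the bucket count of 16, are assumptions made for illustration.

```python
import bisect
import zlib


class WideRow:
    """One row whose composite column names (tuples) are kept sorted."""

    def __init__(self):
        self._names = []

    def insert(self, *components):
        """Store a value as the column name itself; duplicates are ignored."""
        name = tuple(components)
        i = bisect.bisect_left(self._names, name)
        if i == len(self._names) or self._names[i] != name:
            self._names.insert(i, name)

    def slice(self, prefix):
        """Return every column whose first component equals prefix."""
        lo = bisect.bisect_left(self._names, (prefix,))
        hi = bisect.bisect_left(self._names, (prefix + "\x00",))
        return self._names[lo:hi]


def bucketed_row_key(index_name, value, buckets=16):
    """Pick one of several rows for an index entry so no single row grows unbounded."""
    bucket = zlib.crc32(value.encode("utf-8")) % buckets
    return f"{index_name}:{bucket}"


row = WideRow()
row.insert("follows", "Kannan")
row.insert("follows", "Joel")
row.insert("store", "1234")
follows = row.slice("follows")
```

Because names are sorted on insert, `follows` comes back ordered by the second component without any separate sort step. Bucketing trades that single ordered slice for one slice per bucket.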
LESSONS LEARNED – DATA MODELING
Do not store an entity as a single column blob
Cannot index and query on entity attributes
Updates to an entity attribute would require a read and
then a write

Mutate just the required columns on an entity row
Do not read and then write
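
The difference can be shown with two writers that start from the same snapshot. This is a plain Python simulation with hypothetical helper names and example data, not Cassandra client code.

```python
import json


def blob_update(store, key, snapshot, field, value):
    """Write back a whole entity built from an earlier read (read-then-write)."""
    entity = dict(snapshot)
    entity[field] = value
    store[key] = json.dumps(entity)


def column_update(store, key, field, value):
    """Mutate only the named column; no prior read is needed."""
    store.setdefault(key, {})[field] = value


# Entity stored as a single blob: both writers read the same version first.
blob_store = {"c1": json.dumps({"email": "old@example.com", "phone": "111"})}
stale = json.loads(blob_store["c1"])
blob_update(blob_store, "c1", stale, "email", "new@example.com")
blob_update(blob_store, "c1", stale, "phone", "222")  # overwrites the email change
blob_result = json.loads(blob_store["c1"])

# Entity stored as one column per attribute: each writer touches its own column.
column_store = {"c1": {"email": "old@example.com", "phone": "111"}}
column_update(column_store, "c1", "email", "new@example.com")
column_update(column_store, "c1", "phone", "222")
column_result = column_store["c1"]
```

In the blob case the second write silently restores the old email; with per-column mutation both changes survive and neither writer had to read first.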

LESSONS LEARNED - MONITORING
Heap Size and Use
—GC

Mutation Stage (Writes) & Read Stage (Reads)
—Active and Pending

AE Service, Stream & Message-Streaming-Pool Stages
—Especially during scheduled repair or rebuild (while adding nodes to a new data center)
—Streaming requires keep-alive connections (watch out for firewalls terminating idle established connections; update the periodic TCP keep-alive ping rate)

Compaction Stage & Compaction Count
—IO Wait, Limits - nofiles, nproc
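
To illustrate the keep-alive point, the sketch below enables TCP keep-alive on a socket and shortens the probe timing so that an idle connection keeps sending traffic before a firewall drops it. The helper name and the timing values are assumptions. Cassandra nodes would normally be tuned at the operating system level (for example the Linux `net.ipv4.tcp_keepalive_time` sysctl) rather than per socket, so this only shows the mechanism.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keep-alive and, where the platform allows, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform specific, so each one is
    # applied only if this Python build exposes it.
    tuning = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the connection drops
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

The idle time should be set below the firewall's idle timeout, otherwise the first probe arrives after the connection has already been dropped.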

IMPLEMENTATION STRATEGY
• Start with a project team
—Riak – Product Catalog
—Cassandra – Customer Graph

• Build expertise in development and operations
• As usage grows, create a Platform team
—Combine engineering into one team
—Project team can then focus on business features
—Focus on automation

CONCLUSIONS
• Riak and Cassandra have both been successful
—Riak – No complete outages in two years
—Cassandra – Flawless in its first Holiday in 2013

• Differences
—Replication patterns are different
—Write and read treatment is different
—Deployment patterns are different

• High Availability
—Both have highly resilient architectures
—Both scale linearly
January 16, 2014
@Copyright 2014 – Best Buy, Inc. All rights reserved.
January 16, 2014
@Copyright 2014 – Best Buy, Inc. All rights reserved.

Contenu connexe

Tendances

Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
DataWorks Summit
 
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseUsing Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
DataWorks Summit
 

Tendances (20)

Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
What is in a Lucene index?
What is in a Lucene index?What is in a Lucene index?
What is in a Lucene index?
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Kafka Retry and DLQ
Kafka Retry and DLQKafka Retry and DLQ
Kafka Retry and DLQ
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 
Big data real time architectures
Big data real time architecturesBig data real time architectures
Big data real time architectures
 
Apache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the CloudApache Spark on K8S Best Practice and Performance in the Cloud
Apache Spark on K8S Best Practice and Performance in the Cloud
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache iceberg
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
 
A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
Flink powered stream processing platform at Pinterest
Flink powered stream processing platform at PinterestFlink powered stream processing platform at Pinterest
Flink powered stream processing platform at Pinterest
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the EnterpriseUsing Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
Using Spark Streaming and NiFi for the Next Generation of ETL in the Enterprise
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
 

En vedette

Be an agile architect
Be an agile architectBe an agile architect
Be an agile architect
joelcrabb
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
DataStax
 
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
DataStax
 
Evaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingEvaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for Benchmarking
Sergey Bushik
 

En vedette (20)

How does Riak compare to Cassandra? [Cassandra London User Group July 2011]
How does Riak compare to Cassandra? [Cassandra London User Group July 2011]How does Riak compare to Cassandra? [Cassandra London User Group July 2011]
How does Riak compare to Cassandra? [Cassandra London User Group July 2011]
 
The BestBuy.com Cloud Architecture
The BestBuy.com Cloud ArchitectureThe BestBuy.com Cloud Architecture
The BestBuy.com Cloud Architecture
 
Cassandra in e-commerce
Cassandra in e-commerceCassandra in e-commerce
Cassandra in e-commerce
 
Twitter + Lambda Architecture (Spark, Kafka, FLume, Cassandra) + Machine Lear...
Twitter + Lambda Architecture (Spark, Kafka, FLume, Cassandra) + Machine Lear...Twitter + Lambda Architecture (Spark, Kafka, FLume, Cassandra) + Machine Lear...
Twitter + Lambda Architecture (Spark, Kafka, FLume, Cassandra) + Machine Lear...
 
Be an agile architect
Be an agile architectBe an agile architect
Be an agile architect
 
A cloud computing primer for non-technical executives
A cloud computing primer for non-technical executivesA cloud computing primer for non-technical executives
A cloud computing primer for non-technical executives
 
The Upstream Game, 2hr version
The Upstream Game, 2hr versionThe Upstream Game, 2hr version
The Upstream Game, 2hr version
 
Taller de Text Mining en Twitter con R
Taller de Text Mining en Twitter con RTaller de Text Mining en Twitter con R
Taller de Text Mining en Twitter con R
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
 
2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai2016 August POWER Up Your Insights - IBM System Summit Mumbai
2016 August POWER Up Your Insights - IBM System Summit Mumbai
 
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...
2016 Sept 1st - IBM Consultants & System Integrators Interchange - Big Data -...
 
Target: Escaping Disco-Era Data Modeling
Target: Escaping Disco-Era Data ModelingTarget: Escaping Disco-Era Data Modeling
Target: Escaping Disco-Era Data Modeling
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Pattern of Innovation
Pattern of InnovationPattern of Innovation
Pattern of Innovation
 
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
Building a Distributed Reservation System with Cassandra (Andrew Baker & Jeff...
 
An eCommerce Cloud Implementation Primer
An eCommerce Cloud Implementation PrimerAn eCommerce Cloud Implementation Primer
An eCommerce Cloud Implementation Primer
 
Best Buy Web 2.0
Best Buy Web 2.0Best Buy Web 2.0
Best Buy Web 2.0
 
Evaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for BenchmarkingEvaluating NoSQL Performance: Time for Benchmarking
Evaluating NoSQL Performance: Time for Benchmarking
 
IoT NY - Google Cloud Services for IoT
IoT NY - Google Cloud Services for IoTIoT NY - Google Cloud Services for IoT
IoT NY - Google Cloud Services for IoT
 
Day 2 General Session Presentations RedisConf
Day 2 General Session Presentations RedisConfDay 2 General Session Presentations RedisConf
Day 2 General Session Presentations RedisConf
 

Similaire à Cassandra and Riak at BestBuy.com

Riverbed SteelCentral AppResponse 9.0 NetProfiler and NetShark 10.6
Riverbed SteelCentral AppResponse 9.0 NetProfiler and NetShark 10.6Riverbed SteelCentral AppResponse 9.0 NetProfiler and NetShark 10.6
Riverbed SteelCentral AppResponse 9.0 NetProfiler and NetShark 10.6
Riverbed Technology
 
Riverbed SteelCentral Simplifies Web App Monitoring
Riverbed SteelCentral Simplifies Web App MonitoringRiverbed SteelCentral Simplifies Web App Monitoring
Riverbed SteelCentral Simplifies Web App Monitoring
Lori Bertelli
 
Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic
IntelAPAC
 

Similaire à Cassandra and Riak at BestBuy.com (20)

ASTQB washington-sept-2015
ASTQB washington-sept-2015ASTQB washington-sept-2015
ASTQB washington-sept-2015
 
Big Data LDN 2016: When Big Data Meets Fast Data
Big Data LDN 2016: When Big Data Meets Fast DataBig Data LDN 2016: When Big Data Meets Fast Data
Big Data LDN 2016: When Big Data Meets Fast Data
 
Presentación Paco Bermejo - La Noche del Sector Financiero
Presentación Paco Bermejo - La Noche del Sector FinancieroPresentación Paco Bermejo - La Noche del Sector Financiero
Presentación Paco Bermejo - La Noche del Sector Financiero
 
Does Big Data Spell Big Costs- Impetus Webinar
Does Big Data Spell Big Costs- Impetus WebinarDoes Big Data Spell Big Costs- Impetus Webinar
Does Big Data Spell Big Costs- Impetus Webinar
 
Riverbed SteelCentral AppResponse 9.0 NetProfiler and NetShark 10.6
Riverbed SteelCentral AppResponse 9.0 NetProfiler and NetShark 10.6Riverbed SteelCentral AppResponse 9.0 NetProfiler and NetShark 10.6
Riverbed SteelCentral AppResponse 9.0 NetProfiler and NetShark 10.6
 
Riverbed SteelCentral Simplifies Web App Monitoring
Riverbed SteelCentral Simplifies Web App MonitoringRiverbed SteelCentral Simplifies Web App Monitoring
Riverbed SteelCentral Simplifies Web App Monitoring
 
Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic Gab Genai Cloudera - Going Beyond Traditional Analytic
Gab Genai Cloudera - Going Beyond Traditional Analytic
 
Application patterns
Application patternsApplication patterns
Application patterns
 
Enterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble StorageEnterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble Storage
 
RTBkit Introduction & Best Practices
RTBkit Introduction & Best PracticesRTBkit Introduction & Best Practices
RTBkit Introduction & Best Practices
 
Primavera Analytics 16.1 is Released - Everything You Need To Know
Primavera Analytics 16.1 is Released - Everything You Need To KnowPrimavera Analytics 16.1 is Released - Everything You Need To Know
Primavera Analytics 16.1 is Released - Everything You Need To Know
 
What's New In Primavera Analytics 16.1
What's New In Primavera Analytics 16.1What's New In Primavera Analytics 16.1
What's New In Primavera Analytics 16.1
 
Out With the Old, in With the Open-source: Brainshark's Complete CMS Migration
Out With the Old, in With the Open-source: Brainshark's Complete CMS MigrationOut With the Old, in With the Open-source: Brainshark's Complete CMS Migration
Out With the Old, in With the Open-source: Brainshark's Complete CMS Migration
 
How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?
 
Level Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop AccelerationLevel Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop Acceleration
 
Intel Cloud Foundry and OpenStack
Intel Cloud Foundry and OpenStackIntel Cloud Foundry and OpenStack
Intel Cloud Foundry and OpenStack
 
Cloud Services Powered by IBM SoftLayer and NetflixOSS
Cloud Services Powered by IBM SoftLayer and NetflixOSSCloud Services Powered by IBM SoftLayer and NetflixOSS
Cloud Services Powered by IBM SoftLayer and NetflixOSS
 
Recipes for API Ninjas
Recipes for API NinjasRecipes for API Ninjas
Recipes for API Ninjas
 
Top10 list planningpostgresdeployment.2014
Top10 list planningpostgresdeployment.2014Top10 list planningpostgresdeployment.2014
Top10 list planningpostgresdeployment.2014
 
APNIC services and Policy Development Process | IDNOG 5
APNIC services and Policy Development Process | IDNOG 5APNIC services and Policy Development Process | IDNOG 5
APNIC services and Policy Development Process | IDNOG 5
 

Dernier

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 

Dernier (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 

Cassandra and Riak at BestBuy.com

  • 2. WHO WE ARE • Best Buy is the world’s largest multi-channel consumer electronics retailer, with stores in the United States, Canada, China and Mexico. • We are the 10th largest online retailer in the United States • More than 1.6 billion visitors come to our stores and BestBuy.com each year. • Our My Best Buy loyalty program is among the largest loyalty programs of its kind, with more than 40 million active members. • We provide customers with outstanding choice, unbiased advice and unmatched support for the tech needs. January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 3. A UNIQUE CUSTOMER PROMISE • THE LATEST DEVICES AND SERVICES, ALL IN ONE PLACE • IMPARTIAL & KNOWLEDGEABLE ADVICE • COMPETITIVE PRICES • THE ABILITY TO SHOP WHEN AND WHERE YOU WANT • SUPPORT FOR THE LIFE OF YOUR PRODUCTS January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 4. JOEL CRABB KANNAN SWAMINATHAN Chief Architect, Director, Web BestBuy.com Architecture January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 5. NOSQL AT BESTBUY.COM • Schema-less design —Key–value systems —Sparse column systems —Non-relational data • Distributed nature —Eventual consistency —Active-Active across clouds and datacenters —High reliability —Horizontal scaling January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 6. BESTBUY.COM January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 7. SCALABILITY GOALS • Near-Infinite • Bursts • 7X traffic spikes • Bursts > 50,000 rps • #4 in eCommerce traffic during 2013 Holiday Walmart Best Buy January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 8. FLEXIBILITY GOALS Low cost of change Fast concepts to site Daily releases Multiple versions One day of work vs. two months
  • 9. RELIABILITY GOALS • 100% availability • Zero defects • Achieved 100% cloud uptime during Holiday • ~ 2s response times January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 10. CLOUD ARCHITECTURE CONCEPTS • Clouds fail; plan for it —Multiple availability zones —Multiple regions —Multiple vendors • Datacenter connections fail; plan for it —Serve pages completely from cloud —Browse-only fallback mode January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 11. CLOUD ARCHITECTURE CONCEPTS CDN: Global Traffic Manager Browse Cloud Browse Cloud Vendor 1 Vendor 2 Best Buy Datacenter January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 12. RIAK January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 13. PRODUCT CATALOG - REQUIREMENTS • Business Requirements —Easily add new attributes to products —Use existing product content feeding systems —Provide enhanced product data to all of Best Buy • Technical Requirements —Scale to BestBuy.com’s needs —One-way replication from DC to cloud —Provide a Product Catalog API usable by Best Buy applications internally (DC) and externally (cloud) January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 14. BROWSE CLOUD - ARCHITECTURE Persistent Cache Load Balancer Web App Web App Service Aggregator Product Data Product Data Datacenter Legacy Services and Product Data January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 15. WHY RIAK FOR THE PRODUCT CATALOG • Key-value Store —Easily add new attributes —Different attribute sets for different product categories • Hub and Spoke Replication —Multiple cloud instances can be out-of-synch for seconds or minutes; this is OK —Connection to Datacenter can be lost • Resilience within all tiers —Riak’s ring architecture allows single instances to fail with little impact to the system January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 16. WHY RIAK FOR THE PRODUCT CATALOG • Legacy Bridge —Extract data from legacy system —Expose to anyone who needs the data —Extends decision to retire legacy system • Input into Product Evolution —Best Buy is stretching Riak in multiple areas —Feedback on Search and Replication —Direct connection to Riak engineers
  • 17. PRODUCT CATALOG DEPLOYMENT Cloud Replicant Cloud Replicant V1.2.1 Each Cloud Replicant is a self-contained instance which can intermittently lose connectivity to the Datacenter Datacenter Master V1.4.2
  • 18. PRODUCT CATALOG - DATA FLOW Cloud Web Application Granular Service Aggregator Service NoSQL Product Data Replication Extraction Datacenter Legacy Product Data NoSQL Product Data
  • 19. RIAK LESSONS LEARNED • Search eats up the CPU (Pre-Yokozuna) Search Endpoint Removed
  • 20. RIAK LESSONS LEARNED • Multi-datacenter replication is hard at scale • Object replication fails silently (v1.4.2) 1.5 hours for partial update 4000 objects/sec
  • 21. DID THE PRODUCT CATALOG SUCCEED? • Scale – generally < 5 ms response times in cloud • Scope – provides APIs in cloud and DC • Single zone failure, response times jump to 8-10 ms • 4 ms @ 95% • 1.5 hours for partial update • 4000 objects/sec
  • 22. CASSANDRA
  • 23. CUSTOMER GRAPH - REQUIREMENTS • Business Requirements —Create a single identity provider —Build an adaptive model to support a 360 degree view of the customer, customer segmentation and multi-channel personalization —Extensible framework to support federated identity interoperability with external providers • Technical Requirements —Scale to BestBuy.com’s needs with a distributed service oriented architecture —Support risk based authentication and secure service interfaces for consumers
  • 24. CUSTOMER GRAPH - ARCHITECTURE CDN + GTM + WAF Load Balancer API Load Balancer API Topology: NTS RF: 3 Partitioner: Random Data Center 1 / Region 1 API API Cassandra Ring Data Center n / Region n
  • 25. CUSTOMER GRAPH - CONCEPTUAL Create Connection: POST /customers/Kannan/following/customers/Joel Get all Kannan’s connections: GET /customers/Kannan/connections Get Joel’s “followers”: GET /customers/Joel/followers Get customers connecting to a specific store: GET /stores/{store id}/connecting/customer
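The four routes on slide 25 suggest a graph that is queried in both directions. Below is a minimal in-memory sketch of that idea, assuming each connection is written twice (once per direction) so both lookups are direct reads. The `CustomerGraph` class and its method names are hypothetical; the deck does not describe the actual service implementation.

```python
from collections import defaultdict


class CustomerGraph:
    """Toy model of the connection graph; not the production service."""

    def __init__(self):
        # customer -> set of (target type, target id) the customer follows
        self._following = defaultdict(set)
        # (target type, target id) -> set of customers following it
        self._followers = defaultdict(set)

    def follow(self, customer, target_type, target_id):
        # Mirrors POST /customers/{customer}/following/{type}/{id}.
        self._following[customer].add((target_type, target_id))
        self._followers[(target_type, target_id)].add(customer)

    def connections(self, customer):
        # Mirrors GET /customers/{customer}/connections.
        return sorted(self._following.get(customer, set()))

    def followers(self, target_type, target_id):
        # Mirrors GET /customers/{id}/followers and the store variant.
        return sorted(self._followers.get((target_type, target_id), set()))


graph = CustomerGraph()
graph.follow("Kannan", "customers", "Joel")
graph.follow("Kannan", "stores", "0042")
print(graph.connections("Kannan"))
print(graph.followers("customers", "Joel"))
```

Writing the edge in both directions costs an extra write per connection, but it avoids scanning every customer to answer a "followers" query.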
  • 26. CUSTOMER GRAPH – CONCEPTUAL Topics Products Blogs
  • 27. WHY CASSANDRA? • No single point of failure • Delivers on Atomicity, Isolation & Durability • Eventual Consistency • Tunable Consistency for Reads vs. Writes • Linear Scalability • Querying a column slice or a range of row keys • Data can have expiration set • Reliable multiple data center replication
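The "tunable consistency" bullet comes down to simple arithmetic: a read is guaranteed to overlap the most recent acknowledged write when the number of replicas consulted on write plus the number consulted on read exceeds the replication factor. The sketch below applies that rule to the RF of 3 shown on slide 24. The function names are illustrative only, and the deck does not say which consistency levels the Customer Graph actually uses.

```python
def quorum(rf: int) -> int:
    """Smallest number of replicas that forms a majority of rf."""
    return rf // 2 + 1


def reads_see_latest_write(rf: int, write_acks: int, read_acks: int) -> bool:
    """True when every read replica set must overlap every write replica set."""
    return write_acks + read_acks > rf


RF = 3  # replication factor from the architecture slide
LEVELS = {"ONE": 1, "QUORUM": quorum(RF), "ALL": RF}


def overlap_table(rf, levels):
    """Map (write level, read level) to whether the two sets must overlap."""
    return {
        (w, r): reads_see_latest_write(rf, levels[w], levels[r])
        for w in levels
        for r in levels
    }


for (w, r), ok in overlap_table(RF, LEVELS).items():
    print(f"write={w:6} read={r:6} overlap guaranteed: {ok}")
```

Combinations that do not overlap, such as writing and reading at ONE, trade that guarantee for lower latency and rely on eventual consistency.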
  • 28. LESSONS LEARNED – DATA MODELING Use composite column names; static composite types Column names are stored physically sorted and indexed Store “values” as column names Wide rows in conjunction with composite columns can be used to build indices, but… For larger data sets, distribute the columns among rows A secondary index is best modeled as a separate column family January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 29. LESSONS LEARNED – DATA MODELING Do not store an entity as a single column blob Cannot index and query on entity attributes Updates to an entity attribute would require a read and then a write Mutate just the required columns on an entity row Do not read and then write January 16, 2014 @Copyright 2014 – Best Buy, Inc. All rights reserved.
  • 30. LESSONS LEARNED - MONITORING • Heap Size and Use —GC • Mutation Stage (Writes) & Read Stage (Reads) —Active and Pending • AE Service, Stream & Message-Streaming-Pool Stages —Especially during scheduled repair or rebuild (while adding nodes to a new data center) —Streaming requires keep-alive connections (watch out for firewalls terminating idle established connections; update periodic TCP keep-alive ping rate) • Compaction Stage & Compaction Count —IO Wait, Limits - nofiles, nproc
  • 31. IMPLEMENTATION STRATEGY • Start with a project team —Riak – Product Catalog —Cassandra – Customer Graph • Build expertise in development and operations • As usage grows, create a Platform team —Combine engineering into one team —Project team can then focus on business features —Focus on automation
  • 32. CONCLUSIONS • Riak and Cassandra have both been successful —Riak – No complete outages in two years —Cassandra – Flawless in its first Holiday in 2013 • Differences —Replication patterns are different —Write and read treatment is different —Deployment patterns are different • High Availability —Both have highly resilient architectures —Both scale linearly