Azure + DSE Powers O365 Per-User Store
1 Introduction
2 What We Built
3 How We Built It Using Cassandra + Spark
4 Why We Built It On Azure + DSE
5 How Can You Build It
6 Wrap Up
© 2015. All Rights Reserved.
Introduction
Sean Usher
Office 365
Email: seusher@microsoft.com
Twitter: @seanushermsft
Silvano Coriani
Azure
Email: scoriani@microsoft.com
Twitter: @scoriani
Introduction – Office 365
Email
Collaboration
Document Authoring
Social Networking
Calendaring
File Storage
Business Intelligence
Etc…
Introduction – Azure
Azure is Microsoft’s cloud computing platform, a growing collection of
integrated services—analytics, computing, database, mobile,
networking, storage, and web—for moving faster, achieving more, and
saving money.
What We Built
A way to understand our users and organizations at a deeper level!
Questions:
• Are users happy with the service they are receiving?
• Are users fully utilizing the services they are paying us for?
• Are users hitting issues that we can proactively help them with?
• How has a user’s experience been over their lifetime?
• Can we discover insights that we aren’t even aware of?
This requires ingesting and storing a lot of data. We need to be able
to perform fast, scalable analytics on that data, or we will discover
issues too late!
What We Built (contd)
Platform Requirements:
• Running in the cloud
• Highly scalable
• Initial ingestion of 50k events/sec, growing rapidly
• Millisecond latency for reads/writes
• High Availability
• Tunable Consistency
• Real-time and batch analytics
• Machine learning
• Storing aggregated data for 1+ years
One month to get it built...
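To put the ingestion requirement in perspective, here is a rough back-of-the-envelope calculation. It assumes the 50k events/sec rate is sustained around the clock, which the slides do not state, so treat the totals as an upper-bound illustration rather than measured volume.

```python
# Rough ingest volume for the stated starting rate of 50k events/sec.
# Assumption: the rate is constant 24x7 (not stated in the slides).
EVENTS_PER_SEC = 50_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def events_per_day(rate_per_sec: int) -> int:
    """Events ingested in one day at a constant per-second rate."""
    return rate_per_sec * SECONDS_PER_DAY


def events_per_year(rate_per_sec: int, days: int = 365) -> int:
    """Events ingested in a year at a constant per-second rate."""
    return events_per_day(rate_per_sec) * days


if __name__ == "__main__":
    print(f"{events_per_day(EVENTS_PER_SEC):,} events/day")
    print(f"{events_per_year(EVENTS_PER_SEC):,} events/year")
```

At this scale raw events reach billions per day, which is consistent with the requirement to keep aggregated data, rather than raw data, for 1+ years.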
How We Built It Using Cassandra + Spark (contd)
Topology:
• 1 Physical Datacenter (eventually will be geo-replicated)
• 2 Logical Datacenters (Cassandra, Analytics [Spark])
• All machines are in a virtual network (VNet) and are assigned internal static IPs.
• No inbound access to Cassandra machines from outside the VNet.
[Diagram: one physical datacenter containing a VNet with two logical DCs, Cassandra (C*) and Spark (S)]
Configuration (Azure G4 machines):
• Ubuntu
• 16 cores
• 224 GB RAM
• 3TB local SSD
Snitch: GossipingPropertyFileSnitch
Replication: All keyspaces use NetworkTopologyStrategy with a replication factor of 3 in each DC.
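As a sketch of what this replication setting looks like in CQL, the helper below builds a CREATE KEYSPACE statement. The keyspace name `peruserstore` and the DC names `Cassandra` and `Analytics` are assumptions for illustration; real DC names must match what the snitch reports for each node. The quorum helper shows why a replication factor of 3 tolerates one replica being down per DC.

```python
from typing import Dict


def create_keyspace_cql(keyspace: str, rf_by_dc: Dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in rf_by_dc.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', {options}}};"
    )


def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM / LOCAL_QUORUM request."""
    return replication_factor // 2 + 1


if __name__ == "__main__":
    # Hypothetical keyspace and DC names; RF=3 per DC is from the slide.
    print(create_keyspace_cql("peruserstore", {"Cassandra": 3, "Analytics": 3}))
    print("LOCAL_QUORUM needs", quorum(3), "of 3 replicas")
```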
How We Built It Using Cassandra + Spark (contd)
Node Type    Node Count    Heap Size    GC Type
Cassandra    12            12 GB        G1
Spark        12            20 GB        G1
How We Built It Using Cassandra + Spark (contd)
Ingestion:
O365 → REST API → Queue (Event Hub) → Ingestion Worker (Azure worker role using DataStax C# driver) → Cassandra
• All data is ingested into the Cassandra DC
• All read APIs read from the Cassandra DC
• All ingested data is PII scrubbed
• APIs are the only way to get data in or out of Cassandra.
How We Built It Using Cassandra + Spark (contd)
What Data Do We Ingest?
CREATE TABLE userdatasetraw (
    userid text,
    timepk timestamp,
    device text,
    createdtime timestamp,
    errorcode text,
    errordetail text,
    omstid text,
    useragent text,
    PRIMARY KEY ((userid, timepk, device), createdtime)
) WITH CLUSTERING ORDER BY (createdtime DESC) AND
    COMPACTION = {'base_time_seconds':'50', 'class':'DateTieredCompactionStrategy', 'max_sstable_age_days':'0.25' ….
Sample row:
    userid       102033ffa4a7079e7c
    timepk       1441152000000
    device       8000000A
    createdtime  1441215133000
    errorcode    Failure
    errordetail  InvalidOperation
    omstid       15321c64d-0A92-4f4e7-bcc8-2aeb89354ff26
    useragent    null
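In the sample row, timepk (1441152000000) is exactly midnight UTC of the day containing createdtime (1441215133000). This suggests, though the slides do not say so, that timepk is a day bucket used to bound partition size. A minimal sketch of deriving such a partition key under that assumption; the function names are hypothetical:

```python
from typing import Tuple

MS_PER_DAY = 24 * 60 * 60 * 1000


def day_bucket_ms(created_ms: int) -> int:
    """Floor an epoch-millisecond timestamp to midnight UTC of the same day."""
    return created_ms - (created_ms % MS_PER_DAY)


def partition_key(userid: str, created_ms: int, device: str) -> Tuple[str, int, str]:
    """Compose the (userid, timepk, device) partition key from the table definition.

    Assumption: timepk is the day bucket of createdtime.
    """
    return (userid, day_bucket_ms(created_ms), device)


if __name__ == "__main__":
    # Values taken from the sample row.
    print(partition_key("102033ffa4a7079e7c", 1441215133000, "8000000A"))
```

With a bucket like this, a single user and device never accumulate more than one day of events in one partition.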
How We Built It Using Cassandra + Spark (contd)
What Data Do We Ingest?
CREATE TABLE tenantsubscription (
    omstid text,
    skuid text,
    city text static,
    comculture text static,
    exchgtid text static,
    exousercnt int static,
    geoareacd text static,
    isconcierge boolean static,
    istrial boolean,
    liccount int,
    lyousercnt int static,
    numtrailsubs int,
    orgname text static,
    sku text,
    spousercnt int static,
    statechangedt timestamp,
    subautorenew boolean,
    subcreated timestamp,
    subexpiry timestamp,
    substate text,
    tdssynctime timestamp,
    usedliccount int,
    PRIMARY KEY ((omstid), skuid)
)
Sample row (selected columns):
    omstid        129001cd1-21dc-4706-96cfe-1f632522d365ae
    skuid         6fe22a85e-b296-42f0-b187-1b91e9394b900
    city          BANGKOK
    exousrcnt     1000
    liccnt        1500
    subautorenew  false
    sku           OFFICE 365 ENTERPRISE E3
How We Built It Using Cassandra + Spark (contd)
Aggregations
Spark batch jobs roll up UserDatasetRaw into 1-hour and 24-hour aggregates.
Other jobs use the 1-hour and 24-hour aggregates and join them with the tenant
subscription and dataset tables to calculate insights.
Result:
1. Great feedback from customers and support agents!
2. Saved customers money!
3. Saved customers from being locked out of their service!
4. Proactively fixed user experience (detected customer misconfiguration)!
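The roll-up idea can be illustrated without Spark. The sketch below is plain Python, not the team's actual Spark job: it counts events per (userid, bucket, errorcode) at 1-hour granularity and then re-aggregates those counts into 24-hour buckets. The choice of count as the aggregate and the function names are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

MS_PER_HOUR = 60 * 60 * 1000
MS_PER_DAY = 24 * MS_PER_HOUR

Event = Tuple[str, int, str]  # (userid, createdtime in epoch ms, errorcode)
Key = Tuple[str, int, str]    # (userid, bucket start in epoch ms, errorcode)


def rollup(events: Iterable[Event], bucket_ms: int) -> Dict[Key, int]:
    """Count raw events per (userid, time bucket, errorcode)."""
    counts: Counter = Counter()
    for userid, created_ms, errorcode in events:
        bucket_start = created_ms - (created_ms % bucket_ms)
        counts[(userid, bucket_start, errorcode)] += 1
    return dict(counts)


def coarsen(buckets: Dict[Key, int], bucket_ms: int) -> Dict[Key, int]:
    """Re-aggregate existing bucket counts into coarser buckets."""
    counts: Counter = Counter()
    for (userid, start_ms, errorcode), n in buckets.items():
        coarse_start = start_ms - (start_ms % bucket_ms)
        counts[(userid, coarse_start, errorcode)] += n
    return dict(counts)


if __name__ == "__main__":
    raw = [
        ("u1", 1441215133000, "Failure"),
        ("u1", 1441215134000, "Failure"),
        ("u1", 1441218800000, "Failure"),
    ]
    hourly = rollup(raw, MS_PER_HOUR)
    daily = coarsen(hourly, MS_PER_DAY)
    print(hourly)
    print(daily)
```

Building the daily aggregate from the hourly one, rather than from raw events, keeps the second pass small; whether the original jobs did this is not stated in the slides.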
How We Built It Using Cassandra + Spark (contd)
What Problems Did We Run Into?
• Bad Data Modelling:
  – Partitions getting too large (1-2 GB), which raised the “compacting large row” warning and led to OOM errors (Cassandra 2.0).
  – Not using DateTieredCompactionStrategy for time-series data.
• Bad Configuration:
  – OS limits configured too low (ulimit, nofiles, etc.)
  – Number of concurrent compactors and flush writers too low.
  – Not using G1 garbage collection with large heaps.
• Not paying enough attention to blocked flush writers and dropped mutations.
• Allowing SSTable count to get too high, causing OOM errors.
Why We Built It On Azure + DSE
Why Azure?
• We didn’t want to manage bare metal and the overhead it brings.
• Easy to add capacity without ordering hardware and rack
space.
• We have used Azure for other services for ~5 years.
• We have built tools for deploying and managing services.
• Great track record with Azure support.
• We love to try out new things on the Azure platform!
Why We Built It On Azure + DSE (contd)
Why Cassandra?
The Good
• Low Latency ✓
• Linear Scale ✓
• Highly Available ✓
• Aggregations (Spark/Spark Streaming) ✓
• Machine Learning (Spark ML) ✓
• No Enforcement of Full Consistency ✓ ✓ ✓
The Not-So-Good
• No Hosted Option in Azure ✗
• Have to Install and Configure it Ourselves ✗
Why We Built It On Azure + DSE (contd)
Why DataStax Enterprise?
Training:
Cassandra can be complex, and its success depends on various design decisions.
Getting training from the experts is invaluable to ensuring our success.
Integration:
DataStax has built integration between Cassandra and Spark (as well as other
products) and provides a tested package that we can depend on. It also includes the OpsCenter UI.
Support:
Cassandra is new to us. There is nothing better than being able to send off a
message when something goes wrong, and getting experts to help solve the
problem.
How Can You Build It
- Marketplace
- Simplified set of deployment options
- Bring Your Own License
- Production Cluster or Dev Sandbox
- 4, 12, 36 or 90 nodes
- Pick your VM type and size
- Single VNET
- OpsCenter: http://{cluster}.{region}.cloudapp.azure.com:8888
- Define your own deployment
  1. Group cluster resources based on common lifecycle
     1. E.g. separate infrastructure components from compute nodes
  2. Define compute and storage options for nodes in the cluster
     1. Pick your VM type and size
     2. Ephemeral vs persistent disks (Standard/Premium)
     3. Snapshots
  3. Define networking options
     1. VNETs configuration
     2. Cross-DC (VNET to VNET) connectivity
  4. Performance considerations
     1. Compute
     2. Storage
     3. Networking
Azure Resource Manager principles
AZURE RESOURCE MANAGER API
• RBAC-based
• Template-driven
• Declarative and imperative
• Idempotent
• Multi-service
• Multi-region
• Extensible
Resource Group
– container for multiple resources
– resources exist in one* resource group
– resource groups can span regions
– resource groups can span services
Deployment
– tracks template execution
– created within a resource group
– allows nested deployments
Inside the Box vs. Outside the Box
• Template describes the topology (outside the box)
• Template extensions can initiate state configuration (inside the box)
• Multiple extensions available for Windows and Linux VMs
– DSC
– Chef
– Puppet
– Custom Scripts
– AppService + WebDeploy
– SQLDB + BACPAC
Common Use Cases for ARM Templates
• Enterprises and System Integrators
– Delivering a capability or cloud capacity (building block templates, e.g. DSE)
– Delivering an end to end application (solution templates)
• Cloud Service Vendors (CSVs)
– Support different multi-tenancy approaches
• Distinct deployments per customer
– Within the CSV’s subscription
– “Bring Your Own Subscription” model that uses customer subscriptions
• Scale units within a central multi-tenant system
• Marketplace integration
• All deploy known configurations/skus/t-shirt sizes
– Lots of variables make free-form deployments less desirable
– T-shirt Sizes / SKUs are the common approach
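One way to express t-shirt sizes in an ARM template is a string parameter restricted by allowedValues plus a lookup variable. The sketch below generates such a skeleton as JSON. The node counts (4, 12, 36, 90) come from the Marketplace slide; the size names, the parameter name `clusterSize`, and the variable names are hypothetical, and the resources list is left empty, so this is a starting point rather than the DataStax template.

```python
import json
from typing import Dict

# Node counts are from the Marketplace slide; the size names are hypothetical.
CLUSTER_SIZES: Dict[str, int] = {"Small": 4, "Medium": 12, "Large": 36, "XLarge": 90}


def build_template_skeleton(sizes: Dict[str, int]) -> dict:
    """Return an ARM template skeleton whose only free choice is a t-shirt size."""
    names = list(sizes)
    return {
        "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            "clusterSize": {
                "type": "string",
                "defaultValue": names[0],
                "allowedValues": names,  # rejects anything outside the known sizes
            }
        },
        "variables": {
            "nodeCounts": dict(sizes),
            # Template expression that resolves the chosen size to a node count.
            "nodeCount": "[variables('nodeCounts')[parameters('clusterSize')]]",
        },
        "resources": [],  # VMs, VNet, storage, etc. would be declared here
        "outputs": {},
    }


if __name__ == "__main__":
    print(json.dumps(build_template_skeleton(CLUSTER_SIZES), indent=2))
```

Restricting the parameter to a fixed list is what turns a free-form template into a small set of known, testable configurations.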
Design and deploy a building block template
Go to http://github.com/azure/azure-quickstart-templates
to find hundreds of quick start deployment templates for finished solutions.
DataStax is evolving ARM deployment templates in this
GitHub repo to include DSE-specific capabilities (e.g.
multi-region topology) for those who want to manage
their own deployment.
Deploying DataStax with the Azure CLI
Deploying DataStax with Azure Marketplace
Compute and storage options for nodes in the cluster
• Compute families for production clusters
  – D-Series, G-Series (Xeon® E5 v3)
    • Local SSD disks
  – DS-Series, GS-Series
    • Premium Storage optimized, host caching for reads
• Storage options for nodes
  – Maintain data and logs on local ephemeral SSD disks
    • ~100k IOPS and 1.5 GB/sec on G5
  – Leverage Premium Storage Disks for persistent data and logs
    • P10, P20, P30 (128 GB to 1 TB, up to 5000 IOPS and 200 MB/sec)
    • Striped volumes to balance storage size, throughput and costs
    • Max 64 TB, 80000 IOPS and 1 GB/sec per node
  – Use Standard Storage for backup snapshots
    • Low cost, geo-replicated
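The trade-off behind striped volumes can be sketched with simple arithmetic. The helper below assumes the largest tier (P30) carries the upper figures quoted above (1 TB, 5000 IOPS, 200 MB/sec), that striping scales linearly with disk count, and that 1 GB/sec means 1000 MB/sec; real results depend on VM size and workload.

```python
from dataclasses import dataclass

# Per-disk figures for P30 and per-node caps, taken from the slide.
P30_TB, P30_IOPS, P30_MB_S = 1, 5_000, 200
NODE_MAX_TB, NODE_MAX_IOPS, NODE_MAX_MB_S = 64, 80_000, 1_000  # 1 GB/sec assumed = 1000 MB/sec


@dataclass(frozen=True)
class Volume:
    capacity_tb: int
    iops: int
    throughput_mb_s: int


def striped_volume(p30_disks: int) -> Volume:
    """Estimate a striped volume of P30 disks, capped by the per-node limits.

    Assumption: capacity, IOPS and throughput add up linearly across disks.
    """
    if p30_disks < 1:
        raise ValueError("need at least one disk")
    return Volume(
        capacity_tb=min(p30_disks * P30_TB, NODE_MAX_TB),
        iops=min(p30_disks * P30_IOPS, NODE_MAX_IOPS),
        throughput_mb_s=min(p30_disks * P30_MB_S, NODE_MAX_MB_S),
    )


if __name__ == "__main__":
    for disks in (1, 4, 5, 16):
        print(disks, striped_volume(disks))
```

Under these assumptions throughput hits the per-node cap at 5 disks and IOPS at 16, so adding disks beyond that only buys capacity.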
Networking deployment options
• Supporting your replication topology (NetworkTopologyStrategy), including geo-replication, for disaster recovery or workload segregation purposes
• Within a VNET, bandwidth is a function of VM type/size
  – Up to 20 Gbps for G5
• Cross-region VNET gateways
  – Standard (100 Mbps) or High Performance (200 Mbps), No-Crypto option
  – Latency impact proportional to distance
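The gateway bandwidth figures matter most when a new region has to be populated or repaired. A rough estimate of the best-case transfer time, assuming the link is fully available to replication traffic and using decimal units (1 GB = 8000 megabits); the 50 GB payload is an arbitrary example:

```python
def transfer_seconds(gigabytes: float, link_mbps: float) -> float:
    """Best-case time to move a payload over a link, ignoring protocol overhead."""
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    megabits = gigabytes * 8 * 1000  # decimal units: 1 GB = 8000 megabits
    return megabits / link_mbps


if __name__ == "__main__":
    for name, mbps in (("Standard", 100), ("High Performance", 200)):
        minutes = transfer_seconds(50, mbps) / 60
        print(f"{name} gateway: 50 GB in about {minutes:.0f} minutes")
```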
Contact Us
Sean Usher
Office 365
Email: seusher@microsoft.com
Twitter: @seanushermsft
Silvano Coriani
Azure
Email: scoriani@microsoft.com
Twitter: @scoriani
Thank you

Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...
Cassandra Summit 2014: Internet of Complex Things Analytics with Apache Cassa...DataStax Academy
 

Azure + DataStax Enterprise Powers Office 365 Per User Store

  • 1. Azure + DSE Powers O365 Per-User Store
  • 2. 1 Introduction
      2 What We Built
      3 How We Built It Using Cassandra + Spark
      4 Why We Built It On Azure + DSE
      5 How Can You Build It
      6 Wrap Up
      © 2015. All Rights Reserved.
  • 3. Introduction
      Sean Usher, Office 365. Email: seusher@microsoft.com. Twitter: @seanushermsft
      Silvano Coriani, Azure. Email: scoriani@microsoft.com. Twitter: @scoriani
  • 4. Introduction – Office 365
      Email, Collaboration, Document Authoring, Social Networking, Calendaring, File Storage, Business Intelligence, etc.
  • 5. Introduction – Azure
      Azure is Microsoft’s cloud computing platform, a growing collection of integrated services (analytics, computing, database, mobile, networking, storage, and web) for moving faster, achieving more, and saving money.
  • 6. What We Built
      A way to understand our users and organizations at a deeper level!
      Questions:
      • Are users happy with the service they are receiving?
      • Are users fully utilizing the services they are paying us for?
      • Are users hitting issues that we can proactively help them with?
      • How has a user’s experience been over their lifetime?
      • Can we discover insights that we aren’t even aware of?
      This requires ingesting and storing a lot of data. We need to be able to perform fast, scalable analytics on that data, or we will discover issues too late!
  • 7. What We Built (contd)
      Platform Requirements:
      • Running in the cloud
      • Highly scalable
      • Initial ingestion of 50k events/sec, growing rapidly
      • Millisecond latency for reads/writes
      • High Availability
      • Tunable Consistency
      • Real-time and batch analytics
      • Machine learning
      • Storing aggregated data for 1+ years
      One month to get it built.
  • 8. How We Built It Using Cassandra + Spark (contd)
      Topology:
      1 Physical Datacenter (eventually will be geo-replicated)
      2 Logical Datacenters (Cassandra, Analytics [Spark])
      All machines are in a virtual network (VNet) and are assigned internal static IPs.
      No inbound access to Cassandra machines from outside the VNet.
      [Diagram: one physical datacenter containing a VNet with two logical DCs, labeled S and C*]
  • 9. How We Built It Using Cassandra + Spark (contd)
      Configuration (Azure G4 machines):
      • Ubuntu
      • 16 cores
      • 224 GB RAM
      • 3 TB local SSD
      Snitch: GossipingPropertyFileSnitch
      Replication: All keyspaces use NetworkTopologyStrategy with replication factor = 3 in each DC.

      Node Type | Node Count | Heap Size | GC Type
      Cassandra | 12         | 12 GB     | G1
      Spark     | 12         | 20 GB     | G1
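
The replication setting on this slide can be sketched as a small Python helper that builds the corresponding `CREATE KEYSPACE` statement. This is only an illustration: the keyspace name `peruserstore` is hypothetical, and the datacenter names `Cassandra` and `Analytics` are taken from the labels on the topology slide, so the names actually configured for the snitch may differ.

```python
def create_keyspace_cql(keyspace: str, dc_replication: dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    dc_replication maps a datacenter name to its replication factor.
    """
    if not dc_replication:
        raise ValueError("at least one datacenter is required")
    options = {"class": "NetworkTopologyStrategy", **dc_replication}
    body = ", ".join(
        # Strings are quoted, replication factors are emitted as bare integers.
        f"'{name}': {value!r}" if isinstance(value, str) else f"'{name}': {value}"
        for name, value in options.items()
    )
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{body}}};"


# Hypothetical keyspace name; DC names assumed from the topology slide labels.
statement = create_keyspace_cql("peruserstore", {"Cassandra": 3, "Analytics": 3})
print(statement)
```

With a replication factor of 3 in both logical DCs, every row has three replicas on the Cassandra side for the APIs and three on the Analytics side for Spark jobs.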
  • 10. How We Built It Using Cassandra + Spark (contd)
  • 11. How We Built It Using Cassandra + Spark (contd)
      Ingestion:
      [Diagram labels: REST API, O365, Queue (event hub), Ingestion Worker (Azure worker role using DataStax C# driver), Cassandra]
      All data is ingested into the Cassandra DC.
      All read APIs read from the Cassandra DC.
      All ingested data is PII scrubbed.
      APIs are the only way to get data in or out of Cassandra.
  • 12. How We Built It Using Cassandra + Spark (contd)
      What Data Do We Ingest?
      CREATE TABLE userdatasetraw (
          userid text,
          timepk timestamp,
          device text,
          createdtime timestamp,
          errorcode text,
          errordetail text,
          omstid text,
          useragent text,
          PRIMARY KEY ((userid, timepk, device), createdtime)
      ) WITH CLUSTERING ORDER BY (createdtime DESC)
        AND COMPACTION={'base_time_seconds':'50','class':'DateTieredCompactionStrategy','max_sstable_age_days':'0.25'….
      Sample row:
      userid      | 102033ffa4a7079e7c
      timepk      | 1441152000000
      device      | 8000000A
      createdtime | 1441215133000
      errorcode   | Failure
      errordetail | InvalidOperation
      omstid      | 15321c64d-0A92-4f4e7-bcc8-2aeb89354ff26
      useragent   | null
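
The sample row suggests how the partition key is bucketed: `timepk` (1441152000000) is exactly `createdtime` (1441215133000) truncated to midnight UTC. The slide does not state this rule, so the sketch below is an inference from that single row, and the helper names are hypothetical.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def day_bucket_ms(created_ms: int) -> int:
    """Truncate an epoch-millisecond timestamp to the start of its UTC day."""
    return created_ms - (created_ms % MS_PER_DAY)


def partition_key(userid: str, device: str, created_ms: int) -> tuple[str, int, str]:
    """Assumed (userid, timepk, device) partition key for userdatasetraw."""
    return (userid, day_bucket_ms(created_ms), device)


# Values from the sample row on the slide.
key = partition_key("102033ffa4a7079e7c", "8000000A", 1441215133000)
print(key)
```

Bucketing by day keeps each partition bounded to one user, one device, and one day of events.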
  • 13. How We Built It Using Cassandra + Spark (contd)
      What Data Do We Ingest?
      CREATE TABLE tenantsubscription (
          omstid text,
          skuid text,
          city text static,
          comculture text static,
          exchgtid text static,
          exousercnt int static,
          geoareacd text static,
          isconcierge boolean static,
          istrial boolean,
          liccount int,
          lyousercnt int static,
          numtrailsubs int,
          orgname text static,
          sku text,
          spousercnt int static,
          statechangedt timestamp,
          subautorenew boolean,
          subcreated timestamp,
          subexpiry timestamp,
          substate text,
          tdssynctime timestamp,
          usedliccount int,
          PRIMARY KEY ((omstid), skuid)
      )
      Sample row (column names as abbreviated on the slide):
      omstid       | 129001cd1-21dc-4706-96cfe-1f632522d365ae
      skuid        | 6fe22a85e-b296-42f0-b187-1b91e9394b900
      city         | BANGKOK
      exousrcnt    | 1000
      liccnt       | 1500
      subautorenew | false
      sku          | OFFICE 365 ENTERPRISE E3
  • 14. How We Built It Using Cassandra + Spark (contd)
      Aggregations
      Spark batch jobs roll up UserDatasetRaw into 1 hour and 24 hour aggregates.
      Other jobs use the 1 hour and 24 hour aggregates and join them with the tenant subscription and dataset tables to calculate insights.
      Result:
      1. Great feedback from customers and support agents!
      2. Saved customers money!
      3. Saved customers from being locked out of their service!
      4. Proactively fixed user experience (detected customer misconfiguration)!
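
The roll-up idea can be illustrated in plain Python, used here as a stand-in for the Spark batch jobs. The slides do not show the aggregate schema, so counting events per (userid, time bucket, errorcode) is an assumption, and the user id `u1` and the event values are made up for the example.

```python
from collections import Counter

MS_PER_HOUR = 60 * 60 * 1000
MS_PER_DAY = 24 * MS_PER_HOUR


def hourly_rollup(events):
    """Count raw events per (userid, hour bucket, errorcode)."""
    counts = Counter()
    for event in events:
        hour = event["createdtime"] - event["createdtime"] % MS_PER_HOUR
        counts[(event["userid"], hour, event["errorcode"])] += 1
    return counts


def daily_rollup(hourly):
    """Fold hourly counts into (userid, day bucket, errorcode) counts."""
    daily = Counter()
    for (userid, hour, errorcode), n in hourly.items():
        daily[(userid, hour - hour % MS_PER_DAY, errorcode)] += n
    return daily


# Hypothetical events; only the column names come from userdatasetraw.
events = [
    {"userid": "u1", "createdtime": 1441215133000, "errorcode": "Failure"},
    {"userid": "u1", "createdtime": 1441215200000, "errorcode": "Failure"},
    {"userid": "u1", "createdtime": 1441220000000, "errorcode": "Failure"},
]
hourly = hourly_rollup(events)
daily = daily_rollup(hourly)
print(hourly)
print(daily)
```

Building the 24 hour aggregate from the 1 hour aggregate, rather than from the raw table again, means the raw events are only scanned once.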
  • 15. How We Built It Using Cassandra + Spark (contd)
      What Problems Did We Run Into?
      • Bad Data Modelling:
        • Partitions getting too large (1-2 GB), which raised the “compacting large row” warning and led to OOM errors (Cassandra 2.0).
        • Not using DateTieredCompactionStrategy for time-series data.
      • Bad Configuration:
        • OS limits configured too low (ulimit, nofiles, etc.)
        • Number of concurrent compactors and flush writers too low.
        • Not using G1 garbage collection with large heaps.
      • Not paying enough attention to blocked flush writers and dropped mutations.
      • Allowing SSTable count to get too high, causing OOM errors.
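
One of the configuration problems above, OS limits set too low, is easy to check from a script. The sketch below reads the open-file limit of the current process with Python's standard `resource` module (Unix only). The 100,000 threshold is an assumption for illustration, not a value from the slides; check it against the recommended production settings for your Cassandra or DSE version.

```python
import resource


def nofile_limit_ok(soft: int, minimum: int = 100_000) -> bool:
    """Return True if the soft open-file limit meets the assumed minimum."""
    if soft == resource.RLIM_INFINITY:
        return True
    return soft >= minimum


# Limits of the current process; run this as the user that starts Cassandra.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"nofile soft={soft} hard={hard} ok={nofile_limit_ok(soft)}")
```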
  • 16. Why We Built It On Azure + DSE
      Why Azure?
      • We didn’t want to manage bare metal and the overhead it brings.
      • Easy to add capacity without ordering hardware and rack space.
      • We have used Azure for other services for ~5 years.
      • We have built tools for deploying and managing services.
      • Great track record with Azure support.
      • We love to try out new things on the Azure platform!
  • 17. Why We Built It On Azure + DSE (contd)
      Why Cassandra?
      The Good
      • Low Latency ✓
      • Linear Scale ✓
      • Highly Available ✓
      • Aggregations (Spark/Spark Streaming) ✓
      • Machine Learning (Spark ML) ✓
      • No Enforcement of Full Consistency ✓
      The Not-So-Good
      • No Hosted Option in Azure ✗
      • Have to Install and Configure it Ourselves ✗
  • 18. Why We Built It On Azure + DSE (contd)
      Why DataStax Enterprise?
      Training: Cassandra can be complex and its success depends on various design decisions. Getting training from the experts is invaluable to ensuring our success.
      Integration: DataStax has built integration between Cassandra and Spark (as well as other products) and provides a tested package that we can depend on. OpsCenter UI.
      Support: Cassandra is new to us. There is nothing better than being able to send off a message when something goes wrong and getting experts to help solve the problem.
  • 19. How Can You Build It
      - Marketplace
        - Simplified set of deployment options
        - Bring Your Own License
        - Production Cluster or Dev Sandbox
        - 4, 12, 36 or 90 nodes
        - Pick your VM type and size
        - Single VNET
        - OpsCenter: http://{cluster}.{region}.cloudapp.azure.com:8888
      - Define your own deployment
        1. Group cluster resources based on common lifecycle
           1. E.g. separate infrastructure components from compute nodes
        2. Define compute and storage options for nodes in the cluster
           1. Pick your VM type and size
           2. Ephemeral vs persistent disks (Standard/Premium)
           3. Snapshots
        3. Define networking options
           1. VNETs configuration
           2. Cross-DC (VNET to VNET) connectivity
        4. Performance considerations
           1. Compute
           2. Storage
           3. Networking
  • 20. Azure Resource Manager principles
      Azure Resource Manager API:
      • RBAC-based
      • Template-driven
      • Declarative and imperative
      • Idempotent
      • Multi-service
      • Multi-region
      • Extensible
  • 21. Resource Group
      • container for multiple resources
      • resources exist in one* resource group
      • resource groups can span regions
      • resource groups can span services
      Deployment
      • tracks template execution
      • created within a resource group
      • allows nested deployments
  • 22. Inside the Box vs. Outside the Box
    • Template describes the topology (outside the box)
    • Template extensions can initiate state configuration (inside the box)
    • Multiple extensions available for Windows and Linux VMs
      – DSC
      – Chef
      – Puppet
      – Custom Scripts
      – AppService + WebDeploy
      – SQLDB + BACPAC
  • 23. Common Use Cases for ARM Templates
    • Enterprises and System Integrators
      – Delivering a capability or cloud capacity (building block templates, e.g. DSE)
      – Delivering an end-to-end application (solution templates)
    • Cloud Service Vendors (CSVs)
      – Support different multi-tenancy approaches
        • Distinct deployments per customer
          – Within the CSV’s subscription
          – “Bring Your Own Subscription” model that uses customer subscriptions
        • Scale units within a central multi-tenant system
    • Marketplace integration
    • All deploy known configurations/SKUs/T-shirt sizes
      – Lots of variables make free-form deployments less desirable
      – T-shirt sizes / SKUs are the common approach
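The T-shirt size approach can be sketched as a fixed lookup that turns a size name into a template parameters document, rejecting anything free-form. This is a minimal sketch under assumptions: the size names and the `clusterNodeCount` parameter name are hypothetical (a real template defines its own parameter names), and the node counts are borrowed from the Marketplace options listed earlier in the deck.

```python
import json

# Hypothetical T-shirt sizes; the node counts reuse the Marketplace
# options (4, 12, 36 or 90 nodes) mentioned earlier in the deck.
TSHIRT_SIZES = {"small": 4, "medium": 12, "large": 36, "xlarge": 90}


def build_parameters(size: str) -> dict:
    """Build an ARM deployment parameters document for a known size only."""
    if size not in TSHIRT_SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {sorted(TSHIRT_SIZES)}")
    return {
        "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        # "clusterNodeCount" is an assumed parameter name for illustration.
        "parameters": {"clusterNodeCount": {"value": TSHIRT_SIZES[size]}},
    }


if __name__ == "__main__":
    print(json.dumps(build_parameters("medium"), indent=2))
```

Restricting input to a handful of named sizes keeps every deployment on a configuration that has been tested, which is the point the slide makes about free-form variables.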
  • 24. Design and deploy a building block template
    Go to http://github.com/azure/azure-quickstart-templates to find hundreds of quick start deployment templates for finished solutions.
    DataStax is evolving ARM deployment templates in this GitHub repo to include DSE-specific capabilities (e.g. multi-region topology) for those who want to manage their own deployment.
    • Deploying DataStax with the Azure CLI
    • Deploying DataStax with Azure Marketplace
  • 25. Compute and storage options for nodes in the cluster
    • Compute families for production clusters
      – D-Series, G-Series (Xeon® E5 v3)
        • Local SSD disks
      – DS-Series, GS-Series
        • Premium Storage optimized, host caching for reads
    • Storage options for nodes
      – Maintain data and logs on local ephemeral SSD disks
        • ~100k IOPS and 1.5 GB/sec on G5
      – Leverage Premium Storage Disks for persistent data and logs
        • P10, P20, P30 (128GB to 1TB, up to 5000 IOPS and 200MB/sec per disk)
        • Striped volumes to balance storage size, throughput and costs
        • Max 64TB, 80000 IOPS and 1GB/sec per node
      – Use Standard Storage for backup snapshots
        • Low cost, geo-replicated
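The trade-off behind striped volumes can be worked through with the numbers on this slide. The sketch below assumes an idealized stripe in which capacity, IOPS and throughput add up linearly across P30 disks until the per-node maximums are reached; real striped volumes may deliver less. Treating 1GB/sec as 1000MB/sec is also an assumption, and the function and class names are hypothetical.

```python
from dataclasses import dataclass

# Per-disk figures for P30, taken from the slide.
P30_SIZE_TB = 1
P30_IOPS = 5000
P30_MBPS = 200

# Per-node maximums from the slide (1GB/sec treated as 1000MB/sec, an assumption).
NODE_MAX_TB = 64
NODE_MAX_IOPS = 80000
NODE_MAX_MBPS = 1000


@dataclass(frozen=True)
class StripedVolume:
    disks: int
    size_tb: int
    iops: int
    mbps: int


def stripe_p30(disks: int) -> StripedVolume:
    """Estimate an idealized stripe of P30 disks, capped at the node limits."""
    if disks < 1:
        raise ValueError("need at least one disk")
    if disks * P30_SIZE_TB > NODE_MAX_TB:
        raise ValueError(f"{disks} disks exceed the {NODE_MAX_TB}TB per-node maximum")
    return StripedVolume(
        disks=disks,
        size_tb=disks * P30_SIZE_TB,
        iops=min(disks * P30_IOPS, NODE_MAX_IOPS),
        mbps=min(disks * P30_MBPS, NODE_MAX_MBPS),
    )


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(stripe_p30(n))
```

Under these assumptions throughput reaches the node cap at 5 disks and IOPS at 16 disks, so adding disks beyond that point only buys capacity.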
  • 26. Networking deployment options
    • Supporting your replication topology (NetworkTopologyStrategy), including geo-replication, for disaster recovery or workload segregation purposes
    • Within a VNET, bandwidth is a function of VM type/size
      – Up to 20Gbps for G5
    • Cross-region VNET gateways
      – Standard (100Mbps) or High Performance (200Mbps), No-Crypto option
      – Latency impact proportional to distance
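To make the NetworkTopologyStrategy point concrete, the sketch below builds a CQL `CREATE KEYSPACE` statement with one replication factor per logical datacenter. The datacenter names follow the two logical datacenters in the topology described earlier in the deck (Cassandra and Analytics); the keyspace name and the replication factors are hypothetical, and the helper only produces a string rather than talking to a cluster.

```python
import re
from typing import Mapping

_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def create_keyspace_cql(keyspace: str, replication: Mapping[str, int]) -> str:
    """Return a CREATE KEYSPACE statement using NetworkTopologyStrategy."""
    if not _IDENTIFIER.match(keyspace):
        raise ValueError(f"invalid keyspace name: {keyspace!r}")
    if not replication:
        raise ValueError("at least one datacenter is required")
    for dc, rf in replication.items():
        if not _IDENTIFIER.match(dc) or rf < 1:
            raise ValueError(f"invalid datacenter entry: {dc!r} -> {rf!r}")
    # One "'<datacenter>': <replication factor>" pair per logical datacenter.
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replication.items())
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


if __name__ == "__main__":
    # "user_store" and the factors 3 and 1 are illustrative values only.
    print(create_keyspace_cql("user_store", {"Cassandra": 3, "Analytics": 1}))
```

Setting a replication factor per datacenter is what allows the analytics workload to be segregated from the transactional one, and the same mechanism extends to a second region for disaster recovery.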
  • 27. Contact Us
    Sean Usher, Office 365
    Email: seusher@microsoft.com
    Twitter: @seanushermsft
    Silvano Coriani, Azure
    Email: scoriani@microsoft.com
    Twitter: @scoriani