Cassandra, replication and query language
Jonathan Ellis
RDBMS & YOU
SQLITE, PYTHON
SCRIPTS, LOG FILES
EXAMPLE?
SMALL DATA
MOST WEBSITES
RDBMS
MEDIUM DATA
IS AN RDBMS A GOOD FIT FOR BIG DATA?
YOU BIG DATA
VERTICAL SCALING
Data size at launch
OH, WHOA,
THINGS ARE
KICKING UP
"ACID": NOT ALWAYS TRUE
ATOMICITY
CONSISTENCY
ISOLATION
DURABILITY
Consistency?
ASYNCHRONOUS REPLICATION != CONSISTENCY
CLIENT
MASTER SLAVE
Replication lag
NO! HUH?
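The replication-lag problem above can be illustrated with a toy simulation. This is a minimal sketch, not how any particular RDBMS implements replication: the `Master` and `Replica` classes (the replica plays the slave from the slide) and the explicit `flush()` call standing in for the replication delay are hypothetical names introduced only for illustration.

```python
class Replica:
    """Read-only copy that only sees what the master has shipped so far."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Master:
    """Accepts writes and ships them to the replica asynchronously."""

    def __init__(self, replica):
        self.data = {}
        self.replica = replica
        self.pending = []  # writes acknowledged to the client but not yet replicated

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
        return "ack"  # the client is acknowledged before the replica is updated

    def flush(self):
        """Hypothetical stand-in for the replication delay elapsing."""
        for key, value in self.pending:
            self.replica.data[key] = value
        self.pending.clear()


replica = Replica()
master = Master(replica)

master.write("balance", 100)
stale = replica.read("balance")  # None: the write has not reached the replica yet
master.flush()
fresh = replica.read("balance")  # 100 once replication has caught up
print(stale, fresh)
```

A client that writes to the master and immediately reads from the replica gets the stale value, which is why asynchronous replication alone does not give consistency.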
THIRD NORMAL FORM DOESN'T SCALE
HORRIBLE
▸ UNPREDICTABLE
▸ DATA >> MEMORY?
▸ DISK SEEKS -> SLOW
▸ UNHAPPY USERS
SHARDING
CLIENT
NIGHTMARE
AVAILABILITY?
NOT WITH THESE STUBBORN MULES…
CONCLUSION:
MANAGING DATA GROWTH IS NOT THAT SIMPLE…
YOUR BEST FRIEND, CASSANDRA
ARCHITECTURE
PEER TO PEER
▸ Cassandra has a masterless architecture: no master, no slave
▸ Each node manages its own data
▸ How is that possible?
▸ Replication
▸ Consistency level
[Diagram: NODE 1, NODE 2, NODE 3, NODE 4]
RESULT?
LINEAR SCALABILITY
HIGH AVAILABILITY
CLIENT
TOPOLOGY
OPERATION
▸ The "replication factor" defines the number of "copies" (replicas) kept of the data
ARCHITECTURE
CASSANDRA DISTRIBUTES AND REPLICATES DATA
[Diagram: NODE 1, NODE 2, NODE 3, NODE 4]
▸ The replicas acknowledge to the coordinator;
▸ the coordinator acknowledges to the client
ARCHITECTURE
HOW ARE WRITES ACKNOWLEDGED?
[Diagram: COORDINATOR, ack; NODE 1, NODE 2, NODE 3, NODE 4]
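The acknowledgment flow on this slide can be sketched in a few lines. This is a simplified, synchronous illustration under stated assumptions, not Cassandra's implementation: `Node`, `coordinate_write` and `required_acks` are hypothetical names, and a real coordinator sends the writes in parallel and answers the client as soon as enough acknowledgments have arrived.

```python
class Node:
    """Hypothetical replica node that may be up or down."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def write(self, key, value):
        """Store the value and acknowledge (True), or fail (False) if the node is down."""
        if not self.up:
            return False
        self.data[key] = value
        return True


def coordinate_write(replicas, key, value, required_acks):
    """Forward the write to every replica, then ack the client if enough replicas acked."""
    acks = sum(1 for node in replicas if node.write(key, value))
    return acks >= required_acks


nodes = [Node("node1"), Node("node2"), Node("node3", up=False), Node("node4")]
replicas = nodes[:3]  # assumed replication factor of 3: node1, node2, node3

ok_one = coordinate_write(replicas, "k", "v", required_acks=1)  # True
ok_all = coordinate_write(replicas, "k", "v", required_acks=3)  # False: node3 is down
print(ok_one, ok_all)
```

The `required_acks` parameter is the knob that the consistency levels on the following slides control.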
ARCHITECTURE
TUNABLE CONSISTENCY LEVELS?
▸ ONE
▸ QUORUM
▸ ALL
[Diagram: NODE 1, NODE 2, NODE 3, NODE 4]
ONE
ARCHITECTURE
▸ A single replica acknowledges
[Diagram: NODE 1, NODE 2, NODE 3, NODE 4]
▸ All replicas acknowledge
ARCHITECTURE
ALL
[Diagram: NODE 1, NODE 2, NODE 3, NODE 4]
ARCHITECTURE
QUORUM
▸ Quorum = (sum of replication factors / 2) + 1, with the division rounded down
▸ Question: how many replicas must acknowledge if the replication factor is 3 and we want QUORUM?
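The quorum formula can be checked with a short calculation. This is a minimal sketch; the function name `quorum` is an illustrative choice, and the division is taken as integer division. With a replication factor of 3, two replicas must acknowledge.

```python
def quorum(replication_factor_sum):
    """Quorum = (sum of replication factors / 2) + 1, using integer division."""
    return replication_factor_sum // 2 + 1


for rf in (1, 2, 3, 5):
    print(f"replication factor {rf} -> quorum {quorum(rf)}")
```

Note that a replication factor of 2 also gives a quorum of 2, so it tolerates no replica failure at QUORUM, whereas a replication factor of 3 tolerates one.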
[Diagram: NODE 1, NODE 2, NODE 3, NODE 4]
MULTI-DC SETTINGS
▸ QUORUM vs. LOCAL_QUORUM
▸ ONE vs. LOCAL_ONE
US-EAST FR-PARIS
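The difference between the global and the local levels can be made concrete with the two datacenters named on the slide. This is a sketch under assumptions: a replication factor of 3 per datacenter is assumed here (the slides do not state one), and `required_acks` is a hypothetical helper, not a driver API.

```python
# Assumed per-datacenter replication factors (not given in the slides).
replication = {"US-EAST": 3, "FR-PARIS": 3}


def required_acks(level, replication, local_dc):
    """Return (number of acks needed, where those acks may come from)."""
    if level == "ONE":
        return 1, "any datacenter"
    if level == "LOCAL_ONE":
        return 1, local_dc
    if level == "QUORUM":
        # Majority of all replicas, summed over every datacenter.
        return sum(replication.values()) // 2 + 1, "any datacenter"
    if level == "LOCAL_QUORUM":
        # Majority of the replicas in the coordinator's own datacenter only.
        return replication[local_dc] // 2 + 1, local_dc
    raise ValueError(f"unsupported consistency level: {level}")


for level in ("ONE", "LOCAL_ONE", "QUORUM", "LOCAL_QUORUM"):
    print(level, required_acks(level, replication, local_dc="FR-PARIS"))
```

Under these assumptions QUORUM needs 4 acknowledgments, so at least one must cross the link between datacenters, while LOCAL_QUORUM needs only 2 from the local datacenter.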
PARTITIONER
CONSISTENT HASHING
How is data actually distributed across the cluster?
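A toy ring makes the idea of consistent hashing concrete. This is a minimal sketch, not Cassandra's partitioner: MD5 is used purely for illustration, each node gets a single token derived from its name, and `Ring`, `token` and `replicas_for` are hypothetical names. Replicas are chosen by walking the ring clockwise from the key's position.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (MD5 chosen only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, node_names):
        # One token per node in this sketch, sorted by ring position.
        self.entries = sorted((token(name), name) for name in node_names)
        self.tokens = [t for t, _ in self.entries]

    def replicas_for(self, partition_key, replication_factor=1):
        """Find the first node whose token is >= the key's token, then walk clockwise."""
        start = bisect.bisect_left(self.tokens, token(partition_key))
        count = min(replication_factor, len(self.entries))
        # The modulo wraps around the end of the ring back to the first node.
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node1", "node2", "node3", "node4"])
for key in ("jbellis", "driftx", "yukim"):
    print(key, ring.replicas_for(key, replication_factor=3))
```

Because placement depends only on the hash of the partition key, any node can compute where a key lives without consulting a master.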
DATA MODELING WITH CASSANDRA
SEEMS HARD
NOT REALLY
KEYSPACE
TABLE
PARTITION
ROW
DATA STRUCTURE IN CASSANDRA
PRIMARY KEY = PARTITION KEY + CLUSTERING COLUMNS
PARTITION KEY
THE ONLY WAY TO LOCATE THE PARTITION IN THE CLUSTER
CLUSTERING COLUMNS
DEFINE SORTING, ORDERING AND UNIQUENESS
WHY ARE CLUSTERING COLUMNS SO EFFICIENT?
AN EXAMPLE
CREATE A KEYSPACE
CREATE KEYSPACE test
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};

USE test;
CREATE A TABLE
CREATE TABLE timeline (
  user_id text,
  tweet_id timeuuid,
  tweet_author text,
  tweet_body text,
  PRIMARY KEY (user_id, tweet_id)
);
PRIMARY KEY HIGHLIGHTED IN WHITE
EXAMPLE INSERT QUERY
INSERT INTO timeline (user_id, tweet_id, tweet_author, tweet_body)
VALUES ('jbellis', now(), 'Jonathan Ellis', 'Bonjour Paris!');
THE PARTITION KEY DRIVES ROUTING
PARTITIONER
WHAT ABOUT CLUSTERING COLUMNS?
user_id  tweet_id    tweet_author  tweet_body
jbellis  3290f9da..  rbranson      lorem
jbellis  3895411a..  tjake         ipsum
...
driftx   3290f9da..  rbranson      lorem
driftx   71b46a84..  yzhang        dolor
...
yukim    3290f9da..  rbranson      lorem
yukim    e451dd42..  tjake         amet
...
SELECT *
FROM timeline
WHERE user_id = 'driftx';
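The roles of the partition key and the clustering column in this query can be mimicked with an in-memory model. This is only an illustrative sketch, not Cassandra's storage engine: the `timeline` dictionary, `insert` and `select_by_user` are hypothetical names, and the tweet ids are short placeholder strings compared lexicographically, whereas real `timeuuid` values are ordered by time.

```python
import bisect
from collections import defaultdict

# Partition key (user_id) -> rows of that partition, kept sorted by tweet_id.
timeline = defaultdict(list)


def insert(user_id, tweet_id, tweet_author, tweet_body):
    """Place the row in its partition at the position given by the clustering column."""
    rows = timeline[user_id]
    ids = [row[0] for row in rows]
    i = bisect.bisect_left(ids, tweet_id)
    new_row = (tweet_id, tweet_author, tweet_body)
    if i < len(rows) and rows[i][0] == tweet_id:
        rows[i] = new_row  # same primary key: the existing row is overwritten
    else:
        rows.insert(i, new_row)


def select_by_user(user_id):
    """Equivalent of the SELECT above: one partition lookup, rows already in order."""
    return list(timeline.get(user_id, []))


insert("driftx", "71b46a84", "yzhang", "dolor")
insert("driftx", "3290f9da", "rbranson", "lorem")
insert("jbellis", "3290f9da", "rbranson", "lorem")

for row in select_by_user("driftx"):
    print(row)
```

The query touches a single partition and returns rows that are already sorted, so no global sort or cross-machine join is needed.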
QUERY-DRIVEN DATA MODEL
RDBMSs WERE SO NICE…
I DIDN'T HAVE TO DENORMALIZE
BUT LET'S NOT FORGET THAT…
THIRD NORMAL FORM DOESN'T SCALE!
HORRIBLE
▸ GLOBAL SORTS ARE EXPENSIVE
▸ CROSS-MACHINE JOINS ARE EVEN MORE EXPENSIVE
▸ UNHAPPY USERS!
DON'T PANIC!
DRUM ROLL…
AND NOW, OVER TO DUY HAI DOAN!

Becoming Friends with Cassandra