Cassandra Virtual Node talk
- 1. V is for vnodes
Patrick McFadin, Sr Solution Architect
DataStax
©2012 DataStax
1
Friday, February 15, 13
- 2. Agenda for today
• What is a node?
• How vnodes work
• Converting your cluster
• Benefits
- 3. Since the beginning...
Cassandra has had...
Clusters, which have...
Keyspaces, which have...
Column Families, which have...
- 4. Row Keys
Unique in a column family
Can be up to 64 KB in size
Can be sorted in the cluster
Byte Ordered Partitioner
OR...
Can be randomly placed in cluster
Random Partitioner
- 5. Row Keys
How do you...
• Create a random number?
• Make sure the number is big enough?
• Make it reproducible?
MD5 does the job
Input a Row Key -> MD5 -> Get a 128-bit number
- 6. Row Keys
Input: @PatrickMcFadin -> MD5 -> Get: 0xcfc2d0610aaa712a8c36711d08a2550a
Input: 8675309 -> MD5 -> Get: 0x6cc0d36686e6a433aa76f96773852d35
The number produced falls in the range between:
0 and 2^128 - 1... but Cassandra uses 0 to 2^127 - 1
2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456
...otherwise known as a HUGE number.
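A minimal Python sketch of the row-key-to-number step described above. The function name md5_token is made up for illustration, and the digest is read as a plain unsigned integer the way the slide presents it; the real partitioner may differ in detail.

```python
import hashlib


def md5_token(row_key: str) -> int:
    """Hash a row key with MD5 and read the 16-byte digest as an unsigned integer."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return int.from_bytes(digest, byteorder="big")


token = md5_token("@PatrickMcFadin")
print(hex(token))                 # 128-bit value, printed in hex
print(0 <= token <= 2**128 - 1)   # always inside the 128-bit range
```

The same input always yields the same number, which is what makes the placement reproducible.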
©2012 DataStax
6
Friday, February 15, 13
- 8. Token Assignment
• Each Cassandra node is assigned a token
• Each token is a number inside the huge range
• Tokens mark the ownership range of Row Keys
From: Token = 0
To: Token = 56713727820156410577229101238628035242
From:
To: Token = 113427455640312821154458202477256070484
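The token values on this slide look like an even split of the 0 to 2^127 range across three nodes. A small sketch of that arithmetic, assuming the step is the ring size divided by the node count and rounded down (the helper name evenly_spaced_tokens is hypothetical):

```python
from typing import List

RING_SIZE = 2**127  # size of the token range used on the slides


def evenly_spaced_tokens(node_count: int) -> List[int]:
    """Return one token per node, spaced evenly around the ring."""
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]


for t in evenly_spaced_tokens(3):
    print(t)
```

Every time the node count changes, every value in this list changes too, which is why manual token assignment gets painful.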
- 9. Row Key to Token
Input: @PatrickMcFadin -> MD5 -> Get: 276161727147663567581939045564154008842
Token = 0
Token = 56713727820156410577229101238628035242
Token = 113427455640312821154458202477256070484
"I’ll take it!"
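A rough sketch of how a key's token could be matched to a node. It assumes the usual consistent-hashing convention, where a node owns every token greater than the previous node's token and up to its own, wrapping around at the end of the ring. The node names are hypothetical; the tokens are the three from the slide.

```python
import bisect

# Tokens from the slide; the node names are made up for illustration.
RING = {
    0: "node A",
    56713727820156410577229101238628035242: "node B",
    113427455640312821154458202477256070484: "node C",
}


def owner(key_token: int, ring: dict) -> str:
    """Return the node whose token is the first one >= key_token, wrapping around."""
    tokens = sorted(ring)
    idx = bisect.bisect_left(tokens, key_token)
    return ring[tokens[idx % len(tokens)]]


print(owner(1, RING))  # a key token just above 0
```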
- 12. Cassandra 1.1 Node
• Responsible for a single range of keys
• Range determined by single token
• One server = One token = One node
Commodity node? What you really want.
- 18. Time for a new plan
• Hardware is only getting bigger
• One node is responsible for more data
• Token assignments are a pain
- 19. Token assignment (sucks)
• Tokens need to be evenly spread
• Growing a ring... no good options
• Shrinking a ring... no good options
• Tokens have to be added to each server config
- 20. Enter Virtual Nodes
• One server should have many nodes
• Each node should be small
• Tokens should be automatic
Version 1.1, Server 1: one node covering range 1-4
Version 1.2, Server 1: four virtual nodes 1, 2, 3, 4
- 21. Virtual Node Features
• Default: 256 virtual nodes per server
• Tokens assigned automatically
• Faster rebuilds of servers
• Faster addition of servers to the cluster
• New partitioner (More later)
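To get a feel for automatic token assignment, here is a toy simulation. It assumes each server simply draws 256 random tokens from the 0 to 2^127 range, which is a simplification rather than Cassandra's actual code path. The server names, the seed, and both helper functions are hypothetical.

```python
import random

RING_SIZE = 2**127
NUM_TOKENS = 256  # default number of virtual nodes per server


def assign_vnode_tokens(servers, num_tokens=NUM_TOKENS, seed=42):
    """Give every server num_tokens random tokens."""
    rng = random.Random(seed)
    return {s: [rng.randrange(RING_SIZE) for _ in range(num_tokens)] for s in servers}


def ownership(assignment):
    """Fraction of the ring owned by each server (a token owns the range ending at it)."""
    ring = sorted((t, s) for s, toks in assignment.items() for t in toks)
    owned = {s: 0 for s in assignment}
    prev = ring[-1][0] - RING_SIZE  # wrap around from the last token
    for token, server in ring:
        owned[server] += token - prev
        prev = token
    return {s: span / RING_SIZE for s, span in owned.items()}


assignment = assign_vnode_tokens(["server1", "server2", "server3", "server4"])
for server, share in ownership(assignment).items():
    print(server, round(share, 3))
```

With many small ranges per server, the shares tend to land close to one quarter each without anyone calculating a token by hand.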
- 22. Transitioning to vnodes
Super easy!
Find these lines in your cassandra.yaml file:
#num_tokens:
initial_token: <some big number>
Change to:
num_tokens: 256
initial_token:
and restart.
Repeat on every node in the cluster.
- 23. Transitioning to vnodes
After all Cassandra instances have been restarted
Initialize a shuffle operation
[patrick@cassandra0 ~]$ cassandra-shuffle create
Enable shuffling
[patrick@cassandra0 ~]$ cassandra-shuffle enable
List pending relocations*
[patrick@cassandra0 ~]$ cassandra-shuffle ls
Let’s walk through it...
*This is a slow op. Be patient.
- 24. Existing 1.1 cluster
Server 1: 1-4
Server 2: 5-8
Server 3: 9-12
Server 4: 13-16
- 25. Set num_tokens and restart
Server 1: 1-4, 1-4, 1-4, 1-4
Server 2: 5-8, 5-8, 5-8, 5-8
Server 3: 9-12, 9-12, 9-12, 9-12
Server 4: 13-16, 13-16, 13-16, 13-16
- 26. Set num_tokens and restart
Server 1 Server 2
Server 1: 1, 2, 3, 4
Server 2: 5, 6, 7, 8
Server 3: 9, 10, 11, 12
Server 4: 13, 14, 15, 16
Initialize and enable shuffling...
- 27. Shuffle enable
Server 1: 1, 5, 13, 9
Server 2: 2, 6, 14, 10
Server 3: 4, 8, 15, 11
Server 4: 3, 7, 16, 12
- 28. Shuffle complete
Server 1: 1, 5, 13, 9
Server 2: 2, 6, 14, 10
Server 3: 4, 8, 15, 11
Server 4: 3, 7, 16, 12
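The before and after pictures can be mimicked with a toy model. This is only an illustration of the idea (each server ends up with the same number of ranges, but scattered around the ring); it does not reproduce what cassandra-shuffle actually does, and the function name and seed are hypothetical.

```python
import random


def shuffle_ranges(before, seed=7):
    """Redistribute ranges so each server keeps its count but gets scattered ranges."""
    rng = random.Random(seed)
    all_ranges = [r for ranges in before.values() for r in ranges]
    rng.shuffle(all_ranges)
    per_server = len(all_ranges) // len(before)
    return {
        server: sorted(all_ranges[i * per_server:(i + 1) * per_server])
        for i, server in enumerate(before)
    }


# Layout right after setting num_tokens and restarting: contiguous ranges.
before = {
    "Server 1": [1, 2, 3, 4],
    "Server 2": [5, 6, 7, 8],
    "Server 3": [9, 10, 11, 12],
    "Server 4": [13, 14, 15, 16],
}
after = shuffle_ranges(before)
for server, ranges in after.items():
    print(server, ranges)
```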
- 29. Ops life with vnodes
• Add any number of nodes
• No token assignments!
• Bigger server? Larger num_tokens
• Decommission any number of nodes
• New nodetool command: status
One more time now!
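The "Bigger server? Larger num_tokens" point comes down to simple proportions. A sketch of the expected share of the ring per server, assuming tokens are spread uniformly at random so that share is proportional to num_tokens (server names and values are hypothetical):

```python
from fractions import Fraction


def expected_share(num_tokens_by_server):
    """Expected fraction of the ring per server, proportional to its num_tokens."""
    total = sum(num_tokens_by_server.values())
    return {s: Fraction(n, total) for s, n in num_tokens_by_server.items()}


cluster = {"small-1": 256, "small-2": 256, "big-1": 512}
for server, share in expected_share(cluster).items():
    print(server, share)
```

Doubling num_tokens on the larger machine roughly doubles the data it is expected to hold relative to the others.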
- 30. Bonus new thing
• New Partitioner: Murmur3Partitioner
• Murmur3 replaces MD5
• Slightly faster than MD5 in certain cases
• Go-forward partitioner for NEW clusters
• No need to convert existing clusters
More details here:
https://issues.apache.org/jira/browse/CASSANDRA-3772
- 31. In conclusion...
Go out and try some vnode love today!
Download Cassandra 1.2 now
http://www.datastax.com/download/community
http://cassandra.apache.org/download/
- 32. Some handy references
http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2
http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes
Follow me on Twitter for more: @PatrickMcFadin
- 33. We power the apps that transform business.