Database Sharding the Right Way: Easy, Reliable, and Open source - HighLoad++ 2012

The CUBRID team gave this presentation at the Russian HighLoad++ Conference in October 2012. It covers Big Data management through database sharding. The CUBRID open source RDBMS provides native support for sharding, with load balancing, connection pooling, and automatic failover.

  1. Database Sharding the Right Way: Easy, Reliable, and Open source.
  3. Growing in the Wild. The story by CUBRID Database Developers. View on Slideshare http://profyclub.ru/docs/439
  7. = Big Business Opportunity
  8. SQL: Enterprise; Vendor dependency; Scalability constraints; Common interface. NoSQL: Open Source; Scalable; Non-standard API.
  10. SQL Transactions NoSQL => NoACID Standard Interface Experts
  11. DBMS Market ($MM). Source: Gartner, 2012
                     2009      2010      2011      Growth
        Worldwide    21,359    23,252    26,701    11.8%
        Korea        349       395       478       17%
        Ratio        1.6%      1.7%      1.8%
      (Chart: Korea vs. Worldwide, 2009 to 2011, axis from 40% to 70%.)
  12. RDBMS is still the best choice for mission-critical data
  13. Database Sharding
  14. Name               | Type                        | Requirements: DB          | Requirements: ETC | Interface
      Hibernate shards   | AS framework                | DBMS w/ Hibernate support | Hibernate, JVM    | Java
      dbShards           | AS & Middleware             | MySQL                     |                   | Java, C
      Gizzard (Twitter)  | Middleware                  | Any storage               | JVM               | Java
      Spider for MySQL   | Middleware & Storage Engine | MySQL                     |                   | Any
      CUBRID SHARD       | Middleware                  | CUBRID, MySQL, Oracle     |                   | Any
  17. Is there such an RDBMS?
  18. CUBRID 9.0
  21. Easy Installation
  22. http://www.cubrid.org/downloads
  24. SHARD_KEY_MODULAR = 256
      SHARD_KEY_LIBRARY_NAME = ''
      SHARD_KEY_FUNCTION_NAME = ''
  25. id user_id = order_no …
  26. int user_get_shard_key(int type, void *val)
      {
          int mod = 2;  /* number of buckets the key is hashed into */

          if (val == NULL) {
              return ERROR_ON_ARGUMENT;
          }

          switch (type) {
          case SHARD_U_TYPE_INT:
              /* integer keys: bucket is the value modulo mod */
              return *(int *) val % mod;
          case SHARD_U_TYPE_STRING:
              /* string keys are not handled by this function */
              return ERROR_ON_MAKE_SHARD_KEY;
          default:
              return ERROR_ON_ARGUMENT;
          }
      }
  27. Configuring CUBRID SHARD is very easy!
  28. $> cubrid createdb shard1
      $> csql -S -u dba shard1 -c "create user shard password 'shard123'"
      $> cubrid server start shard1
  29. $> csql -C -u shard -p 'shard123' shard1@localhost -c "CREATE TABLE users (id BIGINT PRIMARY KEY, name VARCHAR(20), age SMALLINT)"
  30. $> cubrid shard start
      @ cubrid shard start
      ++ cubrid shard start: success
  31. connectionURL = "jdbc:cubrid:localhost:45511:shard1:shard:shard123:";
  32. String query = "SELECT name FROM student WHERE student_no = /*+ shard_key */ ?; ";
      PreparedStatement query_stmt = connection.prepareStatement(query);
      query_stmt.setInt(1, 100);
      ResultSet rs = query_stmt.executeQuery();
      // fetch resultset

      key_column    range (hash result)    shard_id
                    min      max
      student_no    0        63            0
      student_no    64       127           1
      student_no    128      191           2
      student_no    192      255           3
  33. SELECT name FROM student WHERE student_no = /*+ shard_key */ ?;
  34. How did we tackle the unique ID problem?
  36. CUBRID SHARD Performance
  37. Description                                       | Quantity | OS (64bit) / CPU / MEM
      Agent to generate load and NDrive App Simulator   | 8        | CentOS 5.3 / Xeon 2G-8core / 8G
      CUBRID Shard                                      | 1        | CentOS 5.3 / Xeon 2.27G-16core / 24G
      CUBRID Broker                                     | 1        | CentOS 5.3 / Xeon 2.27G-16core / 24G
      Meta DB                                           | 4        | CentOS 5.x / Xeon 2.33G-4core / 8G
      User DB                                           | 1        | CentOS 5.3 / Xeon 2.5G-8core / 8G
  38. Charts: "Load Generator Performance" (RPS vs. number of concurrent users, 32 to 512) and "Performance trend when load is increased" (proxy CPU, RPS, meta DB TPS, and mean time in ms, at 64 to 320 users).
  39. - Performance is similar up to 128 Vusers.
      - When SHARD is not used, 128 Vusers is the maximum.
      - With SHARD, as the number of Vusers increases, maximum performance is achieved along with shorter response times and lower CPU utilization.
  40. TPC-C Performance Test
  41. - AWS Xlarge instance: 7GB RAM, 20 EC2 units
      - Ubuntu 12.04 64-bit
      - CUBRID 9.0 (beta), no sharding
      - MySQL 5.5.28
      - Buffer: 2.8GB data_buffer_size, 2.8GB innodb_buffer_pool_size
      - Default configurations
  42. Chart: TPC-C Index, MySQL 5.5.28 vs. CUBRID 9.0 (bar values 44.18 and 42.66).
  45. What’s next for CUBRID?
  47. www.cubrid.org. Esen Sagynov, CUBRID Project Manager, esen@cubrid.org. CUBRID Q&A: www.cubrid.org/questions
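
The SHARD_KEY_MODULAR = 256 setting on slide 24 and the range table on slide 32 suggest a two-step routing: hash the shard key into a bucket between 0 and 255, then look the bucket up in a range table to find the shard. The following is a minimal C sketch of that idea, not CUBRID's actual implementation. It assumes the hash is a plain modulo (the slides do not say what the default hash is), and the names `shard_range`, `student_no_ranges`, `hash_shard_key`, and `find_shard_id` are hypothetical.

```c
#include <stddef.h>

#define SHARD_KEY_MODULAR 256  /* value shown on slide 24 */

/* One row of the range table from slide 32 (struct name is hypothetical). */
struct shard_range {
    int min;       /* lowest hash result in the range, inclusive */
    int max;       /* highest hash result in the range, inclusive */
    int shard_id;  /* shard that holds keys hashing into [min, max] */
};

/* The four student_no ranges listed on slide 32. */
static const struct shard_range student_no_ranges[] = {
    {   0,  63, 0 },
    {  64, 127, 1 },
    { 128, 191, 2 },
    { 192, 255, 3 },
};

/* Assumption: hash result = key modulo SHARD_KEY_MODULAR.
 * Negative keys are shifted up so the result stays in 0..255. */
int hash_shard_key(int key)
{
    int h = key % SHARD_KEY_MODULAR;
    return h < 0 ? h + SHARD_KEY_MODULAR : h;
}

/* Returns the shard id for a key, or -1 if no range covers its hash. */
int find_shard_id(int key)
{
    int h = hash_shard_key(key);
    size_t n = sizeof student_no_ranges / sizeof student_no_ranges[0];

    for (size_t i = 0; i < n; i++) {
        if (h >= student_no_ranges[i].min && h <= student_no_ranges[i].max) {
            return student_no_ranges[i].shard_id;
        }
    }
    return -1;
}
```

Under these assumptions, the value 100 bound in the slide 32 query hashes to 100, which falls in the 64 to 127 range, so `find_shard_id(100)` returns shard 1. Keeping the bucket count (256) larger than the shard count (4) means shards can be rebalanced by editing the range table rather than changing the hash.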

Editor's notes

  • Self introduction.
  • CUBRID is a fully featured Relational Database Management System. CUBRID is not a typical open source project backed by a community; it is backed by the largest IT corporation in South Korea.
  • Today I want to talk about the importance of relational database systems.
  • Nice NoSQL vs. RDBMS discussion on one of the Russian forums http://it-talk.org/post80487.html#p80487
  • In South Korea, enterprise business is even more dependent on the Oracle database.
  • If you ask companies that operate mission-critical services, they will tell you that: 1) a relational database system is still the best choice for mission-critical data; 2) service availability is more important than performance; 3) high performance is good, but predictable performance is king. The fellows at the Box.com cloud storage platform also say they choose an RDBMS for mission-critical data.
  • We’ve developed Database Sharding in CUBRID! The difference between partitioning and sharding is that with partitioning you divide the data between multiple tables with identical schemas within one database, whereas with sharding you divide the data between tables located in different databases. Sometimes the database gets so big that mere table partitioning is not enough; in fact, it will hinder the performance of the entire system. So we’d better add new databases, otherwise called shards. If HA is for READ distribution, sharding is for WRITE distribution, as you can write to different databases simultaneously. This is a feature most developers dream of having on the database side rather than in the application layer. Database sharding doesn’t just simplify developers’ lives, it also improves both application and database performance: the application gets rid of the sharding logic, and the database reduces its index size. Win-win!
  • Talking about open source RDBMS solutions, MySQL doesn’t provide database sharding out of the box. Google had to significantly change MySQL replication to make it work similarly. But at the time Sun, the former owner of MySQL, didn’t accept Google’s changes, resulting in a fork from the mainstream without mainstream support. Twitter has recently opened their MySQL fork. http://www.oracle.com/technetwork/database/features/availability/300461-132370.pdf
  • SHARD_KEY_MODULAR = 256; SHARD_KEY_LIBRARY_NAME = string; SHARD_KEY_FUNCTION_NAME = string
  • No additional SQL parsing because of HINT.
  • Eugen: When I started thinking about this presentation, this is the outcome that I wanted from it. For the experienced guys in the audience, these are the thoughts I want you to have at the end of this presentation. I want you to think that: some guys talked about some cool stuff they encountered in applications (don't remember what); there's a database that they use for this type of application, it's open source and saves a lot of trouble (don't remember what trouble exactly); they're really keen on doing things right. This is what I remember from every presentation that I’ve attended, not the details. So I don’t expect you to remember the technical details. What I want is for you to grasp the concept of what we will talk about.
