Abhishek Kumar - CloudStack Locking Service

Because CloudStack cannot work with any MySQL clustering solution, it is time to explore a new locking service, manager and pluggable interface that would allow the CloudStack DB to be HA-enabled with multi-master read/write. The talk will focus on: the need for a locking service and the challenges with the existing CloudStack architecture; the different clustering solutions that can be adopted; a PoC for a future implementation with minimal changes to the existing architecture, using Percona XtraDB or any other clustering solution; and the idea of getting rid of the mshost table and using the locking service to find out about other management servers.

Published in: Technology

  1. 1. CloudStack Locking Service Abhishek Kumar Software Developer, ShapeBlue abhishek.kumar@shapeblue.com
  2. 2. About me  Software Developer at ShapeBlue  From Meerut, India  Previously developed applications for desktops and mobile  Worked on CloudStack features: domain- and zone-specific offerings, VM ingestion, container service plugin  Love going to the gym, watching action-thriller movies and discussing politics
  3. 3. Objective  New locking service, manager and pluggable interface with ZooKeeper (using the Curator framework), Hazelcast or other distributed lock managers.  Outcome: the CloudStack DB can be HA-enabled with multi-master read/write, using a clustering solution.  Peer discovery
  4. 4. Why?  CloudStack can control 100s of hosts with 1000s of virtual machines  Can support multiple management servers  But not so for the database!  Limited support for replication and high availability; cannot use multi-master replication  Implementing active-active or active-passive configurations becomes difficult  Database clustering is not possible
  5. 5. Topics  Locking  Introduction  Database locking  Locking in CloudStack and its limitations  Distributed locks  Introduction  Different Distributed Lock Managers  Overview of Apache ZooKeeper  Overview of Hazelcast  Demo  Implementation of new locking service, pluggable interface with Apache ZooKeeper (Curator), Hazelcast  Comparison, current limitations, future work  Q & A
  6. 6. Lock  A lock or mutex is a synchronization mechanism for enforcing limits on access to a resource in an environment where there are many threads of execution. A lock is designed to enforce a mutual exclusion concurrency control policy  Locks – usually threads of the same process, Mutex – threads from different processes  Can be advisory or mandatory  Granularity - a measure of the amount of data the lock is protecting: fine for smaller, specific data and coarse for larger data  Issues –  Overhead  Contention  Deadlock
  7. 7. Database locks Ensuring transaction synchronicity  Mainly two types,  Pessimistic – Record is locked until the lock is released  Optimistic – System keeps copy of initial read and later verifies data on release accepting or rejecting update Wikipedia uses optimistic locking for document editing  Different granularity  Database level  File level  Table level  Page or block level  Row level  Column level
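The difference between the two styles can be sketched in plain Java. This is only an illustration under stated assumptions: `VersionedRecord` and its methods are hypothetical names, and an in-memory object with a version counter stands in for a database row.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.UnaryOperator;

// Hypothetical in-memory stand-in for a database row with a version column.
final class VersionedRecord {
    private final ReentrantLock lock = new ReentrantLock();
    private String value;
    private long version;

    VersionedRecord(String value) {
        this.value = value;
    }

    // Pessimistic style: the record stays locked for the whole
    // read-modify-write, so concurrent writers wait their turn.
    void pessimisticUpdate(UnaryOperator<String> change) {
        lock.lock();
        try {
            value = change.apply(value);
            version++;
        } finally {
            lock.unlock();
        }
    }

    // Optimistic style: the caller works on its own copy without holding a lock.
    // The write is accepted only if the version is still the one that was read.
    boolean optimisticUpdate(long versionAtRead, String newValue) {
        lock.lock(); // short critical section for the check-and-write only
        try {
            if (version != versionAtRead) {
                return false; // someone else updated the record in between
            }
            value = newValue;
            version++;
            return true;
        } finally {
            lock.unlock();
        }
    }

    long version() {
        lock.lock();
        try {
            return version;
        } finally {
            lock.unlock();
        }
    }

    String value() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```

A caller using the optimistic path reads `version()` first, prepares the new value, and retries or reports a conflict when `optimisticUpdate` returns `false`.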
  8. 8. DB Locks Issues – Lock contention  Many sessions require frequent access to the same lock for a short amount of time, resulting in a “single lane bridge”. Example: deploying 100s of VMs simultaneously
  9. 9. DB Locks Issues – Long Term Blocking  Many sessions require frequent access to the same lock for a long period of time, resulting in blocking of all dependent sessions
  10. 10. DB Locks Issues – Database Deadlocks  Occurs when two or more transactions hold dependent locks and none can continue until another releases its lock
  11. 11. DB Locks Issues – contd. Other issues,  Overhead  Difficult to debug  Priority inversion  Convoying
  12. 12. Locking in CloudStack  Uses MySQL lock functions to acquire and release locks on database connections  A hashmap is kept for all the acquired locks and their connection in the code  Fast and effective as locking is taking place in database itself.
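A rough sketch of this pattern with plain JDBC is shown below. `GET_LOCK()` and `RELEASE_LOCK()` are the MySQL functions this design relies on; the class `MySqlNamedLocks` and its methods are hypothetical names chosen for illustration and are not CloudStack's actual code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not CloudStack's actual class.
final class MySqlNamedLocks {
    // Lock name -> connection that holds it. MySQL ties a named lock to the
    // session that acquired it, so the release must use the same connection.
    private final Map<String, Connection> held = new ConcurrentHashMap<>();

    // Returns true if the lock was obtained within timeoutSeconds.
    boolean acquire(Connection conn, String name, int timeoutSeconds) {
        // GET_LOCK returns 1 on success, 0 on timeout and NULL on error.
        if (queryReturnsOne(conn, "SELECT GET_LOCK(?, ?)", name, timeoutSeconds)) {
            held.put(name, conn);
            return true;
        }
        return false;
    }

    // Returns true if a lock recorded under this name was released.
    boolean release(String name) {
        Connection conn = held.remove(name);
        return conn != null && queryReturnsOne(conn, "SELECT RELEASE_LOCK(?)", name, null);
    }

    boolean isHeld(String name) {
        return held.containsKey(name);
    }

    private static boolean queryReturnsOne(Connection conn, String sql, String name, Integer timeout) {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, name);
            if (timeout != null) {
                ps.setInt(2, timeout);
            }
            try (ResultSet rs = ps.executeQuery()) {
                // getInt maps SQL NULL to 0, so an error result counts as "not acquired".
                return rs.next() && rs.getInt(1) == 1;
            }
        } catch (SQLException e) {
            throw new IllegalStateException("Lock query failed: " + sql, e);
        }
    }
}
```

Because the lock lives in the MySQL session, this approach depends on every management server talking to the same database server.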
  13. 13. Locking in CloudStack – contd. Limitations with current design:  Cannot work with MySQL clustering solutions, because the locking functions GET_LOCK() and RELEASE_LOCK() are not supported by clustering solutions like Percona XtraDB, https://www.percona.com/doc/percona-xtradb-cluster/LATEST/limitation.html  An HA-enabled, multi-master DB cannot be implemented  A solution could be implementing distributed locks using available distributed locking services
  14. 14. Distributed Locks  Synchronize accesses to shared resources for the applications distributed across a cluster on multiple machines  Coordination between different nodes  Ensure only one server can write to a database or write to a file.  Ensure that only one server can perform a particular action.  Ensure that there is a single master that processes all writes
  15. 15. Distributed Locking - Implementation  More complex than conventional OS or relational DB locking because more variables are present: the network and different nodes, any of which could fail at any time  Different algorithms – Redlock (Redis), Paxos, etc.  Implementation of a Distributed Lock Manager (DLM)  Different types of lock a DLM can grant: Null, Concurrent Read, Concurrent Write, Protected Read, Protected Write, Exclusive
  16. 16. Distributed Locking - Implementation Null (NL) Concurrent Read (CR) Concurrent Write (CW) Protected Read (PR) Protected Write (PW) Exclusive (EX)
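These six modes are usually described together with a compatibility table that says which modes may be held on the same resource at the same time. The transcript keeps only the mode names, so the table encoded below is an addition: it is the one commonly documented for VMS-style DLMs, and the enum and method names are illustrative.

```java
import java.util.EnumSet;

// Lock modes of a VMS-style distributed lock manager.
enum LockMode {
    NL, CR, CW, PR, PW, EX;

    // Modes that other holders may have while this mode is granted.
    EnumSet<LockMode> compatibleModes() {
        switch (this) {
            case NL: return EnumSet.allOf(LockMode.class);
            case CR: return EnumSet.of(NL, CR, CW, PR, PW);
            case CW: return EnumSet.of(NL, CR, CW);
            case PR: return EnumSet.of(NL, CR, PR);
            case PW: return EnumSet.of(NL, CR);
            default: return EnumSet.of(NL); // EX excludes everything except NL
        }
    }

    // True if a request for this mode can be granted while 'held' is in place.
    boolean isCompatibleWith(LockMode held) {
        return compatibleModes().contains(held);
    }
}
```

For example, two Protected Read holders can coexist, while a Protected Write request must wait for an existing Protected Read to be released.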
  17. 17. Distributed Locking Manager  Apache ZooKeeper – high-performance coordination service for distributed systems, can be used for distributed locks  Redis - advanced key-value cache and store, can be used to implement the Redlock algorithm for distributed lock management.  Hazelcast - distributed in-memory data grid platform for Java  Chubby - lock service for loosely coupled distributed systems developed by Google  Etcd, Consul
  18. 18. Apache ZooKeeper  An open source, high-performance coordination service for distributed applications.  Exposes common services in simple interface:  naming  configuration management  locks & synchronization  group services … developers don't have to write them from scratch  Build your own on it for specific needs.  Apache Curator – Java client library
  19. 19. Apache ZooKeeper contd. • The ZooKeeper service is replicated over a set of machines • All machines store a copy of the data (in memory) • A leader is elected on service startup • Each client connects to a single ZooKeeper server and maintains a TCP connection • Clients can read from any ZooKeeper server; writes go through the leader and need majority consensus.
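The majority requirement has a direct consequence for sizing the ensemble. The arithmetic can be sketched as follows; the class and method names are illustrative only.

```java
// Majority arithmetic for a replicated ensemble of servers.
final class Quorum {
    private Quorum() {
    }

    // Smallest number of servers that forms a strict majority.
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // Number of servers that can fail while a majority is still reachable.
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }
}
```

An ensemble of 3 and an ensemble of 4 both tolerate a single failure, which is why odd sizes such as 3 or 5 are the usual choice.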
  20. 20. Apache ZooKeeper Implementation  Used here through the Curator framework. Different implementation recipes are available: https://github.com/apache/zookeeper/tree/master/zookeeper-recipes
       Start an embedded server and create a client to connect to it:
           // Embedded ZooKeeper server keeping its data under tempDirectory
           File dir = new File(tempDirectory, "zookeeper").getAbsoluteFile();
           zooKeeperServer = new ZooKeeperServer(dir, dir, tickTime);
           serverFactory = new NIOServerCnxnFactory();
           serverFactory.configure(new InetSocketAddress(clientPort), numConnections);
           serverFactory.startup(zooKeeperServer);
           …
           // Curator client that retries with exponential backoff (1000 ms base sleep, 3 retries)
           RetryPolicy retryPolicy = new ExponentialBackoffRetry(1000, 3);
           curatorClient = CuratorFrameworkFactory.newClient(String.format("", clientPort), retryPolicy);
           curatorClient.start();
       Locks can be acquired and released for a given name:
           InterProcessMutex lock = new InterProcessMutex(curatorClient, String.format("%s%s", tempDirectory, name));
           lock.acquire(timeoutSeconds, TimeUnit.SECONDS);
           …
           lock.release();
  21. 21. Hazelcast  The Hazelcast IMDG operational in-memory computing platform helps leading companies worldwide manage their data and distribute processing using in-memory storage and parallel execution for breakthrough application speed and scale.  Hazelcast implements distributed versions of some Java data structures such as Map, Set, List, Queue and Lock  ILock is the distributed implementation of java.util.concurrent.locks.Lock.
  22. 22. Hazelcast - Implementation
       Define the config, set the CP Subsystem member count, create HazelcastInstance objects:
           Config config = new Config();
           CPSubsystemConfig cpSubsystemConfig = config.getCPSubsystemConfig();
           // Number of members that take part in the CP Subsystem
           cpSubsystemConfig.setCPMemberCount(3);
           hazelcastInstance = Hazelcast.newHazelcastInstance(config);
           ...
       Locks can be acquired and released:
           FencedLock lock = hazelcastInstance.getCPSubsystem().getLock(name);
           lock.tryLock(timeoutSeconds, TimeUnit.SECONDS);
           ...
           lock.unlock();
  23. 23. Locking Service in CloudStack  Pluggable service implementation using existing distributed lock managers for different locking service plugins  Global setting to control the locking service, db.locking.service.plugin  Current implementation using Apache ZooKeeper and Hazelcast
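One way such a pluggable design could look is sketched below. Everything here is an assumption made for illustration: the `LockingService` interface, the `LockingServiceManager` class and the in-memory plugin are hypothetical and do not reflect the actual CloudStack implementation. Only the setting name `db.locking.service.plugin` comes from the slide. A real plugin would delegate to ZooKeeper (through Curator) or Hazelcast instead of JVM-local locks.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical plugin contract; names are illustrative, not CloudStack's API.
interface LockingService {
    String name();

    boolean lock(String lockName, int timeoutSeconds);

    boolean unlock(String lockName);
}

// Stand-in plugin backed by JVM-local locks, useful only within one process.
final class InMemoryLockingService implements LockingService {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    public String name() {
        return "in-memory";
    }

    public boolean lock(String lockName, int timeoutSeconds) {
        ReentrantLock lock = locks.computeIfAbsent(lockName, k -> new ReentrantLock());
        try {
            return lock.tryLock(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public boolean unlock(String lockName) {
        ReentrantLock lock = locks.get(lockName);
        if (lock == null || !lock.isHeldByCurrentThread()) {
            return false;
        }
        lock.unlock();
        return true;
    }
}

// Picks the active plugin from the value of a setting such as db.locking.service.plugin.
final class LockingServiceManager {
    private final Map<String, LockingService> plugins = new ConcurrentHashMap<>();

    void register(LockingService plugin) {
        plugins.put(plugin.name(), plugin);
    }

    LockingService resolve(String settingValue) {
        LockingService plugin = plugins.get(settingValue);
        if (plugin == null) {
            throw new IllegalArgumentException("Unknown locking plugin: " + settingValue);
        }
        return plugin;
    }
}
```

Callers depend only on the interface, so switching from one lock manager to another becomes a configuration change rather than a code change.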
  24. 24. Demo
  25. 25. Why generic framework design  Choice  Easier to develop  Performance difference
  26. 26. Locking Service in CloudStack - Issues  Apart from the traditional issues with a locking service, speed will be a major issue compared to the existing database locking in CloudStack. Since locking will be managed by a server, it will create additional overhead.  [Chart: Lock acquire performance during VM deployment. X-axis: locks (Lock 1 to Lock 15); y-axis: time in milliseconds (0 to 12); series: Current DB Locking, ZooKeeper, Hazelcast]
  27. 27. Future work  Current state – basic implementation with Hazelcast and ZooKeeper  Testing with database clustering  Optimization for better performance  Implement peer discovery to get rid of the mshost table and use the locking service for discovering the other management server nodes.  Code cleanup and start PR  Target 4.15 (if not 4.14)
  28. 28. Thank You! Thoughts and Questions