Rapid Scalability of Complex and Dynamic Web-Based Systems: Challenges and Recent Approaches to Mitigation

Mani Malarvannan, Cybelink, www.cybelink.com, manim@ieee.org
S. Ramaswamy, Department of Computer Science, University of Arkansas at Little Rock, AR 72204, srini@ieee.org

Accepted for publication in the 5th IEEE International Conference on System of Systems Engineering.


Abstract
In this paper, we summarize and outline some of the big challenges in addressing the performance, scalability, and availability of applications with large and complex backend database systems. We present some of the limitations imposed by the CAP theorem on distributed systems development and identify some approaches that can be used to effectively address these limitations. We propose the adoption of an effective System-of-Systems approach built on principles supported by a loosely coupled, SPU-based architecture, which can support incremental development and rapid scalability for such complex and dynamic application needs.

1. Introduction

The rapid evolution of the Internet has created technology companies such as Google, Amazon, Facebook, and Twitter that are unlike traditional multinational companies. These web-focused companies share one major business requirement that is unique: they need to manage several petabytes of data to run their daily operations efficiently. Their existence depends upon the performance, scalability, and availability of their large and complex backend database systems. Traditional relational database design techniques cannot help these companies manage such massive amounts of data. Furthermore, such voluminous data cannot be stored and managed using a single server, or even a cluster of servers.

The Internet research firm IDC estimates that the digital universe holds 281 exabytes of data and will grow to 1.8 zettabytes by 2011 [1], with most of the exponential growth coming not from traditional computing and communication equipment but from mobile and embedded devices. For example, Google is estimated to currently process 20 petabytes [2] of data every day, and this figure is growing rapidly. Amazon, with its cloud computing solutions, stores petabytes of customer data in its systems. Twitter receives more than 50 million tweets per day. For efficient management and service, these companies require thousands of servers located all over the world to provide access redundancy, so that their customers always experience efficient service performance. Managing such vastly distributed data across thousands of servers is not a simple task; it requires effective system-of-systems engineering principles to identify and fix the issues that arise dynamically in real time. Of course, while most of the data will sit idle on disks and will only be retrieved when requested, the efficiency of retrieval when it is needed is of critical importance. Moreover, there are still several petabytes of data, such as email, blogs, web searches, and online business transactions, that are read and regularly updated by the Internet community.

Web companies receive thousands of concurrent requests to either read or update their data. They cannot store such data in small server farms and still serve their customers' needs effectively. They need to store their data across many distributed servers that are often geographically separated. For example, if Amazon gets a request for a book from a host in the US, the data is read from one of its US data centers and delivered to the customer's browser. If another request for the same book comes from a customer in London, it is served from Amazon's London data center. Suppose both customers add the book to their shopping carts and successfully place their orders. Amazon must now reduce the available quantity of the book by two and replicate this update to all of its servers. How can Amazon do this seamlessly, without any embarrassing, customer-visible failures?

One of the hallmarks of such web-based applications is the ability to handle request spikes that arrive nondeterministically from all over the world. Any time a major event (such as the Haiti earthquake) happens, the number of tweets to Twitter spikes drastically, by several millions. During these unexpected events, users following other users send a flurry of requests to find out what those users are tweeting about. If one imagines a user who follows thousands of other users trying to find the tweets they have posted, the issue becomes: how can one manage these sudden upsurges in web traffic?


Among several popular Google applications is Google Analytics [3, 4], which serves a very important business need, especially for small businesses. This application embeds a small code snippet in web sites, allowing companies to monitor and analyze the web traffic on their sites. It is a free service offered by Google, and millions of web sites use it to perform real-time analysis of web site traffic. Such a strong dependency by businesses implies that Google must provide these services consistently.

Once data is distributed and stored across multiple servers, companies need to worry about answering all the questions raised above in a cost-effective and efficient manner. In such a complex system of widely distributed server farms, single or multiple server failures must be assumed to be a daily occurrence, and companies need to identify and fix such failures as they occur without causing any service disruption. This has become a huge issue for cloud computing environments, where companies manage data for other companies subject to honoring specific service-level agreements (SLAs). Traditional processes, tools, and technologies cannot help companies manage data stored over such a vast array of servers. They need to adopt system-of-systems engineering principles, specifically with respect to dealing with distributed databases, in order to achieve their goals.

The rest of the paper is organized as follows: Section 2 briefly presents RDBMS, their properties, various efforts to improve RDBMS for rapid scalability, and some of their limitations. Section 3 presents the CAP theorem, the BASE properties, and their importance to the development of applications that experience rapid, exponential-scale upswings in load and user-driven activity. Section 4 presents the proposed SPU-based system-of-systems approach, and Section 5 concludes the paper.

2. RDBMS and ACID Properties

RDBMS have been the de-facto standard for managing data for several years now. Based on the ACID properties (Atomicity, Consistency, Isolation, Durability), an RDBMS offers all the benefits needed to manage data on a single server or in a rack of servers. But RDBMS have failed to offer the benefits needed by web companies, for the following reasons:
(i) An RDBMS is built on a B-tree index that works well as long as the index can be stored in memory. Any time a request comes in, the in-memory index is used to find the location on the hard disk at which to read or update the data; by avoiding disk seeks, the RDBMS delivers good performance to its users. But web-based companies need to maintain large volumes of data, and these indexes often cannot fit in memory. The RDBMS then needs to perform disk seeks to find the data, an expensive operation that degrades the scalability of the application.
(ii) The RDBMS vendors' solution to the above problem is to buy bigger servers with more RAM capacity, termed vertical scaling. Though this may work in some instances, it cannot completely absorb the huge user spikes experienced by web companies, and it ultimately becomes a performance bottleneck when supporting simultaneous requests from a large number of web customers.

While relational database systems are a good fit for applications that require ACID properties, they are not appropriate for applications that undergo rapid surges in user activity. For example, in a trading application it is a fundamental requirement to inform the customer whether a bid was successfully placed or not, so an RDBMS may suffice: any time a transaction fails due to the failure of one database instance, the entire transaction is abandoned, and the data updated at one instance is either available at all instances or at none. For web-based applications, however, ACID properties are too restrictive to satisfy the associated business needs under surges in user activity; hence a different set of properties must be established to achieve scalability and performance.

2.1. RDBMS Replication
RDBMS have long used data replication techniques to achieve performance and scalability. Several types of replication are available, and each has its own set of limitations for web-based applications. Different relational databases support different flavors of replication: master-slave, multi-master, synchronous, and asynchronous. For read-intensive applications replication gives good scalability and performance, but for write-intensive applications it cannot provide the desired scalability. We examine each of these in more detail in the sequel.

2.1.1. Master-Slave Replication: In this configuration, one master manages several slaves; reads can happen from any slave, but all writes must go through the single master. This restriction degrades scalability and performance, and it additionally creates a single point of failure for all write operations.
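
To make the read/write split concrete, the following sketch (ours, not taken from any particular RDBMS; the class and method names are illustrative) shows how a client-side router might direct all writes to the single master while spreading reads across the slaves, which is also why the master becomes both a write bottleneck and a single point of failure:

import itertools

class MasterSlaveRouter:
    """Routes writes to the single master and balances reads across slaves."""

    def __init__(self, master, slaves):
        self.master = master                        # only node that accepts writes
        self._read_cycle = itertools.cycle(slaves)  # round-robin over read replicas

    def execute(self, sql, params=()):
        if sql.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE")):
            # Every write funnels through one node: a scalability limit and
            # a single point of failure for write traffic.
            return self.master.execute(sql, params)
        # Reads can be served by any slave, so read throughput scales roughly
        # with the number of replicas (modulo replication lag).
        return next(self._read_cycle).execute(sql, params)

class FakeConnection:
    def __init__(self, name): self.name = name
    def execute(self, sql, params=()): return f"{self.name} ran: {sql}"

router = MasterSlaveRouter(FakeConnection("master"),
                           [FakeConnection("slave1"), FakeConnection("slave2")])
print(router.execute("SELECT * FROM books WHERE id = ?", (42,)))              # served by a slave
print(router.execute("UPDATE books SET qty = qty - 1 WHERE id = ?", (42,)))   # forced to the master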

2.1.2. Multi-Master Replication: In this configuration, more than one master manages several slaves. Though write operations may scale, this approach leads to manual conflict resolution, which is error-prone and becomes increasingly problematic as the number of masters increases.

2.1.3. Synchronous Multi-Master Replication: RDBMS vendors have long used two-phase commit transactions to guarantee ACID properties and to support synchronous replication. In the two-phase commit protocol, a transaction manager first asks each participating database instance whether it can commit the transaction. If all the database instances agree, the transaction manager asks each instance to commit the transaction; if any instance vetoes the commit, all instances are asked to roll back their transactions. For web-based applications this cannot work, since they keep their data on many servers that can go down without warning, causing write failures and degrading the availability of the application.
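
A minimal sketch of the two-phase commit flow described above (illustrative only; real transaction managers add write-ahead logging, timeouts, and recovery, and the participant class here is a hypothetical stand-in):

def two_phase_commit(participants, transaction):
    """Phase 1: ask every instance to prepare; Phase 2: commit only if all agreed."""
    prepared = []
    for db in participants:
        if db.prepare(transaction):          # "can you commit?"
            prepared.append(db)
        else:
            # Any veto (or an unreachable server) aborts the whole transaction.
            for p in prepared:
                p.rollback(transaction)
            return False
    for db in participants:                  # all agreed: commit everywhere
        db.commit(transaction)
    return True

class Participant:
    def __init__(self, name, healthy=True): self.name, self.healthy = name, healthy
    def prepare(self, txn):  return self.healthy    # vote yes only if reachable
    def commit(self, txn):   print(f"{self.name}: committed {txn}")
    def rollback(self, txn): print(f"{self.name}: rolled back {txn}")

# One unreachable replica is enough to abort the write everywhere, which is why
# synchronous multi-master replication hurts availability at web scale.
ok = two_phase_commit([Participant("us-east"), Participant("eu-west", healthy=False)],
                      "decrement book qty by 2")
print("committed" if ok else "aborted")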

2.1.4. Asynchronous Multi-Master Replication: In asynchronous replication, when more than one master receives a write operation for the same data, both writes succeed. In this case the database has to resolve the resulting conflicts based on timestamps on the data. Current relational databases do not support this feature out of the box, and even where some support exists, it is not automatic: manual intervention is needed to resolve the conflicts.
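
The timestamp-based reconciliation mentioned above can be sketched as a last-writer-wins merge (a simplification we introduce for illustration; it silently discards the losing update and is sensitive to clock skew, which is exactly why manual resolution is often required in practice):

from dataclasses import dataclass

@dataclass
class Version:
    value: str
    timestamp: float   # wall-clock time at the master that accepted the write

def resolve_conflict(a: Version, b: Version) -> Version:
    """Last-writer-wins: keep the version with the newer timestamp.
    The losing write is silently dropped, and clock skew between masters
    can pick the 'wrong' winner."""
    return a if a.timestamp >= b.timestamp else b

print(resolve_conflict(Version("qty=9", 1000.0), Version("qty=8", 1000.5)))   # qty=8 wins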

2.2. RDBMS and Sharding
Database sharding is a process in which the data is partitioned either horizontally or vertically and kept on different servers to increase performance and scalability. Although this concept has been around for a while, it was popularized by Google engineers for improving the throughput and overall performance of high-transaction, large database-centric business applications. Database vendors support two types of partitions, vertical and horizontal: in a vertical partition a table is split by columns and kept on different servers, while in a horizontal partition different rows are kept on different database servers. When a read or write operation comes from a client, it is routed to the server where the data resides and executed there.

The main problem with relational database partitioning is performing join queries. Without partitioning, all join queries execute on the same server; once the data is sharded across multiple servers, it is not feasible to perform join queries that span multiple servers in a reasonable time. For example, if we partition a Twitter-like web site in a relational database using horizontal partitions, one approach is to store different sets of users and their tweets on different servers. When a request comes to display all the tweets of a particular user, it is easy to read the data from one server and send it to the client. Twitter, however, allows a user to follow other users, and by following other users you can see their tweets. If a request now comes from a user to see all the tweets posted by the users they follow, we have to perform a join across all the servers on which the followed users' data resides.

Figure 1. Twitter information flows: a user follows other users and is followed by them, and tweets must be gathered along these relationships.
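
The routing, and the reason cross-shard joins hurt, can be illustrated with the toy sketch below (the schema, shard count, and function names are our assumptions): hashing a user id to a shard makes single-user reads cheap, but a "tweets of everyone I follow" query degenerates into a scatter-gather across many shards.

NUM_SHARDS = 4
shards = {i: {"tweets": {}} for i in range(NUM_SHARDS)}   # in-memory stand-ins for servers

def shard_for(user_id: int) -> dict:
    return shards[hash(user_id) % NUM_SHARDS]              # horizontal partition by user id

def post_tweet(user_id, text):
    shard_for(user_id)["tweets"].setdefault(user_id, []).append(text)

def timeline(user_id, followed_ids):
    # Reading one user's tweets touches one shard; the "following" timeline must
    # visit the shard of every followed user - the scatter-gather join the text describes.
    result = []
    for fid in followed_ids:
        result.extend(shard_for(fid)["tweets"].get(fid, []))
    return result

post_tweet(7, "hello")
post_tweet(1001, "earthquake news")
print(timeline(42, followed_ids=[7, 1001]))   # may touch up to len(followed_ids) shards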

3. Distributed Databases and Scalability

The Internet pioneers Amazon and Google created distributed database systems that follow Eric Brewer's CAP theorem [6] to satisfy their own needs, and these systems are now widely used by other internet companies.

3.1. CAP Theorem: Simply put, the CAP theorem states that no distributed database system can provide all three of the following properties, Consistency, Availability, and Partition Tolerance, at the same time.

1. Consistency: A set of operations performed on the distributed database is perceived as a single operation by the clients. It is similar to the "C" in the ACID properties. In low-volume transactions where the data is stored in a single database it is possible to achieve consistency, but for web-based applications where the data is stored on multiple servers, strict consistency is not attainable.
2. Availability: The system must always be available to its clients.
3. Partition Tolerance: All operations on the distributed database complete even if individual database instances are unavailable.

3.2. BASE: Distributed databases that provide only two of the properties mentioned in the CAP theorem are said to satisfy the BASE properties [5]: Basically Available, Soft state, Eventually consistent. The BASE properties accept that a distributed system remains available all the time even if its individual database instances are not mutually consistent, and that it will become consistent eventually. BASE is the opposite of ACID: ACID is pessimistic and forces consistency on every operation, whereas BASE is optimistic and accepts that clients may temporarily see data in an inconsistent state. By relaxing consistency, distributed databases achieve the scalability, availability, and performance that are the backbone of web applications. Of course, BASE is not suitable for all types of applications; for example, it is not possible to design an online trading system using BASE, and one would need a relational database that supports ACID. But there are many web applications for which BASE properties are acceptable, and for those, distributed databases are fast becoming the standard.
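
To make the contrast concrete, the toy sketch below (our illustration, not any particular product) acknowledges a write immediately and replicates it asynchronously, so a read against a replica may briefly return stale data before the system becomes consistent:

import threading, time

primary, replica = {}, {}

def put(key, value):
    primary[key] = value                    # acknowledge immediately: the system stays available
    # Replicate in the background: "soft state, eventually consistent".
    threading.Timer(0.5, lambda: replica.update({key: value})).start()

put("book-qty", 8)
print(replica.get("book-qty"))   # likely None: the replica is still stale
time.sleep(0.6)
print(replica.get("book-qty"))   # 8: consistency has been reached eventually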


3.3. Distributed Databases
Google created BigTable [7] and Amazon created Dynamo [8] to solve their own scalability, performance, and availability issues. Numerous open-source projects have since created distributed databases based on these two technologies, under different names such as key-value store, document store, and extended record store. While explaining all of these projects and their differences is beyond the scope of this paper, we explore the big two, BigTable and Dynamo, to see how they achieve the BASE properties.

3.3.1. BigTable: Google's BigTable is a multi-dimensional map built on top of their proprietary Google File System, GFS [9]. In BigTable, data is arranged in rows and columns that can be accessed by giving row and column positions. To manage a huge table, it is split across row boundaries into units called tablets. By default, GFS replicates the data onto three servers.

Figure 2. BigTable architecture: a BigTable client performs metadata operations against the BigTable Master and read/write operations directly against the primary and secondary Tablet Servers.

BigTable has a Master server that manages the Tablet Servers and performs other metadata operations. For a read operation, the BigTable client sends a request to the BigTable Master to get the locations of the primary and secondary Tablet Servers; once the client has these locations, it contacts the closest Tablet Server directly to read the data. For a write operation the client again contacts the Master server to get the locations of the Tablet Servers and then sends the data directly to the primary Tablet Server. The primary Tablet Server appends the new data to its existing data and keeps it in memory; it then notifies the two secondary servers to perform the same merging operation. If everything succeeds, the primary Tablet Server notifies the client of the success of the write. If any of the merging operations fails, the operation is repeated until the data is consistent across all the servers. Based on the above, it is clear that BigTable supports the C and P properties of the CAP theorem: when a write operation completes on BigTable, it is guaranteed that all the data is in a consistent state for subsequent read operations, and if any of the merging mechanisms fails, the write operation fails. By giving importance to Consistency and Partition Tolerance, BigTable reduces its availability; though it is rare that all the Tablet Servers will fail, it is still possible.
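
The read/write path described above can be summarized in the runnable sketch below. This is our paraphrase of the description in the text, not Google's API; the class names, method names, and in-memory stores are illustrative stand-ins.

class TabletServer:
    def __init__(self, name):
        self.name, self.store = name, {}
    def apply(self, row, col, value):
        self.store[(row, col)] = value      # merge the mutation into this copy
        return True

class BigTableMaster:
    """Holds only metadata: which tablet servers own which row range."""
    def __init__(self, primary, secondaries):
        self.primary, self.secondaries = primary, secondaries
    def locate(self, row):
        return self.primary, self.secondaries

def bt_write(master, row, col, value):
    primary, secondaries = master.locate(row)                        # metadata lookup only
    ok = primary.apply(row, col, value)                              # primary merges first
    ok = ok and all(s.apply(row, col, value) for s in secondaries)   # secondaries repeat the merge
    if not ok:
        raise IOError("a merge failed; the operation is retried until all copies agree")
    return ok    # acknowledged only once every copy matches: C and P are favored over A

def bt_read(master, row, col):
    primary, _ = master.locate(row)
    return primary.store.get((row, col))     # the client contacts a tablet server directly

master = BigTableMaster(TabletServer("primary"), [TabletServer("sec1"), TabletServer("sec2")])
bt_write(master, "row-17", "qty", 8)
print(bt_read(master, "row-17", "qty"))       # -> 8, consistent on every replica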
3.3.2. Dynamo: Dynamo is Amazon's distributed key-value database, used by several Amazon applications. In terms of the CAP theorem, Dynamo gives importance to A and P: any time a client puts a value with an associated key, Dynamo attaches a version to the key-value pair and replicates it to other nodes within the same cluster.

The number of nodes to which the data is replicated is a configuration parameter. In Figure 3, Client1 executes a put command with a key-value pair against node N6; if the replication configuration parameter is set to three, N6 replicates the key-value pair, along with its version, to N1, N2, and N3. Any time a node receives a versioned key-value pair, it first checks whether it already has the key; if it finds the same key, it checks whether the stored version is older than the one it just received, and if so it replaces the older version with the newer one. This automatic reconciliation is called syntactic reconciliation. If more than one client reads a key-value pair concurrently and writes it back, Dynamo accepts both writes and stores all the updates from the clients. In this case Dynamo holds more than one version of the same key-value pair and cannot reconcile them automatically; when a read arrives for that key, Dynamo returns all the versions for the key and makes the client perform the reconciliation, which is called semantic reconciliation.

Dynamo's architecture is based on an eventual-consistency model, and replication is performed asynchronously: any time a client writes a key-value pair, the write operation returns to the client even before the data has been replicated to all the nodes. Amazon's business requires high scalability for both read and write operations; they get millions of concurrent requests to view their catalog and to write the contents of shopping carts. In terms of the CAP properties, Dynamo sacrifices C and provides A and P.

Figure 3. Dynamo architecture: Client1 issues Put(key, value) to node N6, which replicates the versioned pair to N1, N2, and N3; Client2's Get(key) returns all the unresolved versions of the objects associated with the key.
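
The versioned put/get behaviour can be sketched as follows. This is a simplification that uses a global counter rather than Dynamo's vector clocks; the node names and the replication factor of three mirror Figure 3, but the code itself is purely illustrative.

import itertools

REPLICATION_FACTOR = 3
nodes = {f"N{i}": {} for i in range(1, 7)}      # six nodes, as in Figure 3
_counter = itertools.count(1)

def put(coordinator, key, value):
    version = next(_counter)                     # stand-in for a vector clock
    ring = list(nodes)
    start = ring.index(coordinator)
    # Write to the coordinator plus the next REPLICATION_FACTOR - 1 nodes on the ring
    # (a real system would replicate asynchronously).
    for name in (ring[(start + i) % len(ring)] for i in range(REPLICATION_FACTOR)):
        stored = nodes[name].get(key)
        if stored is None or stored[0] < version:    # syntactic reconciliation:
            nodes[name][key] = (version, value)      # a newer version replaces an older one
    return version

def get(key):
    # Collect every distinct version still present; more than one entry means the
    # client must reconcile them itself (semantic reconciliation).
    versions = {}
    for n in nodes.values():
        if key in n:
            ver, val = n[key]
            versions[ver] = val
    return versions

put("N6", "cart:alice", ["book-123"])
put("N1", "cart:alice", ["book-123", "book-456"])
print(get("cart:alice"))     # two unresolved versions survive on different nodes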


4. System of Systems Approach

Based on BigTable and Dynamo, many open-source projects are being actively developed, and some of them are used in commercial web applications. While it is true that the CAP theorem states that it is not possible to achieve all three of C, A, and P simultaneously, a system-of-systems approach in which the architecture is dynamically reconfigurable can support all three properties adequately. For example, by setting the number of nodes that replicate the data asynchronously, we can have a distributed DB that behaves like an RDBMS with strong consistency. Based on the needs of the application, the distributed database can be switched from a strong to a weak consistency model and vice versa.

4.1. Software Processing Unit Architecture
A traditional multi-layered architecture has long been used to build large-scale, complex software systems. In a multi-layered architecture, each lower layer provides services to the layer above it. Any time a software object moves between layers it has to go through serialization and de-serialization and may have to cross different networks, which affects both performance and scalability. A multi-layer architecture works well where the load on the system is predictable, and it provides vertical scalability. But the complex nature of web applications requires massive scalability at minimal cost; buying bigger hardware is expensive, and one quickly reaches the upper limit of such scalability.

Building on similar concepts proposed by space-based [10] and shared-nothing [11] architectures, and adopting a loosely coupled systems approach, we propose a new architectural paradigm for web application design in which the application is built as stateless software services, each deployed on its own server with all of its other software components contained in a single module called a Software Processing Unit (SPU). This SPU architecture, as depicted in Figure 4, operates on top of a distributed database and can provide either ACID-style or BASE-style guarantees based on the needs of each SPU. Each SPU is a self-contained unit that has its own cached data, messaging, rule engine, UI, and any other software components it needs. The only time an SPU communicates with an external resource is when it needs to read or write data in the distributed database. An SPU Request Router routes incoming requests to the appropriate SPU for processing; once the connection between a client and an SPU is established, all further communication happens between them directly.

Figure 4. Proposed SPU architecture: an SPU Request Router in front of a set of SPUs, each containing its UI, software services, cached data, messaging, and rule engine, all running on top of a distributed DB with dynamically configurable weak and strong consistency.
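
A minimal sketch of the routing idea in Figure 4 follows (entirely illustrative; the SPU names, the hash-based routing key, and the per-SPU consistency flag are our assumptions). The router only establishes the client-SPU pairing, after which the client talks to its SPU directly, and each SPU persists data only through the shared distributed DB.

class SPU:
    """A self-contained Software Processing Unit: UI, services, cache, rules."""
    def __init__(self, name, distributed_db, write_consistency="eventual"):
        self.name, self.db, self.cache = name, distributed_db, {}
        self.write_consistency = write_consistency    # "eventual" or "strong", per SPU needs

    def handle(self, request):
        # All work happens inside the SPU (no remote hops between layers); the only
        # external call is the read/write against the distributed DB.
        key = request["key"]
        if request["op"] == "read":
            return self.cache.get(key) or self.db.get(key)
        self.cache[key] = request["value"]
        self.db.put(key, request["value"], consistency=self.write_consistency)
        return "ok"

class SPURequestRouter:
    """Routes an incoming client to an SPU; afterwards they communicate directly."""
    def __init__(self, spus):
        self.spus = spus
    def connect(self, client_id):
        return self.spus[hash(client_id) % len(self.spus)]

class ToyDistributedDB:
    def __init__(self): self.data = {}
    def get(self, key): return self.data.get(key)
    def put(self, key, value, consistency="eventual"): self.data[key] = value

db = ToyDistributedDB()
router = SPURequestRouter([SPU("spu-1", db), SPU("spu-2", db, write_consistency="strong")])
spu = router.connect("customer-42")                 # routing happens once per client
print(spu.handle({"op": "write", "key": "cart", "value": ["book-123"]}))
print(spu.handle({"op": "read", "key": "cart"}))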

4.2. Designing Software for the SPU Architecture
The SPU architecture collapses the traditional multi-layer design into a single module, the SPU. This does not mean that the UI, business services, and data-access logic must be written as one monolithic piece of code: individual components must still be designed with high cohesion and low coupling. The only requirement is that, when deployed, all the components of an SPU must be collocated on a single physical server. Communication between components within an SPU happens through messaging or local method invocation, depending on the asynchronous or synchronous nature of the call. Since there are no remote calls or network delays between the components inside an SPU, this design scales rapidly.

The only time an SPU communicates with an external system is when it needs to persist its data in a permanent store. For this purpose all SPUs run on top of a distributed DB. Based on the needs of an individual SPU, the distributed DB can provide either weak or strong consistency: all read operations from an SPU to the distributed DB happen without delay, and write operations can be performed with either weak or strong consistency, depending on the needs of the individual SPU.
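
To illustrate the intra-SPU communication rule (synchronous calls remain plain local method invocations, asynchronous ones go through in-process messaging rather than over the network), here is a small sketch with hypothetical component names:

import queue

class BusinessService:
    def price(self, item):                   # synchronous: a plain local method call
        return {"book-123": 25.0}.get(item, 0.0)

class AuditLogger:
    def __init__(self):
        self.inbox = queue.Queue()           # asynchronous: in-process messaging, no network hop
    def submit(self, event):
        self.inbox.put(event)
    def drain(self):
        while not self.inbox.empty():
            print("audit:", self.inbox.get())

service, audit = BusinessService(), AuditLogger()    # collocated inside one SPU / one server
total = service.price("book-123")                    # synchronous local invocation
audit.submit({"event": "priced", "amount": total})   # asynchronous message within the SPU
audit.drain()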


4.3. Cloud-Ready SPU Architecture
The loosely coupled nature of the SPU architecture allows it to be deployed on a cloud computing platform. The cloud allows SPU provisioning and virtualization based on the external load: it monitors the load on individual SPUs and either increases or decreases the number of instances of each SPU. The cloud can also monitor the load and switch the distributed DB from weak consistency to strong consistency and vice versa.
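
A sketch of the control loop implied above (the thresholds, metric names, and scaling policy are hypothetical): on each tick the cloud controller scales the number of SPU instances with load and can also flip the distributed DB between weak and strong consistency.

def reconcile(load_per_spu, instances, db_mode,
              scale_up_at=0.8, scale_down_at=0.3, strong_mode_below=0.5):
    """Return the new (instance count, consistency mode) for one control-loop tick."""
    if load_per_spu > scale_up_at:
        instances += 1                        # provision another SPU instance
    elif load_per_spu < scale_down_at and instances > 1:
        instances -= 1                        # release an instance
    # Under light load we can afford strong consistency; under heavy load we relax it
    # to keep the SPUs available (the weak/strong switch shown in Figure 4).
    db_mode = "strong" if load_per_spu < strong_mode_below else "eventual"
    return instances, db_mode

print(reconcile(load_per_spu=0.9, instances=2, db_mode="strong"))    # -> (3, 'eventual')
print(reconcile(load_per_spu=0.2, instances=3, db_mode="eventual"))  # -> (2, 'strong')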

4.4. Limitations of the SPU Architecture
The SPU architecture provides massive scalability for read-intensive applications. For write-intensive applications it provides good scalability compared with a traditional multi-tier architecture running on relational databases, but it is limited by the communication between the SPUs and the distributed DB. Write-intensive applications also need to communicate state changes between SPU instances so that the cached data stays in sync across all SPU instances.

5. Conclusion

In this paper, we have summarized and outlined some
of the big challenges in addressing performance,
scalability, and availability of applications with large
and complex backend database systems. We have
presented the limitations imposed by the CAP theorem
and the approaches that can be used to address such
limitations. We have articulated the use of a loosely-
coupled system of systems approach to scale the
software architecture further. Our future work will focus
on studies that validate the applicability of the SPU
architecture approach, using open-source distributed
databases on industrial-strength software systems, and
on the issues that arise therein.

6. Acknowledgments
This work is based in part, upon research supported by the
National Science Foundation (under Grant Nos. CNS-
0619069, EPS-0701890, OISE 0729792, CNS-0855248 and
EPS-0918970). Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the
author(s) and do not necessarily reflect the views of the
funding agencies.


References
1.   ---, “The Digital Universe Is Still Growing”, EMC / IDC
     study, 2009. http://www.emc.com/digital_universe
2.   J. Dean, S. Ghemwat, “MapReduce: simplified Data
     Processing on Large Clusters”, Communications of the
     ACM, 51(1), Jan 2008, pp. 107-113.
3.   M. Tyler, J. Ledford, “Google Analytics”, JWS, 2006.
4.   http://www.google.com/analytics/
5.   Dan Pritchett, “BASE: An ACID Alternative”, ACM
     Queue, July 28th 2008.
6.   I. Filip, I. C. Vasar, R. Robu, “Considerations about an
     Oracle database multi-master replication”, 5th
     International Symposium on Applied Computational
     Intelligence and Informatics, May 2009, pp 147 – 152
7.   F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A.
     Wallach, M. Burrows, T. Chandra, A. Fikes, R. E.
     Gruber, “Bigtable: A Distributed Storage System for
     Structured Data”, 7th Symposium on Operating System
     Design and Implementation, OSDI'06, pp 205-218.
8.   G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati,
     A. Lakshman, A. Pilchin, S. Sivasubramanian, P.
     Vosshall, W. Vogels, "Dynamo: Amazon's Highly
     Available Key-Value Store”, ACM SIGOPS Operating
     Systems Review, 41(6), December 2007, pp. 205-220.
9.   S. Ghemawat, H. Gobioff, S.-T. Leung, "The Google File System", ACM SIGOPS Operating Systems Review, 37(5), December 2003, pp. 29-43.
10.  C. Dumont, F. Mourlin, "Space Based Architecture for Numerical Solving", Intnl Conf. on Comp. Intel. for Modelling, Control & Automation, 2008, pp. 309-314.
11.  J. Lifflancer, A. McDonald, O. Pilskalns, "Clustering Versus Shared Nothing: A Case Study", 33rd Annual IEEE International Computer Software and Applications Conference (COMPSAC '09), 2009, pp. 116-121.





Contenu connexe

Tendances

An Architecture for Modular Data Centers
An Architecture for Modular Data CentersAn Architecture for Modular Data Centers
An Architecture for Modular Data Centersguest640c7d
 
IBM Point of view -- Security and Cloud Computing (Tivoli)
IBM Point of view -- Security and Cloud Computing (Tivoli)IBM Point of view -- Security and Cloud Computing (Tivoli)
IBM Point of view -- Security and Cloud Computing (Tivoli)IBM India Smarter Computing
 
Cloud scalability considerations
Cloud scalability considerationsCloud scalability considerations
Cloud scalability considerationsIJCSES Journal
 
Data Center Solutions: Radical Shift toward Design-Driven Innovation
Data Center Solutions: Radical Shift toward Design-Driven InnovationData Center Solutions: Radical Shift toward Design-Driven Innovation
Data Center Solutions: Radical Shift toward Design-Driven InnovationNetmagic Solutions Pvt. Ltd.
 
"Going Offline", one of the hottest mobile app trends
"Going Offline", one of the hottest mobile app trends"Going Offline", one of the hottest mobile app trends
"Going Offline", one of the hottest mobile app trendsDerek Baron
 
Cloud computing notes unit II
Cloud computing notes unit II Cloud computing notes unit II
Cloud computing notes unit II NANDINI SHARMA
 
The Journey Toward the Software-Defined Data Center
The Journey Toward the Software-Defined Data CenterThe Journey Toward the Software-Defined Data Center
The Journey Toward the Software-Defined Data CenterCognizant
 
Cloud computing pioneers - remarkable examples 2010-11-05
Cloud computing pioneers - remarkable examples 2010-11-05Cloud computing pioneers - remarkable examples 2010-11-05
Cloud computing pioneers - remarkable examples 2010-11-05Abe Pachikara
 
Optimize your virtualization_efforts_with_a_blade_infrastructure
Optimize your virtualization_efforts_with_a_blade_infrastructureOptimize your virtualization_efforts_with_a_blade_infrastructure
Optimize your virtualization_efforts_with_a_blade_infrastructureMartín Ríos
 
Real time service oriented cloud computing
Real time service oriented cloud computingReal time service oriented cloud computing
Real time service oriented cloud computingwww.pixelsolutionbd.com
 
Analyst paper: Private Clouds Float with IBM Systems and Software
Analyst paper: Private Clouds Float with IBM Systems and SoftwareAnalyst paper: Private Clouds Float with IBM Systems and Software
Analyst paper: Private Clouds Float with IBM Systems and SoftwareIBM India Smarter Computing
 
CLOUD COMPUTING UNIT-5 NOTES
CLOUD COMPUTING UNIT-5 NOTESCLOUD COMPUTING UNIT-5 NOTES
CLOUD COMPUTING UNIT-5 NOTESTushar Dhoot
 
AMD Putting Server Virtualization to Work
AMD Putting Server Virtualization to WorkAMD Putting Server Virtualization to Work
AMD Putting Server Virtualization to WorkJames Price
 

Tendances (19)

An Architecture for Modular Data Centers
An Architecture for Modular Data CentersAn Architecture for Modular Data Centers
An Architecture for Modular Data Centers
 
IBM Point of View: Security and Cloud Computing
IBM Point of View: Security and Cloud ComputingIBM Point of View: Security and Cloud Computing
IBM Point of View: Security and Cloud Computing
 
IBM Point of view -- Security and Cloud Computing (Tivoli)
IBM Point of view -- Security and Cloud Computing (Tivoli)IBM Point of view -- Security and Cloud Computing (Tivoli)
IBM Point of view -- Security and Cloud Computing (Tivoli)
 
Cloud scalability considerations
Cloud scalability considerationsCloud scalability considerations
Cloud scalability considerations
 
Data Center Solutions: Radical Shift toward Design-Driven Innovation
Data Center Solutions: Radical Shift toward Design-Driven InnovationData Center Solutions: Radical Shift toward Design-Driven Innovation
Data Center Solutions: Radical Shift toward Design-Driven Innovation
 
"Going Offline", one of the hottest mobile app trends
"Going Offline", one of the hottest mobile app trends"Going Offline", one of the hottest mobile app trends
"Going Offline", one of the hottest mobile app trends
 
Cloud computing notes unit II
Cloud computing notes unit II Cloud computing notes unit II
Cloud computing notes unit II
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
The Journey Toward the Software-Defined Data Center
The Journey Toward the Software-Defined Data CenterThe Journey Toward the Software-Defined Data Center
The Journey Toward the Software-Defined Data Center
 
What is real time SOA?
What is real time SOA? What is real time SOA?
What is real time SOA?
 
Cloud computing pioneers - remarkable examples 2010-11-05
Cloud computing pioneers - remarkable examples 2010-11-05Cloud computing pioneers - remarkable examples 2010-11-05
Cloud computing pioneers - remarkable examples 2010-11-05
 
Optimize your virtualization_efforts_with_a_blade_infrastructure
Optimize your virtualization_efforts_with_a_blade_infrastructureOptimize your virtualization_efforts_with_a_blade_infrastructure
Optimize your virtualization_efforts_with_a_blade_infrastructure
 
Cloud Computing
Cloud ComputingCloud Computing
Cloud Computing
 
IJET-V2I6P19
IJET-V2I6P19IJET-V2I6P19
IJET-V2I6P19
 
Real time service oriented cloud computing
Real time service oriented cloud computingReal time service oriented cloud computing
Real time service oriented cloud computing
 
Analyst paper: Private Clouds Float with IBM Systems and Software
Analyst paper: Private Clouds Float with IBM Systems and SoftwareAnalyst paper: Private Clouds Float with IBM Systems and Software
Analyst paper: Private Clouds Float with IBM Systems and Software
 
CLOUD COMPUTING UNIT-5 NOTES
CLOUD COMPUTING UNIT-5 NOTESCLOUD COMPUTING UNIT-5 NOTES
CLOUD COMPUTING UNIT-5 NOTES
 
P18 2 8-5
P18 2 8-5P18 2 8-5
P18 2 8-5
 
AMD Putting Server Virtualization to Work
AMD Putting Server Virtualization to WorkAMD Putting Server Virtualization to Work
AMD Putting Server Virtualization to Work
 

En vedette

Bringing Private Cloud computing to HPC and Science - EGI TF tf 2013
Bringing Private Cloud computing to HPC and Science -  EGI TF tf 2013Bringing Private Cloud computing to HPC and Science -  EGI TF tf 2013
Bringing Private Cloud computing to HPC and Science - EGI TF tf 2013Ignacio M. Llorente
 
IEEE BIGDATA PROJECT TITLE 2015-16
IEEE BIGDATA  PROJECT TITLE 2015-16IEEE BIGDATA  PROJECT TITLE 2015-16
IEEE BIGDATA PROJECT TITLE 2015-16Spiro Vellore
 
cse ieee projects in trichy,BE cse projects in Trichy
 cse ieee projects in trichy,BE cse projects in Trichy  cse ieee projects in trichy,BE cse projects in Trichy
cse ieee projects in trichy,BE cse projects in Trichy vsanthosh05
 
VHT: Vertical Hoeffding Tree (IEEE BigData 2016)
VHT: Vertical Hoeffding Tree (IEEE BigData 2016)VHT: Vertical Hoeffding Tree (IEEE BigData 2016)
VHT: Vertical Hoeffding Tree (IEEE BigData 2016)Nicolas Kourtellis
 
Trust me, I am authentic -
Trust me, I am authentic - Trust me, I am authentic -
Trust me, I am authentic - Gunn Enli
 
Small Is Beautiful: Summarizing Scientific Workflows Using Semantic Annotat...
Small Is Beautiful:  Summarizing Scientific Workflows  Using Semantic Annotat...Small Is Beautiful:  Summarizing Scientific Workflows  Using Semantic Annotat...
Small Is Beautiful: Summarizing Scientific Workflows Using Semantic Annotat...Khalid Belhajjame
 
Best topics for seminar
Best topics for seminarBest topics for seminar
Best topics for seminarshilpi nagpal
 

En vedette (7)

Bringing Private Cloud computing to HPC and Science - EGI TF tf 2013
Bringing Private Cloud computing to HPC and Science -  EGI TF tf 2013Bringing Private Cloud computing to HPC and Science -  EGI TF tf 2013
Bringing Private Cloud computing to HPC and Science - EGI TF tf 2013
 
IEEE BIGDATA PROJECT TITLE 2015-16
IEEE BIGDATA  PROJECT TITLE 2015-16IEEE BIGDATA  PROJECT TITLE 2015-16
IEEE BIGDATA PROJECT TITLE 2015-16
 
cse ieee projects in trichy,BE cse projects in Trichy
 cse ieee projects in trichy,BE cse projects in Trichy  cse ieee projects in trichy,BE cse projects in Trichy
cse ieee projects in trichy,BE cse projects in Trichy
 
VHT: Vertical Hoeffding Tree (IEEE BigData 2016)
VHT: Vertical Hoeffding Tree (IEEE BigData 2016)VHT: Vertical Hoeffding Tree (IEEE BigData 2016)
VHT: Vertical Hoeffding Tree (IEEE BigData 2016)
 
Trust me, I am authentic -
Trust me, I am authentic - Trust me, I am authentic -
Trust me, I am authentic -
 
Small Is Beautiful: Summarizing Scientific Workflows Using Semantic Annotat...
Small Is Beautiful:  Summarizing Scientific Workflows  Using Semantic Annotat...Small Is Beautiful:  Summarizing Scientific Workflows  Using Semantic Annotat...
Small Is Beautiful: Summarizing Scientific Workflows Using Semantic Annotat...
 
Best topics for seminar
Best topics for seminarBest topics for seminar
Best topics for seminar
 

Similaire à Ieee-no sql distributed db and cloud architecture report

Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL TechnologiesAmit Singh
 
Ethopian Database Management system as a Cloud Service: Limitations and advan...
Ethopian Database Management system as a Cloud Service: Limitations and advan...Ethopian Database Management system as a Cloud Service: Limitations and advan...
Ethopian Database Management system as a Cloud Service: Limitations and advan...IOSR Journals
 
An architacture for modular datacenter
An architacture for modular datacenterAn architacture for modular datacenter
An architacture for modular datacenterJunaid Kabir
 
Sigmod 2013 - On Brewing Fresh Espresso - LinkedIn's Distributed Data Serving...
Sigmod 2013 - On Brewing Fresh Espresso - LinkedIn's Distributed Data Serving...Sigmod 2013 - On Brewing Fresh Espresso - LinkedIn's Distributed Data Serving...
Sigmod 2013 - On Brewing Fresh Espresso - LinkedIn's Distributed Data Serving...Mihir Gandhi
 
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)Amy W. Tang
 
Cloud application services (saa s) – multi tenant data architecture
Cloud application services (saa s) – multi tenant data architectureCloud application services (saa s) – multi tenant data architecture
Cloud application services (saa s) – multi tenant data architectureJohnny Le
 
Cidr11 paper32
Cidr11 paper32Cidr11 paper32
Cidr11 paper32jujukoko
 
Megastore providing scalable, highly available storage for interactive services
Megastore providing scalable, highly available storage for interactive servicesMegastore providing scalable, highly available storage for interactive services
Megastore providing scalable, highly available storage for interactive servicesJoão Gabriel Lima
 
Database in banking industries
Database in banking industriesDatabase in banking industries
Database in banking industriesnajammm007
 
Dynamo Amazon’s Highly Available Key-value Store Giuseppe D.docx
Dynamo Amazon’s Highly Available Key-value Store Giuseppe D.docxDynamo Amazon’s Highly Available Key-value Store Giuseppe D.docx
Dynamo Amazon’s Highly Available Key-value Store Giuseppe D.docxjacksnathalie
 
Amazon dynamo-sosp2007
Amazon dynamo-sosp2007Amazon dynamo-sosp2007
Amazon dynamo-sosp2007huangjunsk
 
amazon-dynamo-sosp2007
amazon-dynamo-sosp2007amazon-dynamo-sosp2007
amazon-dynamo-sosp2007Thomas Hughes
 
The technology of the business data lake
The technology of the business data lakeThe technology of the business data lake
The technology of the business data lakeCapgemini
 
Technology Overview
Technology OverviewTechnology Overview
Technology OverviewLiran Zelkha
 
Case Study: Synchroniztion Issues in Mobile Databases
Case Study: Synchroniztion Issues in Mobile DatabasesCase Study: Synchroniztion Issues in Mobile Databases
Case Study: Synchroniztion Issues in Mobile DatabasesG. Habib Uddin Khan
 
Web based lims_white_paper
Web based lims_white_paperWeb based lims_white_paper
Web based lims_white_paperPreeti Gupta
 
Architecting extremelylargescalewebapplications
Architecting extremelylargescalewebapplicationsArchitecting extremelylargescalewebapplications
Architecting extremelylargescalewebapplicationsPrashanth Panduranga
 

Similaire à Ieee-no sql distributed db and cloud architecture report (20)

Big Data using NoSQL Technologies
Big Data using NoSQL TechnologiesBig Data using NoSQL Technologies
Big Data using NoSQL Technologies
 
Performance Evaluation of Virtualization Technologies for Server
Performance Evaluation of Virtualization Technologies for ServerPerformance Evaluation of Virtualization Technologies for Server
Performance Evaluation of Virtualization Technologies for Server
 
Ethopian Database Management system as a Cloud Service: Limitations and advan...
Ethopian Database Management system as a Cloud Service: Limitations and advan...Ethopian Database Management system as a Cloud Service: Limitations and advan...
Ethopian Database Management system as a Cloud Service: Limitations and advan...
 
An architacture for modular datacenter
An architacture for modular datacenterAn architacture for modular datacenter
An architacture for modular datacenter
 
The NoSQL Movement
The NoSQL MovementThe NoSQL Movement
The NoSQL Movement
 
Sigmod 2013 - On Brewing Fresh Espresso - LinkedIn's Distributed Data Serving...
Sigmod 2013 - On Brewing Fresh Espresso - LinkedIn's Distributed Data Serving...Sigmod 2013 - On Brewing Fresh Espresso - LinkedIn's Distributed Data Serving...
Sigmod 2013 - On Brewing Fresh Espresso - LinkedIn's Distributed Data Serving...
 
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
 
Cloud application services (saa s) – multi tenant data architecture
Cloud application services (saa s) – multi tenant data architectureCloud application services (saa s) – multi tenant data architecture
Cloud application services (saa s) – multi tenant data architecture
 
Cidr11 paper32
Cidr11 paper32Cidr11 paper32
Cidr11 paper32
 
Megastore providing scalable, highly available storage for interactive services
Megastore providing scalable, highly available storage for interactive servicesMegastore providing scalable, highly available storage for interactive services
Megastore providing scalable, highly available storage for interactive services
 
Database in banking industries
Database in banking industriesDatabase in banking industries
Database in banking industries
 
Dynamo Amazon’s Highly Available Key-value Store Giuseppe D.docx
Dynamo Amazon’s Highly Available Key-value Store Giuseppe D.docxDynamo Amazon’s Highly Available Key-value Store Giuseppe D.docx
Dynamo Amazon’s Highly Available Key-value Store Giuseppe D.docx
 
Amazon dynamo-sosp2007
Amazon dynamo-sosp2007Amazon dynamo-sosp2007
Amazon dynamo-sosp2007
 
amazon-dynamo-sosp2007
amazon-dynamo-sosp2007amazon-dynamo-sosp2007
amazon-dynamo-sosp2007
 
The technology of the business data lake
The technology of the business data lakeThe technology of the business data lake
The technology of the business data lake
 
Technology Overview
Technology OverviewTechnology Overview
Technology Overview
 
Case Study: Synchroniztion Issues in Mobile Databases
Case Study: Synchroniztion Issues in Mobile DatabasesCase Study: Synchroniztion Issues in Mobile Databases
Case Study: Synchroniztion Issues in Mobile Databases
 
Web based lims_white_paper
Web based lims_white_paperWeb based lims_white_paper
Web based lims_white_paper
 
Api enablement-mainframe
Api enablement-mainframeApi enablement-mainframe
Api enablement-mainframe
 
Architecting extremelylargescalewebapplications
Architecting extremelylargescalewebapplicationsArchitecting extremelylargescalewebapplications
Architecting extremelylargescalewebapplications
 

Dernier

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
book by two, and it has to replicate this update to all of their servers. How can Amazon do this seamlessly, without any failures visible to its customers? Furthermore, such voluminous data cannot be stored and managed using a single server, or even a single cluster of servers.
The Internet research firm IDC estimates that the digital universe held 281 exabytes of data and will grow to 1.8 zettabytes by 2011 [1], with most of the exponential growth coming not from traditional computing and communication equipment but from mobile and embedded devices. For example, Google is estimated to process 20 petabytes [2] of data every day, and this figure is growing rapidly. Amazon, with its cloud computing solutions, stores petabytes of customers' data in its systems, and Twitter receives more than 50 million tweets per day. For efficient management and service, these companies require thousands of servers distributed worldwide.

One of the hallmarks of such web-based applications is the ability to handle request spikes that arrive nondeterministically from all over the world. Any time a major event (such as the Haiti earthquake) happens, the number of tweets sent to Twitter spikes by several million. During these unexpected events, users following other users send a flurry of requests to find out what those users are tweeting about. If one imagines a user following thousands of other users and trying to find the tweets they have posted, the issue becomes: how can one manage these sudden upsurges in web traffic?
Among the several popular Google applications is Google Analytics [3, 4], which serves a very important business need, especially for small businesses. This application embeds a small code snippet in web sites, thereby allowing companies to monitor and analyze the web traffic on their sites. It is a free service offered by Google, and millions of web sites use it to perform real-time analysis of web site traffic. Such a strong dependency by businesses implies that Google must provide these services consistently.

Once data is distributed and stored across multiple servers, companies need to worry about answering all the questions raised above in a cost-effective and efficient manner. In such a complex system of multiply distributed server farms, single or multiple server failures must be assumed to be a daily occurrence, and companies need to identify and fix such server failures as they occur, without causing any service disruption. This has become a huge issue for cloud computing environments, in which companies manage data for other companies subject to honoring specific service-level agreements (SLAs). Traditional processes, tools, and technologies cannot help companies manage data stored over such a vast array of servers. They need to adopt system of systems engineering principles, specifically with respect to dealing with distributed databases, in order to achieve their goals.

The rest of the paper is organized as follows: Section II briefly presents RDBMS, their properties, various efforts to improve RDBMS for rapid scalability, and some of their limitations. Section III presents the CAP theorem, the BASE properties, and their importance to the development of applications that experience rapid, exponential-scale upswings in load and user-driven activity. Section IV presents our system-of-systems based architectural approach built around Software Processing Units (SPUs), and Section V concludes the paper.

2. RDBMS and ACID Properties

RDBMS have been the de-facto standard for managing data for several years now. Based on the ACID properties (Atomicity, Consistency, Isolation, Durability), an RDBMS offers all the benefits needed to manage data on a single server or on a rack of servers. But RDBMS have failed to offer the benefits needed by web companies, for the following reasons:
(i) An RDBMS is built on a B-tree index, which works well as long as the index can be stored in memory. Any time a request comes in, the in-memory index is used to find the location on the hard disk at which to read or update the data; by avoiding disk seeks, the RDBMS gives good performance to its users. But web-based companies need to maintain large volumes of data, and these indexes often cannot fit in memory. The RDBMS then needs to perform disk seeks to find data, an expensive operation that degrades the scalability of the application.
(ii) The RDBMS vendors' solution to the above problem is to buy bigger servers with more RAM capacity, termed vertical scaling. Though this may work in some instances, it cannot completely absorb the huge user spikes experienced by web companies, and it ultimately becomes a performance bottleneck when supporting simultaneous requests from a large number of web customers.

While relational database systems are a good fit for applications that follow the ACID properties, they are not appropriate for applications that undergo rapid surges in user activity. For example, in a trading application it is a fundamental requirement to inform the customer whether a bid was successfully placed or not, and an RDBMS may suffice: any time a transaction fails because one database instance fails, the entire transaction is abandoned, so the data updated at one instance is either available at all the instances or at none of them. For web-based applications, however, ACID properties are too restrictive to satisfy the associated business needs during surges in user activity; hence a different set of properties must be established to achieve scalability and performance.
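To make the atomicity requirement concrete, the following minimal Python sketch uses an in-memory SQLite database; the bids table, the place_bid helper, and the simulated failure are purely illustrative and are not taken from any system discussed in this paper.

    import sqlite3

    # Minimal sketch of ACID atomicity: either every statement in the
    # transaction is applied, or none of them are.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bids (trader TEXT, symbol TEXT, quantity INTEGER)")

    def place_bid(conn, trader, symbol, quantity):
        try:
            with conn:  # opens a transaction; commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO bids (trader, symbol, quantity) VALUES (?, ?, ?)",
                    (trader, symbol, quantity),
                )
                if quantity <= 0:
                    raise ValueError("invalid bid")  # simulated mid-transaction failure
            return True    # the customer can be told the bid was placed
        except Exception:
            return False   # the whole transaction was abandoned

    print(place_bid(conn, "alice", "ACME", 100))   # True  (committed)
    print(place_bid(conn, "bob", "ACME", -5))      # False (rolled back)
    print(conn.execute("SELECT COUNT(*) FROM bids").fetchone()[0])   # 1

A distributed RDBMS extends exactly this all-or-nothing behavior across instances, which is what becomes expensive under the traffic surges described above.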
2.1. RDBMS Replication

RDBMS have long used data replication techniques to achieve performance and scalability. Several types of replication are available, and each has its own set of limitations for web-based applications; different relational databases support different flavors of replication: Master-Slave, Multi-Master, Synchronous, and Asynchronous. For read-intensive applications, replication gives good scalability and performance, but for write-intensive applications it cannot provide the desired scalability. Let us examine each of these in more detail in the sequel.

2.1.1. Master-Slave Replication: In this configuration, one master manages several slaves; while reads can be served by any slave, all writes must go through the single master. This restriction degrades scalability and performance, and it additionally creates a single point of failure for all write operations.

2.1.2. Multi-Master Replication: In this configuration, more than one master manages several slaves. Though write operations may scale, this leads to manual conflict resolution, which is error-prone and becomes increasingly problematic as the number of masters increases.
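The read/write split behind master-slave replication can be sketched as follows; the MasterSlaveRouter and Server classes below are hypothetical stand-ins for a real database driver, shown only to illustrate why every write funnels through a single node while reads spread across replicas.

    import random

    class Server:
        def __init__(self, name):
            self.name = name
        def run(self, sql, params=()):
            # stand-in for whatever client library actually talks to the server
            return f"{self.name} executed: {sql}"

    class MasterSlaveRouter:
        """Route writes to the single master and spread reads over the slaves."""
        def __init__(self, master, slaves):
            self.master = master   # every write passes through here (bottleneck,
            self.slaves = slaves   #   single point of failure); slaves serve reads

        def execute(self, sql, params=()):
            is_write = sql.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE"))
            target = self.master if is_write else random.choice(self.slaves)
            return target.run(sql, params)

    router = MasterSlaveRouter(Server("master"), [Server("slave1"), Server("slave2")])
    print(router.execute("SELECT * FROM orders WHERE id = ?", (42,)))
    print(router.execute("UPDATE orders SET qty = qty - 1 WHERE id = ?", (42,)))

Read throughput grows with the number of slaves, but write throughput does not, which is exactly the limitation discussed above.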
2.1.3. Synchronous Multi-Master Replication: RDBMS vendors have long used two-phase commit transactions to guarantee the ACID properties and support synchronous replication. In the two-phase commit protocol, a transaction manager first asks each participating database instance whether it can commit the transaction. If all the database instances agree, the transaction manager asks each instance to commit the transaction; if any instance vetoes the commit, all instances are asked to roll back their transactions. For web-based applications this cannot work, because they maintain their data on multiple servers that can go down without warning, causing write failures and degrading the availability of the application.

2.1.4. Asynchronous Multi-Master Replication: In asynchronous replication, when more than one master receives a write operation for the same data, both writes succeed. In this case the database has to resolve the conflicts based on timestamps on the data. Current relational databases do not support this feature out of the box, and even where the capability exists, the support is not automatic and manual intervention is needed to resolve the conflicts.

2.2. RDBMS and Sharding

Database sharding is a process in which the data is partitioned either horizontally or vertically and kept on different servers to increase performance and scalability. Although this concept has been around for a while, it has been popularized by Google engineers for improving the throughput and overall performance of high-transaction, large, database-centric business applications. Database vendors support two types of partitions, vertical and horizontal. In a vertical partition a table is split column-wise and kept on different servers; in a horizontal partition, different rows are kept on different database servers. When a read or write operation comes from a client, it is routed to the server where the data resides, and the operation executes there.

The main problem with relational database partitioning is performing join queries; without partitioning, all join queries execute on a single server. Once the data is sharded across multiple servers, it is not feasible to perform, in a reasonable time, join queries that span those servers. For example, if we are partitioning a Twitter-like web site in a relational database using horizontal partitions, one approach is to store different sets of users and their tweets on different servers. When a request comes to display all the tweets of a particular user, it is easy to read the data from one server and send it to the client. Twitter, however, also allows a user to follow other users, and by following other users you can see their tweets. Now, if a request comes from a user to see all the tweets posted by the users they follow, we have to perform a join query across all the servers where the data of those followed users resides (Figure 1).

[Figure 1. Twitter information flows: a user posts tweets and is related to other users through "following" and "followed by" relationships.]

3. Distributed Databases and Scalability

The Internet pioneers Amazon and Google created distributed database systems that follow Eric Brewer's CAP theorem [6] to satisfy their own needs; these systems are now widely used by other internet companies.

3.1. CAP Theorem: Simply put, the CAP theorem states that no distributed database system can have all of the following properties at the same time: Consistency, Availability, and Partition Tolerance.
1. Consistency: A set of operations performed on the distributed database is perceived as a single operation by the clients. This is similar to the "C" in the ACID properties. In low-volume transactions where the data is stored in a single database, it is possible to achieve consistency, but for web-based applications where the data is stored on multiple servers, strict consistency cannot be achieved.
2. Availability: The system must always be available to the clients.
3. Partition Tolerance: All operations on the distributed database complete even if individual database instances are not available.

3.2. BASE: Distributed databases that have only two of the properties mentioned in the CAP theorem are said to satisfy the BASE properties: Basically Available, Soft state, Eventually consistent. The BASE properties suggest that a distributed system is available all the time even though the individual database instances are not consistent, and that it will become consistent eventually. BASE is the opposite of ACID: ACID is pessimistic and forces consistency on every operation, whereas BASE is optimistic and accepts the fact that the data will be in an inconsistent state for some clients. By relaxing consistency, distributed databases achieve the scalability, availability, and performance that are the backbone of web applications. Of course, BASE is not suitable for all types of applications; for example, it is not possible to design an online trading system using BASE, and one will need a relational database that supports ACID. But there are web applications for which BASE properties are acceptable, and for those, distributed databases are fast becoming the standard.
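To illustrate the BASE behavior described above, the following minimal Python sketch models a toy in-process store with asynchronous replication; the class, method, and key names are illustrative and do not correspond to any particular product.

    from collections import deque

    class EventuallyConsistentStore:
        """BASE-style sketch: writes are acknowledged after reaching one replica,
        and replication to the others happens later (soft state), so a read may
        return stale data until the replicas converge."""

        def __init__(self, num_replicas):
            self.replicas = [dict() for _ in range(num_replicas)]
            self.pending = deque()                  # replication backlog

        def write(self, key, value):
            self.replicas[0][key] = value           # acknowledge immediately
            self.pending.append((key, value))       # replicate asynchronously later

        def read(self, key, replica):
            return self.replicas[replica].get(key)  # may be stale

        def replicate_step(self):
            # one round of background replication; run often enough,
            # every replica eventually holds the same value
            while self.pending:
                key, value = self.pending.popleft()
                for r in self.replicas[1:]:
                    r[key] = value

    store = EventuallyConsistentStore(num_replicas=3)
    store.write("cart:alice", ["book"])
    print(store.read("cart:alice", replica=2))   # None  (not yet consistent)
    store.replicate_step()
    print(store.read("cart:alice", replica=2))   # ['book'] (eventually consistent)

The store stays available for reads and writes throughout, at the price of the temporary inconsistency that BASE explicitly tolerates.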
3.3. Distributed Databases

Google created BigTable [7] and Amazon created Dynamo [8] to solve their own scalability, performance, and availability issues. Numerous open source projects have since created distributed databases based on these two technologies, known under different names such as key-value store, document store, extended record store, and so on. While explaining all of these projects and their differences is beyond the scope of this paper, we will explore the big two, BigTable and Dynamo, to see how they achieve the BASE properties.

3.3.1. BigTable: Google's BigTable is a multi-dimensional map built on top of Google's proprietary Google File System, GFS [9]. In BigTable, data is arranged in rows and columns that can be accessed by giving row and column positions. To manage a huge table, it is split across row boundaries into units called Tablets. By default, GFS replicates the data onto three servers.

[Figure 2. BigTable architecture: a BigTable client performs meta operations against the BigTable Master, and read/write operations directly against the Primary and Secondary Tablet Servers.]

BigTable has a Master server that manages the Tablet Servers and performs other meta operations. For a read operation, the BigTable client sends a request to the BigTable Master to get the locations of the primary and secondary Tablet Servers. Once the client has those locations, it directly contacts the closest Tablet Server to read the data. For write operations the client again contacts the Master server to get the locations of the Tablet Servers, and then sends the data directly to the primary Tablet Server. The primary Tablet Server appends the new data to its existing data and keeps it in memory. It then notifies the two secondary servers to perform the same merging operation. If everything succeeds, the primary Tablet Server notifies the client of the success of the write; if any of the merging operations fails, the operation is repeated until the data is consistent across all the servers. Based on the above, it is clear that BigTable supports the C and P properties of the CAP theorem: when a write operation is executed on BigTable, it is guaranteed that all the data is in a consistent state for subsequent read operations. If any of the merging mechanisms fails, the write operation fails; by giving importance to Consistency and Partition tolerance, BigTable gives up some availability. Though it is rare that all the Tablet servers will fail, it is still possible.
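A greatly simplified sketch of this read/write flow is given below; the class names, the single-tablet assumption, and the synchronous replication loop are our own simplifications for illustration and are not BigTable's actual API.

    class TabletServer:
        def __init__(self, name):
            self.name, self.rows = name, {}
        def write(self, row, value):
            self.rows[row] = value        # merged data kept in memory
        def read(self, row):
            return self.rows.get(row)

    class Master:
        """Only hands out tablet locations; data never flows through it."""
        def __init__(self, primary, secondaries):
            self.primary, self.secondaries = primary, secondaries
        def locate(self, row):
            return self.primary, self.secondaries   # single tablet, for simplicity

    class BigTableLikeClient:
        def __init__(self, master):
            self.master = master

        def read(self, row):
            primary, _ = self.master.locate(row)
            return primary.read(row)      # in the real system, the closest server

        def write(self, row, value):
            primary, secondaries = self.master.locate(row)
            primary.write(row, value)
            for s in secondaries:         # a failed merge would be retried, which
                s.write(row, value)       #   favors consistency over availability
            return "ok"

    master = Master(TabletServer("t0"), [TabletServer("t1"), TabletServer("t2")])
    client = BigTableLikeClient(master)
    client.write("com.example/page1", "contents")
    print(client.read("com.example/page1"))   # 'contents'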
3.3.2. Dynamo: Dynamo is Amazon's distributed key-value database, used by several Amazon applications. In CAP terms, Dynamo gives importance to A and P: any time a client puts a value with an associated key, Dynamo attaches a version to the key-value pair and replicates it to other nodes within the same cluster. The number of nodes to which the data is replicated is a configuration parameter. In Figure 3, Client1 executes a put command with a key-value pair against Node6; if the replication configuration parameter is set to three, Node6 replicates the key-value pair, along with its version, to Node1, Node2, and Node3. Any time a node receives a versioned key-value pair, it first checks whether it already has the key; if it does, it checks whether the incoming value is newer than the one it holds and, if so, replaces the older version with the newer one. This automatic reconciliation is called syntactic reconciliation. If more than one client reads a key-value pair concurrently and writes it back, Dynamo accepts both writes and stores all the updates from the clients. In this case Dynamo holds more than one version of the same key-value pair and cannot reconcile them automatically; when a read comes for that key, Dynamo returns all the versions and makes the client do the reconciliation manually, which is called semantic reconciliation.

Dynamo's architecture is based on an eventual consistency model, and replication is performed asynchronously: any time a client writes a key-value pair, the write operation returns to the client even before the data has been replicated to all the nodes. Amazon's business requires high scalability for both read and write operations, since they receive millions of concurrent requests to view their catalog and to write the contents of shopping carts. In terms of the CAP properties, Dynamo sacrifices C and retains A and P.

[Figure 3. Dynamo architecture: Client1 issues Put(key, value) to Node6, which replicates the versioned key-value pair to nodes N1, N2, and N3; a Get(key) from Client2 returns all the unresolved versions of the objects associated with the key.]
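The version handling described above can be sketched as follows. For brevity, the sketch uses a single integer version counter, whereas Dynamo uses vector clocks, under which some versions are incomparable and must all be returned to the client for semantic reconciliation; the class and node names are illustrative.

    class DynamoLikeNode:
        def __init__(self):
            self.store = {}                                 # key -> {version: value}

        def put(self, key, version, value):
            versions = self.store.setdefault(key, {})
            for v in [v for v in versions if v < version]:
                del versions[v]                             # syntactic reconciliation:
            versions[version] = value                       #   the newer version wins

        def get(self, key):
            return sorted(self.store.get(key, {}).items())  # every surviving version

    class DynamoLikeRing:
        """Coordinator sketch: a put is copied to N successive nodes; the real
        system acknowledges the client before replication completes."""
        def __init__(self, nodes, replicas=3):
            self.nodes, self.replicas = nodes, replicas

        def put(self, key, version, value):
            start = hash(key) % len(self.nodes)
            for i in range(self.replicas):
                self.nodes[(start + i) % len(self.nodes)].put(key, version, value)

        def get(self, key):
            return self.nodes[hash(key) % len(self.nodes)].get(key)

    ring = DynamoLikeRing([DynamoLikeNode() for _ in range(6)])
    ring.put("cart:alice", 1, ["book"])
    ring.put("cart:alice", 2, ["book", "dvd"])   # newer version replaces the older one
    print(ring.get("cart:alice"))                # [(2, ['book', 'dvd'])]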
4. System of Systems Approach

Based on BigTable and Dynamo, many open source projects are being actively developed, and some of them are already used in commercial web applications. While it is true that the CAP theorem states that it is not possible to achieve all three of C, A, and P simultaneously, a system of systems approach in which the architecture is dynamically reconfigurable can support all three properties adequately. For example, by adjusting the number of nodes to which the data is replicated, we can have a distributed DB that behaves like an RDBMS with strong consistency. Based on the applicative needs, the distributed database can be moved from a strong to a weak consistency model and vice versa.

4.1. Software Processing Unit Architecture

A traditional multi-layered architecture has long been used to build large-scale, complex software systems. In a multi-layered architecture, each lower layer provides services to the layer above it. Any time a software object moves between layers it has to go through serialization and de-serialization and may have to traverse different networks, which affects both performance and scalability. A multi-layer architecture works well where the load on the system is predictable, and it provides vertical scalability. But the complex nature of web applications requires massive scalability at minimal cost; buying bigger hardware is expensive, and the upper limit of vertical scalability is reached quickly.

Building on concepts similar to those proposed by space-based [10] and shared-nothing [11] architectures, and adopting a loosely-coupled systems approach, we propose a new architectural paradigm for web application design in which the application is built as stateless software services, each deployed on its own server with all other software components contained in a single module called a Software Processing Unit (SPU). This SPU architecture, depicted in Figure 4, operates on top of a distributed database and provides either ACID-style or BASE-style guarantees based on the needs of the SPUs. Each SPU is a self-contained unit that has its own cached data, messaging, rule engine, UI, and other needed software components. The only time an SPU communicates with an external resource is when it needs to read or write data in the distributed database. An SPU Request Router routes incoming requests to the appropriate SPU for processing; once the connection is established between the client and an SPU, all further communication happens between them directly.

[Figure 4. Proposed SPU architecture: an SPU Request Router in front of multiple SPUs, each containing UI, software services, cached data, messaging, and a rule engine, all running on top of a Distributed DB with dynamically configurable weak and strong consistency.]
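The paper does not prescribe an implementation of this request flow, so the following minimal Python sketch is purely illustrative; the SPU, SPURequestRouter, and DistributedDB classes and the hash-based routing policy are our own assumptions.

    class DistributedDB:
        def __init__(self, data):
            self.data = data
        def read(self, key):
            return self.data.get(key)

    class SPU:
        """Self-contained unit: cache, services, and rules are co-located, so a
        request is handled with no remote calls except distributed-DB access."""
        def __init__(self, name, distributed_db):
            self.name = name
            self.cache = {}              # SPU-local cached data
            self.db = distributed_db     # the only external dependency

        def handle(self, request):
            key = request["key"]
            if key not in self.cache:    # everything else is a local call or message
                self.cache[key] = self.db.read(key)
            return f"{self.name} served {key} -> {self.cache[key]}"

    class SPURequestRouter:
        """Routes each incoming request to an SPU; afterwards the client and the
        SPU are assumed to communicate directly."""
        def __init__(self, spus):
            self.spus = spus
        def route(self, request):
            spu = self.spus[hash(request["key"]) % len(self.spus)]
            return spu.handle(request)

    db = DistributedDB({"catalog:42": "NoSQL in action"})
    router = SPURequestRouter([SPU("spu-1", db), SPU("spu-2", db)])
    print(router.route({"key": "catalog:42"}))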
4.2. Designing Software for the SPU Architecture

The SPU architecture collapses the traditional multi-layer design into a single module, the SPU. This does not mean that the UI, business services, and data access logic must be written as one monolithic piece of code; individual components must still be designed with high cohesion and low coupling. The only requirement is that, when deployed, all the components of an SPU must be collocated on a single physical server. Communication between components within an SPU happens through messaging or local method invocation, depending on the asynchronous or synchronous nature of the call. Since there are no remote calls and no network delays between the components inside an SPU, this design scales rapidly.

The only time an SPU communicates with an external system is when it needs to persist its data in a permanent store. For this purpose all SPUs run on top of a distributed DB. Based on the needs of an individual SPU, the distributed DB can provide either weak or strong consistency: all read operations from an SPU to the distributed DB happen without delay, and write operations can be performed with either weak or strong consistency, again according to the needs of the individual SPU.

4.3. Cloud-Ready SPU Architecture

The loosely coupled nature of the SPU architecture allows it to be deployed on a cloud computing platform. The cloud supports SPU provisioning and virtualization based on the external load: it monitors the load on each SPU and either increases or decreases the number of instances of that SPU. The cloud can also monitor the load and switch the distributed DB from weak consistency to strong consistency and vice versa.
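One possible, and purely hypothetical, form of such a cloud-side control loop is sketched below; the threshold values and the ceiling-division scaling rule are illustrative assumptions rather than part of the proposed architecture.

    def autoscale(spu_instances, load_per_instance, target_load=100,
                  min_instances=1, max_instances=50):
        """Scale the number of SPU instances with the observed load."""
        total_load = load_per_instance * spu_instances
        desired = -(-total_load // target_load)          # ceiling division
        return max(min_instances, min(max_instances, desired))

    def choose_consistency(write_rate, threshold=1000):
        """Under heavy write load, switch the distributed DB to weak (eventual)
        consistency; otherwise keep strong consistency."""
        return "weak" if write_rate > threshold else "strong"

    print(autoscale(spu_instances=4, load_per_instance=180))   # 8 instances
    print(choose_consistency(write_rate=2500))                 # 'weak'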
4.4. Limitations of the SPU Architecture

The SPU architecture provides massive scalability for read-intensive applications. For write-intensive applications it provides good scalability compared with a traditional multi-tier architecture running on relational databases, but it is limited by the communication between the SPUs and the distributed DB. Write-intensive applications also need to communicate state changes between SPU instances so that the cached data stays in sync across all SPU instances.

5. Conclusion

In this paper, we have summarized and outlined some of the big challenges in addressing the performance, scalability, and availability of applications with large and complex backend database systems. We have presented the limitations imposed by the CAP theorem and the approaches that can be used to address such limitations, and we have articulated the use of a loosely-coupled system of systems approach to scale the software architecture further. Our future work will focus on studies that validate the applicability of the SPU architecture approach, together with an open source distributed database, on industrial-strength software systems, and on the issues that arise therein.

6. Acknowledgments

This work is based, in part, upon research supported by the National Science Foundation (under Grant Nos. CNS-0619069, EPS-0701890, OISE-0729792, CNS-0855248, and EPS-0918970). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding agencies.

References
1. ---, "The Digital Universe Is Still Growing", EMC/IDC study, 2009. http://www.emc.com/digital_universe
2. J. Dean, S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", Communications of the ACM, 51(1), January 2008, pp. 107-113.
3. M. Tyler, J. Ledford, "Google Analytics", JWS, 2006.
4. http://www.google.com/analytics/
5. D. Pritchett, "BASE: An ACID Alternative", ACM Queue, July 28, 2008.
6. I. Filip, I. C. Vasar, R. Robu, "Considerations about an Oracle Database Multi-Master Replication", 5th International Symposium on Applied Computational Intelligence and Informatics, May 2009, pp. 147-152.
7. F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, R. E. Gruber, "Bigtable: A Distributed Storage System for Structured Data", 7th Symposium on Operating System Design and Implementation (OSDI '06), pp. 205-218.
8. G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, S. Sivasubramanian, P. Vosshall, W. Vogels, "Dynamo: Amazon's Highly Available Key-Value Store", ACM SIGOPS Operating Systems Review, 41(6), December 2007, pp. 205-220.
9. S. Ghemawat, H. Gobioff, S.-T. Leung, "The Google File System", ACM SIGOPS Operating Systems Review, 37(5), December 2003, pp. 29-43.
10. C. Dumont, F. Mourlin, "Space Based Architecture for Numerical Solving", Intl. Conf. on Computational Intelligence for Modelling, Control & Automation, 2008, pp. 309-314.
11. J. Lifflancer, A. McDonald, O. Pilskalns, "Clustering Versus Shared Nothing: A Case Study", 33rd Annual IEEE International Computer Software and Applications Conference (COMPSAC '09), 2009, pp. 116-121.

Accepted for publication in the 5th IEEE International Conference on System of Systems Engineering