Best Practices For Enterprise System Architecture Part 1

An Enterprise System should be Highly Scalable,
Always Available, Easily Manageable, a Fast
Performer, Auto-Healing and Capable of Super-Fast
Searching and Intelligent Analysis through
Machine Learning.

Scalability means the system does not degrade even when the load increases many times over.
For example, when the number of users of an e-commerce system increases heavily, leading to a
sudden increase in transactions per user, the system should keep performing within the same SLA
and handle the traffic smoothly, ensuring that the business is never down.

One design choice for a large-scale system is to scale out horizontally. Create one DB per
entity (User / Item / Product) and store only key-value pairs in that table. Replicate the DB across
various machines and implement a data-sharding scheme.
For example, in the Modulo(Entity_Pky, 100) approach, user Bob with id 1000 is located on
machine #0 (1000 mod 100 = 0). So if the number of users increases suddenly, a new User DB is
created on a new machine and the new user info is inserted there automatically.
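
A minimal sketch of this modulo-based routing in Java; the class and method names are hypothetical, not from the source.

import java.util.List;

// Routes an entity primary key to one of N shard hosts using Modulo(Entity_Pky, N).
public class ShardRouter {
    private final List<String> shardHosts;   // e.g. 100 machines, each hosting one User DB

    public ShardRouter(List<String> shardHosts) {
        this.shardHosts = shardHosts;
    }

    public String shardFor(long entityPk) {
        int index = (int) (entityPk % shardHosts.size());   // 1000 % 100 = 0 -> machine #0
        return shardHosts.get(index);
    }
}

Note that with plain modulo sharding, changing the shard count remaps most keys, which is one reason the lookup-based approach mentioned below is sometimes preferred when shards are added frequently.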

There are other data-sharding policies, such as range-based partitioning, the lookup-based approach,
and the Read-Most-Write-Least model.

SimpleDB can be used effectively to manage lookup-oriented entity info.
It is an out-of-the-box offering from Amazon AWS, hosted, managed and administered by Amazon
itself. SimpleDB does the heavy lifting of multi-valued attribute fetches for a key, batch operations
and consistent reads.

It is better to delegate the traditional 'data management operations' such as relations, transactions,
locks, sequences and constraints to the application layer, as SimpleDB is meant for handling simple
things! A typical example is eBay's DAL layer.
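
A sketch of what 'delegating to the application layer' can look like. The ItemStore interface below is a hypothetical stand-in for a SimpleDB-style key-value client, not the actual AWS SDK and not eBay's DAL; the point is only that referential checks live in application code because the store itself has no constraints.

import java.util.Map;

// Hypothetical key-value item store, standing in for a SimpleDB-style domain of (itemKey -> attributes).
interface ItemStore {
    Map<String, String> get(String domain, String itemKey);
    void put(String domain, String itemKey, Map<String, String> attributes);
}

// Application-layer DAL: the 'foreign key' rule is enforced here, not by the store.
class OrderDal {
    private final ItemStore store;

    OrderDal(ItemStore store) {
        this.store = store;
    }

    void placeOrder(String orderId, String userId, Map<String, String> orderAttributes) {
        if (store.get("User", userId) == null) {
            throw new IllegalStateException("Unknown user: " + userId);
        }
        store.put("Order", orderId, orderAttributes);
    }
}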

Actually the system architect should choose the entities that are meant for mere lookup (like
User, Item, Manufacturer, Order) and mark them as the best candidates for SimpleDB items. Those
entities are then guaranteed to be highly available. Of course this pattern is not suitable for
financial transactions in the banking domain, or for rule-based complex events where every fetch
query triggers sub-query-based rules or procedures.

Discussing further details on implementing SimpleDB is out-of-scope here.

If we want to handle complex business transactions while updating entity info, and to manage
intricate relationships, then we should turn to Cassandra or MongoDB.

But we should remember that one needs to take care of scalability in the application layer while
adopting NoSQL DBs.
If the data are hierarchical in nature, we can leverage a graph database (such as Neo4j), which, as
opposed to a conventional SQL DB, is a de-normalized graph processor.

Availability : Load-balancing and clustering are standard practices for making applications
available. Normally the DNS resolver routes to a pool of servers for a requested application and the
load-balancer picks one server.

Machines should be load-balanced in such a way that moving a user from one machine to another
can be achieved easily without shutting down the system.

Fast Searching :

NoSQL is a must for fast lookup across trillions of business entities, and for persisting time-critical
entities while locking a data storage row only for a negligible amount of time (contrary to a
traditional RDBMS), yet still being able to write through / broadcast write events to sub-system
grids (Search Grid / Analytics Machine / Events Collector etc.).
Cassandra does a very neat job in that regard.

"The sparse two-dimensional “super-column family” architecture allows for rich data model
representations (and better performance) beyond just a simple key-value look up.....
Some of the most attractive features of Cassandra are its uniquely flexible consistency and
replication models. Applications can determine at call level what consistency level to use for reads
and writes (single, quorum or all replicas). This, combined with customizable replication factor, and
special support to determine which cluster nodes to designate as replicas, makes it particularly well
suited for cross-datacenter and cross-regional deployments. In effect, a single global Cassandra
cluster can simultaneously service applications and asynchronously replicate data across multiple
geographic locations...."

It is worth paying for the learning curve and operational complexities in exchange for the 'scalability,
availability and performance advantages of the NoSQL persistence model'.
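
To make the per-call consistency choice concrete, here is a minimal sketch using the DataStax Java driver (3.x-style API). The contact point, keyspace, table and column names are placeholders, not from the source.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

// The application decides, statement by statement, how many replicas must answer.
public class CassandraConsistencyDemo {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("shop");   // "shop" keyspace is a placeholder

        // Read path: fast, possibly slightly stale, answered by a single replica.
        SimpleStatement read =
                new SimpleStatement("SELECT price FROM product WHERE id = ?", 42);
        read.setConsistencyLevel(ConsistencyLevel.ONE);
        session.execute(read);

        // Write path: acknowledged only after a quorum of replicas has accepted it.
        SimpleStatement write =
                new SimpleStatement("UPDATE product SET price = ? WHERE id = ?", 1999, 42);
        write.setConsistencyLevel(ConsistencyLevel.QUORUM);
        session.execute(write);

        cluster.close();
    }
}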

In a traditional DB, what happens if the node containing one of the tables in a join condition goes
down?
Simple: the whole application that depends on that join condition is now unavailable!
Well, there are no join conditions in NoSQL. Cassandra is best for cross-regional deployments and
for scaling with no single point of failure.

Next question: does NoSQL guarantee data consistency the same way a relational DB vouches for
read-after-write consistency, at the cost of blocking the read until the transaction is finished?

Well, NoSQL follows the trade-offs of the CAP theorem rather than the ACID principle!
So if 'immediate consistency' needs to be ensured for super-critical tasks like bidding or buying,
better wait until the data is committed before reading it back. In cases where 'eventual consistency'
is acceptable, such as searching, where we expect a fast response, we may not wait for the latest
data to become available but instead instantly display the pre-computed search result.

Also, there are far fewer concurrency bottlenecks. Replicas are mostly multi-master and highly
available. NoSQL relies on asynchronous reconciliation of replicas, as opposed to the synchronous,
lock-based reconciliation of SQL databases.

This means more work, but it saves time. Using message multicast, do a write-behind to the replicated
database / grid without touching the master DB, then synchronize with the master DB after a specific
time period. This saves a lot of time otherwise spent on index maintenance and sequential updates.
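
One way to sketch this write-behind / periodic-sync idea in plain Java; the queue type, flush interval and stub methods are assumptions, and a real system would multicast over a message bus rather than call local stubs.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Writes are buffered and multicast immediately; the master DB is synced on a timer.
public class WriteBehindBuffer {
    private final LinkedBlockingQueue<String> pendingWrites = new LinkedBlockingQueue<>();
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public void start() {
        scheduler.scheduleAtFixedRate(this::syncWithMaster, 5, 5, TimeUnit.SECONDS); // interval is arbitrary
    }

    public void write(String event) {
        pendingWrites.offer(event);      // the caller returns immediately
        multicastToReplicas(event);      // e.g. notify the search grid / analytics collector
    }

    private void syncWithMaster() {
        List<String> batch = new ArrayList<>();
        pendingWrites.drainTo(batch);
        if (!batch.isEmpty()) {
            applyBatchToMasterDb(batch); // one bulk update instead of many row-level writes
        }
    }

    private void multicastToReplicas(String event) { /* message-multicast stub */ }

    private void applyBatchToMasterDb(List<String> batch) { /* JDBC batch-update stub */ }
}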

Say you want to search for a 'Blue Sony mp3 player'. NoSQL will hold the lock on a row for only a
couple of milliseconds, as opposed to an RDBMS locking the entire table. NoSQL will simply look up
the id of the next entity (manufacturer) against the id of the main entity (product) and move the
message on to the next table, while at the same time looking up the product requested by the next
user. No joins, just intelligent routing! Handling millions of requests concurrently has never looked
so easy before.

Which is for What ?

Memcached : Static Key-Value Pairs
Neo4j : Network Graph Store
HBase : Sparse, Column-Family-Oriented Store
MongoDB : (Key, Document) Storage.

Great Document : http://highscalability.com/blog/2011/2/15/wordnik-10-million-api-requests-a-day-on-mongodb-and-scala.html

Fast Data Analysis

Apache HBase is best suited for Data Analytics, Data Mining, Enterprise Search and other Data
Query Services.

If the main business is to mine petabytes of data, perform parallel range-queries and then
combine the results through batch Map-Reduce (say, Enterprise Search for Video Content), then one
of the de facto choices is Apache HBase configured with Hadoop jobs, using HDFS as a shared
storage platform. HBase comes with an availability trade-off, i.e. the persisted domain entities may
not be immediately available in the search result. The reason for this is that the huge Hadoop
Map-Reduce jobs are performed in parallel in offline mode, so that the main source of data is not
hogged by the Hadoop jobs and the system stays highly scalable.

It should be noted that Hadoop is not meant for searching in real time. It is actually an offline
batch-processing system well suited for BI analytics, data aggregation and normalization in parallel.
Hadoop provides a framework that automatically takes care of inter-process coordination,
distributed counters, failure detection, automatic restart of failed processes, coordinated sorting and
much more.

There are different stages like mapper, combiner, partitioner and reducer; instead of writing these
from scratch, the framework takes care of them.
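
As a concrete (if generic) illustration of the mapper and reducer stages, here is a minimal word-count-style job against the standard org.apache.hadoop.mapreduce API; the class names and the counting task itself are illustrative, not from the source.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// The framework supplies input splitting, shuffling, sorting, partitioning,
// retries and counters around these two callbacks.
public class TokenCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        for (String token : line.toString().split("\\s+")) {
            context.write(new Text(token), ONE);
        }
    }
}

class TokenCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : counts) {
            sum += count.get();
        }
        context.write(key, new IntWritable(sum));
    }
}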

Erlang / Scala also provide out-of-the-box parallel processing.

It is important to know that there are different tools based on Hadoop that serve different purposes :

Hive : SQL queries for MR
Pig : Scripting for MR
HBase : Low-latency, BigTable-like database on Hadoop
Oozie : Workflow management on Hadoop
Sqoop : RDBMS import / export
ZooKeeper : Fast, distributed coordination system
Hue : Advanced web environment for Hadoop and custom applications

Here comes the ultimate offering from Spring - http://www.springsource.org/spring-data

Apache also has a solution for combining Lucene with Hadoop for blazing-fast document search.

Fast Cache :

Implementing a cold cache with a minimal memory footprint (MySQL native memory, Memcached,
Terracotta Ehcache) is absolutely important.

The application server should not remember the state of an entity; it should just persist it in the DB.
The metadata should be stored in the cold cache. The persistent POJO objects should never be
cached in memory.

If queries are mostly read-only with very few updates, MySQL's native memory scores high!
The MySQL InnoDB storage engine's dictionary cache provides blazing fast data access. On demand,
data blocks are read from disk and stored in memory using LRU chains or other advanced
mechanisms (like MySQL's Midpoint Insertion Strategy).

In order to look up the value for a key, Memcached clients issue multiple parallel requests to the
memcached servers across which the keys are sharded.
For a frequently changing data set, Memcached is surely better than a DB cache.
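
A small sketch of this fan-out lookup, assuming the spymemcached client and placeholder host names; keys are hashed across the server pool, and getBulk() fetches a whole set of keys in parallel from the nodes that own them.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.Map;
import net.spy.memcached.MemcachedClient;

// Multi-get against a sharded memcached pool (host names are placeholders).
public class MemcachedLookup {
    public static void main(String[] args) throws IOException {
        MemcachedClient client = new MemcachedClient(
                new InetSocketAddress("cache1.example.com", 11211),
                new InetSocketAddress("cache2.example.com", 11211));

        client.set("user:1000", 3600, "{\"name\":\"Bob\"}");   // value cached with a 1-hour TTL
        Map<String, Object> hits =
                client.getBulk(Arrays.asList("user:1000", "user:1001", "item:42"));
        System.out.println(hits.keySet());

        client.shutdown();
    }
}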

Probably the fastest cache on the planet is Ehcache using Terracotta BigMemory:
http://blog.terracottatech.com/2010/09/bigmemory_explained_a_bit.html

Here native byte buffers are used directly, bypassing the Java heap (and hence garbage-collection pauses).
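
The underlying JDK mechanism is a direct ByteBuffer, which allocates native, off-heap memory; a tiny illustration (the buffer size is arbitrary):

import java.nio.ByteBuffer;

// A direct ByteBuffer lives outside the Java heap, so the GC never scans its contents.
public class OffHeapDemo {
    public static void main(String[] args) {
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64 * 1024 * 1024); // 64 MB of native memory
        offHeap.putLong(0, 42L);                  // write a value at byte offset 0
        System.out.println(offHeap.getLong(0));   // read it back
    }
}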

A write-optimized data store is one that aggregates writes in RAM and writes out generational
updates. Take a look at Google's BigTable paper for a good description of this strategy.

A very good reference : http://research.yahoo.com/files/sigmod278-silberstein.pdf



Everything Asynchronous

All communication in every layer should be asynchronous to reduce latency.

There are 2 types of latency:
1. User latency - how fast the user gets back control of the web site.
2. Execution latency - how fast the execution takes place in the backend.

UI behavior should be completely Ajax-driven: send the main events to a queue and schedule batch
updates in offline mode. There is absolutely no room for a wait state, i.e. all responses should be
immediate and non-blocking.

Common Flow :

-- Submit a job/task to a thread and return to the user immediately.
-- The thread performs the operation in the background (it may talk to an LRU cache optimized for
concurrent access, a graph structure, a primary/secondary master-slave replicated environment, or
Map-Reduce based structures).
-- Once done with the operation, it returns the result through a CallbackHandler and the client gets a
notification (see the sketch after this flow).
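
A minimal sketch of this flow with an ExecutorService and CompletableFuture; class names, pool size and the processing stub are illustrative.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Submit returns immediately; the callback fires once the background work completes.
public class AsyncJobService {
    private final ExecutorService workers = Executors.newFixedThreadPool(8);

    public CompletableFuture<String> submit(String jobPayload) {
        return CompletableFuture.supplyAsync(() -> process(jobPayload), workers);
    }

    private String process(String payload) {
        // talks to the cache / graph store / replicated DB in the background
        return "result-for-" + payload;
    }
}

// Caller side, non-blocking:
//   service.submit("job-42").thenAccept(result -> notifyClient(result));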

There are multiple patterns for Asynchronous Communications.

(1) Store all jobs in an event queue. Select a queue based on a contract/algorithm depending on the
type of task, then use multiple event brokers/listeners to handle the jobs from the queue (a
third-party ESB like TIBCO / Apache Fuse / Apache Camel can be used). A bare-bones sketch of this
pattern follows after this list.
(2) Message multicast - say a user enters a new item into the system. Do not update the primary DB
immediately; rather, send messages to the pollers/searchers that there is a 'New' event and return to
the user immediately. The updater thread then updates the primary DB, and the searcher, behind the
scenes, queries the primary DB / data source to find what has just been added and adds it to its
search grid.
(3) Batch processing (scheduled offline processing). Identify which types of job requests can be
scheduled for offline processing and do not need immediate attention.
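
The in-process sketch below illustrates pattern (1): jobs are routed to a queue chosen by task type, and a small pool of listener threads drains each queue. Task-type names and pool sizes are made up, and a real deployment would use an ESB or message broker instead of these local queues.

import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// One queue per task type, drained by listener threads.
public class EventDispatcher {
    private final Map<String, BlockingQueue<Runnable>> queuesByType = Map.of(
            "order", new LinkedBlockingQueue<Runnable>(),
            "search-index", new LinkedBlockingQueue<Runnable>());
    private final ExecutorService listeners = Executors.newFixedThreadPool(4);

    public void start() {
        // one listener loop per queue; add more per queue if throughput demands it
        queuesByType.values().forEach(queue -> listeners.submit(() -> drain(queue)));
    }

    public void publish(String taskType, Runnable job) {
        queuesByType.get(taskType).offer(job);   // the contract/algorithm picks the queue
    }

    private void drain(BlockingQueue<Runnable> queue) {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                queue.take().run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}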

Parallel Threading
Use an ExecutorService to effectively manage the pool of threads.
 Remember, a simple set of worker threads 'without a manager' can easily lead to
 (i) Resource thrashing (each thread is expensive - execution call stack, context switching)
 (ii) Request overload, if every request is handed a thread for execution
 (iii) Thread leakage: the size of the thread pool diminishes due to uncaught errors, but requests are
not served!
So there should be a proper manager that allocates threads from a FixedThreadPool /
ScheduledThreadPool / SingleThreadExecutor / a bounded-queue ThreadPoolExecutor, with a proper RejectionPolicy.

This manager should also place completed results in a non-blocking queue from which they can
be taken off asynchronously.
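
A sketch of such a managed pool: a bounded ThreadPoolExecutor with an explicit rejection policy, wrapped in an ExecutorCompletionService so finished results land on a queue that consumers drain asynchronously. Pool sizes, queue capacity and the chosen policy are arbitrary examples.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Bounded, managed thread pool with a completion queue for finished results.
public class ManagedPool {
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            8, 16,                                   // core and maximum pool size
            60, TimeUnit.SECONDS,                    // keep-alive for idle threads
            new ArrayBlockingQueue<>(1000),          // bounded work queue guards against overload
            new ThreadPoolExecutor.CallerRunsPolicy());  // one possible rejection policy

    private final CompletionService<String> completed = new ExecutorCompletionService<>(pool);

    public void submit(String payload) {
        completed.submit(() -> "processed-" + payload);
    }

    public String takeNextResult() throws InterruptedException, ExecutionException {
        return completed.take().get();               // results are handed out as tasks finish
    }
}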

Threads should not sit waiting to acquire locks - leverage latch-style constructs (which build on
advanced processor primitives) so that a whole set of threads can be released at once, and so that
locks can be acquired / released in any order.

Fork/Join and CountDownLatch are powerful constructs for running threads in parallel.
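
For instance, a CountDownLatch lets a coordinator wait for a batch of parallel workers without joining each thread individually; the chunk count and the per-chunk work below are placeholders.

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Fan work out to worker threads and proceed only after every chunk has been processed.
public class ParallelAggregation {
    public static void main(String[] args) throws InterruptedException {
        int chunks = 4;
        CountDownLatch done = new CountDownLatch(chunks);
        ExecutorService workers = Executors.newFixedThreadPool(chunks);

        for (int i = 0; i < chunks; i++) {
            final int chunk = i;
            workers.submit(() -> {
                processChunk(chunk);   // hypothetical per-chunk work
                done.countDown();
            });
        }

        done.await();                  // released once all workers have counted down
        workers.shutdown();
    }

    private static void processChunk(int chunk) { /* ... */ }
}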

You can also download and try out the open-source Apache Thrift, a cross-language RPC framework
(with a C++ core) that supports multi-threaded, asynchronous processing of jobs.


Auto-Healing and Self-Recovery :

Systems should be fault-tolerant and should know how to degrade gracefully under unprecedented
load.

JMX agents and other robot apps should continuously monitor the system and gather the
intelligence needed to take the best decision.

A high degree of automation is required for recovering easily from failures and managing the
system smoothly.
Normally automation is driven through a centralized logging system and self-organizing artificial
intelligence. BI tools are used to analyze the user experience and provide the best match through
continuous in-house experimentation.

Scalable systems are manageable if new partitions can be added, DB instances can be horizontally
scaled out, and new application servers can be turned on without affecting users of the system.

Case Study :

Twitter :
1. http://www.slideshare.net/kevinweil/nosql-at-twitter-nosql-eu-2010

2. http://www.slideshare.net/kevinweil/hadoop-at-twitter-hadoop-summit-2010

    Twitter Example :
    User SAM tweets -
    > store the tweet > iterate the social graph
    > split chunks into parallel jobs > prepend the packet to its memcached blob (or some other cache) - if not present in
    the cache, talk to the DB / Hadoop

    User RAM, who follows SAM, sees Sam's tweet -
    > read the MySQL blob from memcache (or another cache) > deserialize integers > sort > slice > probabilistic
    truncation (fast, but may not be fully consistent).

3. SalesForce : http://www.infoq.com/interviews/dave-carroll-salesforce-api

4. Facebook : http://www.infoq.com/presentations/Scale-at-Facebook

    Facebook Example :

    Alex, a friend of Bob, logs in
    > the web tier calls a C++ based service (Thrift)
    > Thrift has Alex's user-id and calls the Aggregator to find all of Alex's friend ids
    > the Aggregator calls the multi-feed leaves in parallel (each leaf node is one user, for which there is one DB; each
    DB holds a key-value pair (uid, user-data) - no traditional SQL, much like a NoSQL graph DB)
    > the feed result says [Bob, Sam, Paul] - these are all Alex's friends - and returns those ids (multi-feed) by calling
    all DB servers in parallel and finding the indexes
    > the Aggregator now has the ids of the 40 most interesting stories and ranks them based on certain criteria
    > for each id, the metadata (timestamp, user name, comment, ...) is fetched from memcached (in your case it could
    be any other cache) - a parallel query on multi-core Fedora
    > now the web tier renders the data.

5. Google : http://highscalability.com/google-architecture

6. Flickr : http://highscalability.com/flickr-architecture
