Webinar presented on Jan 26th 2011 by Juan De Abreu.
Learn how to achieve:
• Scalability: linear scale, scale up vs. scale out, choosing VM sizes
• Caching: approaches to caching, cache storage
• Elasticity: scale out, scale back, and automation of scaling
Intended for: CIOs, CTOs, IT Managers, IT Developers, Lead Developers
3. Outline: Scalability (achieving linear scale; Scale Up vs. Scale Out in Windows Azure; choosing VM sizes); Caching (approaches to caching; cache storage); Elasticity (scale out, scale back; automation of scaling). #CSwebinar
4. A Primer on Scale Scalability is the ability to add capacity to a computing system to allow it to process more work #CSwebinar
5. A Primer On Scalability. Vertical (Scale Up): add more resources to a single computation unit, i.e. buy a bigger box, or move a workload to a computation unit with more resources, e.g. Windows Azure Storage moving a partition. Horizontal (Scale Out): adding additional computation units and having them act in concert; splitting workload across multiple computation units. #CSwebinar
6. Vertical vs. Horizontal. For small scenarios scale up is cheaper and code 'just works'. For larger scenarios scale out is the only solution: there are massive diseconomies of scale (1 x 64-way server costs far more than 64 x 1-way servers), and shared-resource contention becomes a problem. Scale out offers the promise of linear, infinite scale. #CSwebinar
7. Roughly Linear Scale: additional throughput achieved by each additional unit remains constant. Non-Linear Scale: additional throughput achieved by each additional unit decreases as more are added. (Chart: Throughput vs. Computation Units.)
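The linear vs. non-linear distinction on this slide can be sketched numerically. A minimal illustration with hypothetical numbers (not from the webinar): linear scale-out adds constant throughput per unit, while a system with a shared contended resource follows an Amdahl-style curve.

```python
# Hypothetical illustration of linear vs. non-linear scale-out.
def linear_throughput(units, per_unit=100):
    """Each added unit contributes the same throughput."""
    return units * per_unit

def contended_throughput(units, per_unit=100, serial_fraction=0.1):
    """Amdahl-style model: a shared (serial) fraction limits the gains."""
    speedup = 1 / (serial_fraction + (1 - serial_fraction) / units)
    return per_unit * speedup

for n in (1, 2, 8, 64):
    print(n, linear_throughput(n), round(contended_throughput(n)))
```

Even a 10% shared fraction caps the contended system near 10x the single-unit throughput no matter how many units are added, which is why the slides stress eliminating shared resources.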
8. Scalability != Performance. Often you will sacrifice raw speed for scalability. For example, ASP.NET session state: in-process vs. SQL Server session state. #CSwebinar
9. Achieving Linear Scale Out. Reduce or eliminate shared resources; minimize reliance on transactions or transactional-type behaviour; use homogeneous, stateless computation nodes. We can then use simple work distribution methods (load balancers, queue distribution) and have less reliance on expensive H/A hardware. #CSwebinar
10. Units of Scale. Create as many roles as you need 'knobs' to adjust scale; consolidation of roles provides more redundancy for the same cost. Example roles: Queue Driven Role, Web Driven Role, WCF Role, Web Site Role, Cache Build Role, Clean Up Role. With two instances of a web site role, loss of an instance results in a 50% capacity loss in the web site; with four consolidated instances, loss of an instance results in just a 25% capacity loss. #CSwebinar
11. VM Size in Windows Azure. Windows Azure supports various VM sizes; ~800 Mb/s NIC shared across the machine. Set in the Service Definition (*.csdef); all instances of a role will be equi-sized: <WorkerRole name="myRole" vmsize="ExtraLarge"> #CSwebinar
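For context, the `vmsize` attribute on the slide sits inside the role element of the service definition. A minimal sketch of a .csdef fragment (the service name, role name, and endpoint are placeholders):

```xml
<ServiceDefinition name="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <!-- vmsize applies to every instance of this role -->
  <WorkerRole name="myRole" vmsize="ExtraLarge">
    <Endpoints>
      <InternalEndpoint name="WorkItems" protocol="tcp" />
    </Endpoints>
  </WorkerRole>
</ServiceDefinition>
```

Because the size is set per role, mixing VM sizes means splitting work into separate roles, which ties into the "units of scale" discussion above.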
12. Remember: If it doesn’t run faster on multiple cores on your desktop … It’s not going to run faster on multiple cores in the cloud! #CSwebinar
13. Choosing Your VM Size. Don't just throw big VMs at every problem; scale-out architectures have natural parallelism, so test various configurations under load. Some scenarios will benefit from more cores, where the cost of moving data exceeds the parallel overhead: e.g. video processing, stateful services, a database server requiring full network bandwidth. #CSwebinar
15. Caching. Caching can improve both performance and scalability: moving data closer to the consumer (Web/Worker) improves performance, and it reduces load on the hard-to-scale data tier. Caching is the easiest way to add performance and scalability to your application in Windows Azure: caching will save you money! #CSwebinar
16. Caching Scenario: Website UI Images. Largely static data, included in every page. Goal: a better UI. Serve content once; avoid round trips unless content changes; minimise traffic over the wire; fewer storage transactions; lower load on web roles. #CSwebinar
17. Caching Scenario: RSS Feeds. A regular RSS feed: data delivered from database/storage; large content payload (>1 MB); data changes irregularly; cost determined by client voracity. Goal: a better RSS feed. Minimise traffic over the wire; fewer storage transactions; fewer hits on the database. #CSwebinar
19. Client Side Caching Client Web Roles WorkerRoles BLOBs Queues Tables SQL Azure #CSwebinar
20. Client Caching - ETags. ETag == soft caching. Header added on the HTTP response: ETag: "ABCDEFG". The client does a conditional HTTP GET with If-None-Match: "ABCDEFG"; the server returns content only if the ETag no longer matches. Implemented natively by Windows Azure Storage; supports client-side caching and is also used for optimistic concurrency control. #CSwebinar
21. Client Caching - ETags. Benefits: prevents the client downloading unnecessary data; out-of-the-box support for simple 'static content' scenarios. Problems: still requires a round trip to the server, and may require execution of server-side code to re-create the ETag before checking:
string etag = Request.Headers["If-None-Match"];
if (String.Compare(etag, GetLastBlogPostIDAzTable()) == 0)
{
    Response.StatusCode = 304; // Not Modified: the client's copy is current
    return;
}
#CSwebinar
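The server-side check above is language-neutral; a minimal Python sketch of the same If-None-Match logic (the handler shape and the way the current ETag is supplied are assumptions, not the webinar's code):

```python
def conditional_get(request_headers, current_etag, body):
    """Return (status, payload) for a GET with an optional If-None-Match.

    If the client's cached ETag still matches the current one, answer
    304 Not Modified with no body; otherwise send content + fresh ETag.
    """
    client_etag = request_headers.get("If-None-Match")
    if client_etag is not None and client_etag == current_etag:
        return 304, None                      # client copy is current
    return 200, {"ETag": current_etag, "body": body}

# Client already holds the current version: no payload travels the wire
status, payload = conditional_get(
    {"If-None-Match": '"ABCDEFG"'}, '"ABCDEFG"', "feed xml")
```

This is the "still requires a round trip" trade-off from the slide: the request happens either way, but the 304 path carries no body.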
22. Client Caching – Cache-Control. Cache-Control: max-age == hard caching. Header added on the HTTP response, e.g. Cache-Control: max-age=2592000: the client may cache the file without further requests for 30 days, and will not re-check on every request. Very useful for static files (header_logo.png). Also used to determine TTL on CDN edge nodes. Set this on a blob using x-ms-blob-cache-control. #CSwebinar
23. Client Caching – Cache-Control. Benefits: prevents unnecessary HTTP requests and unnecessary downloads. Problems: what if files do change within the 30 days? Windows Azure technique: put static files in blob storage and use Cache-Control + URL flipping. Simple randomization: simple, but no versioning. Container-level flipping: simple, but more expensive. Snapshot-level flipping: more complex, but lower cost.
<img src="http://*.blob.*/Container/header_logo.png?random=<rnd>" />
<img src="http://*.blob.*/Containerv1.0/header_logo.png" />
<img src="http://*.blob.*/Containerv2.0/header_logo.png" />
<img src="http://*.blob.*/Container/header_logo.png?snapshot=<DT1>" />
<img src="http://*.blob.*/Container/header_logo.png?snapshot=<DT2>" />
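The three URL-flipping techniques can be sketched as simple URL builders; a minimal Python sketch (the storage account, container, and blob names are placeholders):

```python
import random

BASE = "http://myaccount.blob.core.windows.net"  # placeholder account

def randomized_url(container, blob):
    """Cache-busting query string: simple, but no real versioning."""
    return f"{BASE}/{container}/{blob}?random={random.randint(0, 10**9)}"

def container_flip_url(container, version, blob):
    """Version the whole container: simple, but costs a full re-upload."""
    return f"{BASE}/{container}v{version}/{blob}"

def snapshot_url(container, blob, snapshot_dt):
    """Reference a specific blob snapshot: more complex, lower cost."""
    return f"{BASE}/{container}/{blob}?snapshot={snapshot_dt}"
```

Pages then emit whichever URL form matches the flipping strategy; flipping the version or snapshot in the page is what forces hard-cached clients to fetch the new bytes.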
25. Static Content Generation. Generate content periodically in a worker role: you can spin up workers just for generation, or generate as a triggered async operation. Content may be full pages, resources (CSS sprites, PDF/XPS, images, etc.), or content fragments. Push static content into blob storage and serve it direct out of blob storage; you may also be able to use persistent local storage. #CSwebinar
26. Static Content Generation. Benefits: reduced load on web roles, potentially reduced load on the data tier, improved response times; can combine with Cache-Control and ETags. Problems: need to deal with stale data (manage/refresh, or ignore). #CSwebinar
27. A Better RSS Feed? Build a standard RSS feed in a web role: generate content dynamically from storage, serialize as RSS using feed formatters, and place it on an obfuscated (hidden) URL. Build a worker role to poll the hidden RSS feed: retrieve the RSS content at certain intervals or on an event, and push the content into a blob if it changed. Serve RSS to users from blob storage, taking advantage of ETags: zero load on the database or RSS tables to serve content. #CSwebinar
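The steps above can be sketched as a worker-role loop. A minimal sketch with an in-memory stand-in for blob storage; the function names and the hash-based change check are assumptions for illustration, not the webinar's code:

```python
import hashlib

blob_store = {}  # stand-in for Windows Azure Blob storage

def fetch_hidden_feed():
    """Placeholder for polling the obfuscated RSS URL on the web role."""
    return "<rss>...items...</rss>"

def refresh_feed_blob(blob_name="public-feed.xml"):
    """Push the feed into blob storage only when its content changed."""
    content = fetch_hidden_feed()
    etag = hashlib.md5(content.encode()).hexdigest()
    stored = blob_store.get(blob_name)
    if stored and stored["etag"] == etag:
        return False                  # unchanged: no write, no URL churn
    blob_store[blob_name] = {"etag": etag, "body": content}
    return True
```

Run on a timer, this keeps the public blob current while the database sees only the worker's polls, never the readers' requests.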
28. BLOBs vs. Compute Instances. Blob storage: disk based; 15c/GB/month; 1c/10,000 requests. Compute instances: RAM and disk based; 12c/hr per ~1 GB RAM and ~250 GB disk. Dedicated compute cache roles must serve at least 120,000 cache requests per hour to be cheaper than Windows Azure storage. Outside the USA and Europe: use the CDN for caching due to much lower bandwidth costs. #CSwebinar
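The slide's break-even figure follows directly from the two quoted prices; a quick arithmetic check (2011 list prices as given on the slide):

```python
COMPUTE_PER_HOUR = 0.12          # $ per small-instance hour
BLOB_PER_10K_REQUESTS = 0.01     # $ per 10,000 storage transactions

def blob_cost(requests):
    """Storage-transaction cost for a given number of requests."""
    return requests / 10_000 * BLOB_PER_10K_REQUESTS

# Requests/hour at which blob storage costs as much as one
# dedicated compute cache instance:
breakeven = round(COMPUTE_PER_HOUR / BLOB_PER_10K_REQUESTS * 10_000)
print(breakeven)  # 120000, matching the slide
```

Below that rate, serving cached content from blob storage is cheaper than keeping a compute role up just to serve from RAM.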
30. Elastic Cloud Workflow Patterns. "On and Off": on-and-off workloads (e.g. batch jobs); over-provisioned capacity is wasted; time to market can be cumbersome. "Growing Fast": successful services need to grow/scale; keeping up with growth is a big IT challenge; cannot provision hardware fast enough. "Unpredictable Bursting": unexpected/unplanned peaks in demand; sudden spikes impact performance; can't over-provision for extreme cases. "Predictable Bursting": services with micro-seasonality trends; peaks due to periodic increased demand; IT complexity and wasted capacity. (Charts: compute usage vs. time, with average usage marked, for each pattern.) #CSwebinar
36. Head Room in Windows Azure. Web roles: run additional web roles to handle additional load before performance degrades. Worker roles: if possible just buffer into queues (driven by the tolerable level of latency); start additional roles only if queues are not clearing; use generic workers to pool resources. #CSwebinar
37. Head Room in Windows Azure Services. Windows Azure Storage: storage nodes serve many partitions; a partition is served by a single storage node; the fabric can move it to a different storage node; opaque to the Windows Azure customer. SQL Azure: a non-deterministic throttle gives little indication; running extra instances requires DB sharding. #CSwebinar
38. Adding Capacity in Windows Azure. Web roles/worker roles: enable more instances (API or *.config); editing the instance count in config leaves existing instances running; changing to larger VMs will require a redeploy. Windows Azure Storage: opaque to the user; partition aggressively; you can 'heat up' a partition to encourage scale up. #CSwebinar
39. Adding Capacity in SQL Azure. Add more databases (more partitions). Very difficult to achieve mid-stream: requires moving hot data and maintaining consistency across multiple DBs without DTC; will depend on the partitioning strategy. #CSwebinar
40. Rule Based Scaling. Use the Service Management and Diagnostics APIs. On/Off and Predictable Bursting: time-based rules. Unpredictable demand and Fast Growth: monitor metrics and react accordingly. Monitor inputs: historical data, transactions, perf counters, business KPIs. Evaluate business rules: is latency too high/low; how much $ has been spent; are we at a limit; what is the predicted load. Act: +/- instance count, deploy a new service, increase queues, send notifications. All via the Diagnostics & Management APIs. #CSwebinar
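The monitor / evaluate / act loop on this slide can be condensed into a single rule function; a minimal sketch where the metric names, thresholds, and instance limits are all hypothetical choices, not values from the webinar:

```python
def decide_scaling(metrics, min_instances=2, max_instances=20):
    """Map monitored metrics to a new target instance count.

    metrics: dict with 'instances', 'avg_latency_ms', 'queue_growth'
    (messages/sec added to the queue). The result is clamped to
    business limits so a runaway rule cannot overspend.
    """
    target = metrics["instances"]
    if metrics["avg_latency_ms"] > 500 or metrics["queue_growth"] > 0:
        target += 1                      # load rising: scale out
    elif metrics["avg_latency_ms"] < 100 and metrics["queue_growth"] < 0:
        target -= 1                      # load falling: scale back
    return max(min_instances, min(max_instances, target))
```

The actual instance change would then go through the Service Management API; the clamp plus one-step-at-a-time delta is one way to "manage momentum" and avoid overshooting.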
41. Monitor Metrics. Primary metrics (actual work done): requests per second, queue messages processed per interval. Secondary metrics: CPU utilization, queue length, response time. Derivative metrics: rate of change of queue length. Use 'historical' data to help predict requirements. #CSwebinar
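The "rate of change of queue length" derivative metric can be computed from periodic samples; a minimal sketch (the sampling interval and window are assumptions):

```python
def queue_growth_rate(samples, interval_s=60):
    """Average messages/second added over a window of samples.

    samples: queue lengths taken every interval_s seconds, oldest
    first. A positive rate means producers are outrunning workers.
    """
    if len(samples) < 2:
        return 0.0
    elapsed = (len(samples) - 1) * interval_s
    return (samples[-1] - samples[0]) / elapsed

# Queue grew from 100 to 400 messages across four 60-second samples
rate = queue_growth_rate([100, 180, 290, 400])
```

Queue length alone says little (a long but shrinking queue is fine); the sign and size of this rate is what a scaling rule should react to.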
42. Gathering Metrics. Use Microsoft.WindowsAzure.Diagnostics.*; capture various metrics via the Management API: diagnostics infrastructure logs, event logs, performance counters, IIS logs. May need to smooth/average some measures. Remember the cost of gathering data, both performance and financial: would you use perf counters 24/7 on a production system? http://technet.microsoft.com/en-us/library/cc938553.aspx #CSwebinar
43. Evaluating Business Rules. Are requests taking too long? Do I have too many jobs in my queue? How much money have I spent this month? You could write these into code, build some sort of rules engine, or use the WF rules engine. #CSwebinar
44. Take Action. Add/remove instances using the Service Management API. Change role size: requires a change to *.csdef; most suited to worker roles. Send notifications (email, IM). Manage momentum: be careful not to overshoot. #CSwebinar
45. Summary. Designing for multiple instances provides scale out, availability, and elasticity options. Caching should be a key component of any Windows Azure application. There are various options for variable load: spare capacity, scale out/back, and automation is possible. #CSwebinar
46. Resources. www.msteched.com/Australia (Sessions On-Demand & Community); www.microsoft.com/australia/learning (Microsoft Certification & Training Resources); http://technet.microsoft.com/en-au (Resources for IT Professionals); http://msdn.microsoft.com/en-au (Resources for Developers). #CSwebinar
Hello, good morning, afternoon, or evening depending on the time zone you may be in. As Nancy just said, I will be talking about Windows Azure and things to consider when you want to scale up, out, or down.
These are the items we will be talking about today. Scalability: we will talk about linear scaling and how to achieve it, the differences between scaling up and scaling out in Azure, and the different characteristics and sizes of VMs. We will also talk about caching, how it helps to increase performance, and the different approaches that can be used. And finally we will talk about Azure elasticity, how to take advantage of it, and what to consider to manage load.
What does scalability mean? Basically, scalable applications are those where you can add more computing resources and get more computing power. Ideally that happens in a linear fashion, which is key to maintaining very scalable applications, especially if you have thousands of nodes running your platform. Do not confuse scalability with performance.
There are two general ways to approach scaling, basically two types. We talk generally about computational units (banks of machines); in the case of Azure they are virtual machines. Scaling up, vertically, entails bigger boxes: more resources on the same computational unit. Scaling out, horizontally, means more machines: more computational units, and having them work together. Most of the big-scale web sites (Bing, Facebook, Twitter) follow the scale-out architecture.
For small scenarios scaling up is pretty easy to do; you see this all the time in the enterprise. Generally, if you get a faster machine and add more memory and more disk, your application starts performing better. But it has the limitation that you can only scale up so far, and it does not necessarily scale in a linear fashion. Also, it is more expensive to have 1 x 64-way server than 64 x 1-way servers; economies of scale come into play. So for larger scenarios, scaling out is how the big applications are built (Bing, Hotmail, etc.): it is cheaper, and well architected it can scale almost infinitely. And this is one of the things that Azure brings to the table and makes very easy.
The touch point here is where you want to get linear scale: for every computational unit added, you are increasing capacity by the same amount of performance. If you want to increase capacity and also increase performance, you have to scale out, and verify the behaviour with load testing.
Right away you are going to realize that scalability is not the same as performance. Sometimes you need to sacrifice raw performance for scalability; ASP.NET session state is an example of that. In order to scale, we move our session state onto other machines; in big scale-out applications the session state is partitioned across many machines. So you can think about grouping a set of machines to manage the state of a certain partition. This can add some extra milliseconds to your process, but you gain the scalability to do it across many nodes and manage big loads.
How do we achieve linear scale? For instance, if every machine in your farm has to go and query state on a SQL database, you start having resource issues once you get to thousands of nodes if you do not partition your data, so you have to start reducing or eliminating hot shared resources; partitioning your data is key. Minimize the use of transactional behaviour: Windows Azure leverages queues, and you can use them to help scale your application, but you must be careful because it requires that you maintain the principle of idempotency. Idempotency is a big word, but it just means that if you have a function, you should be able to call it 1, 2, or 3 times, over and over, without running into concurrency issues with what that function is actually doing: the end state it stores will stay the same. By default you get round robin with Windows Azure, so the idea is that all your nodes need to be homogeneous and stateless, so processing can continue even if it moves to a different node. By having multiple nodes you are more fault tolerant in case of a node failure, and by doing this you also have less reliance on expensive hardware.
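Idempotency as described above can be sketched as a handler that is safe to run more than once. A minimal Python sketch using in-memory stand-ins; the message shape, the dedup set, and the balance store are all illustrative assumptions:

```python
processed = set()   # stand-in for durable dedup state (e.g. a table row)
balances = {}       # stand-in for application state

def handle_message(msg):
    """Apply a credit exactly once, even if the queue redelivers it.

    Azure queues deliver at-least-once, so the same message may be
    seen 1, 2, or 3 times; the end state must stay the same.
    """
    if msg["id"] in processed:
        return                      # duplicate delivery: no-op
    balances[msg["account"]] = balances.get(msg["account"], 0) + msg["amount"]
    processed.add(msg["id"])

msg = {"id": "m-1", "account": "acct-42", "amount": 10}
for _ in range(3):                  # message redelivered three times
    handle_message(msg)
```

In a real system the dedup check and the state update would need to be durable and atomic together; the sketch only shows the shape of the guarantee.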
Slide 1: You can create a bunch of roles to do the work you need, and you can create as many instances of those roles as you need. Most of the solutions we see come out with multiple different roles, and that is fine; you can scale this way. You can see we have some roles: a clean-up role, web roles, a worker role, and some of these have a couple of instances: two instances of a web site role and two of the worker role. What ends up happening if I lose an instance? Slide 2: If I lose one instance, I lose 50% of my processing. Slide 3: Instead of that, you can consolidate some of the roles, and in this case if one of my instances goes down I lose just 25% of capacity. The key here is deciding which roles can be consolidated in an Azure VM and how many instances of those services you need. This is something you have to balance based on your specific application, deciding whether to scale out to multiple instances or scale up to a bigger VM.
In Azure we have multiple VM sizes; there is an Extra Small in beta, but basically you have Small to Extra Large, 1 to 8 CPUs, and of course different costs; a single Azure unit tops out at 8 CPU cores. There are some things to consider when selecting a VM size, and with Windows Azure you will eventually optimize your architecture based on cost. The network is shared but not necessarily burstable; in the case of the Extra Large you can burst across the 8 cores, making the most of the machine. Determine how much bandwidth you need: you get 100 Mb/s guaranteed, and you share bandwidth between switches. For video streaming, go for a bigger machine.
Something important to remember: if it doesn't run faster on multiple cores on your desktop, it's not going to run faster on multiple cores in the cloud, whether you use the Parallel Task Library or any other optimization.
Let's talk about caching.
The key thing about caching is that it is the easiest way to improve performance. Caching is like indexing in SQL Server: it can improve performance greatly. It is important to think about moving the data closer to the consumer; you can move the data out to the client, or out to memory. And it gives you the advantage of reducing the load on the data tier, which has huge ramifications in the back end, allowing for more scale. It is valuable to cache data for even just a second; sub-second calls make a difference.
Caching helps a lot when you have largely static data, images, etc.; you can also cache entire pages. Here you can use Azure blob storage to serve those images, and you save on processing and bandwidth, lowering cost and getting better performance. The idea is to serve content once, avoid round trips to the database, and lower the load on web roles. There are a couple of ways you can do this, as you will see shortly.
The other scenario is more dynamic content, like RSS feeds. Most people think "I cannot use caching for dynamic data"; of course you can. The idea is to stick to the goals: minimise traffic on the wire, fewer transactions, and fewer hits to the database. In Windows Azure you pay for transactions to the database, so if a user really likes my RSS feed and is downloading it constantly, this can affect my cost, so you need to architect your solution to control these situations. You can definitely use caching to your advantage to control these cases, and we will see some strategies for how to do that.
The two strategies you can use are static content generation and client-side caching. Client-side caching has to do with using the different headers available on Windows Azure to encourage the client to cache the content locally, to minimize the number of round trips back to your server. There are also approaches that minimize visits to your server for updated content by increasing the amount of caching tags placed there, so clients consume content intelligently. With static content generation, you generate content, store it as a cached resource, and access it directly from storage.
Let's illustrate this for client caching. We have two scenarios here: one is to cache data out of the web role, and the other is to cache data out of blob storage. A client could use RESTful techniques to go directly to blob and table storage, and for many scenarios this is a way to minimize your cost; the same applies if I am making use of the CDN, because CDN technology pushes the content out to the CDN network and increases performance.
Generally, client caching involves the use of ETags. You can think of ETags as a way to version content; they make content version-aware, like a resource version number for that particular content. It is a header added to the HTTP response, and a client makes a conditional call based on matching the ETag: if the ETag does not match, the updated content is delivered. You can implement that as a conditional PUT or GET done in a RESTful way behind the scenes, to update the content when updates are present. By using blobs you can store those tags and have the content served automatically; Windows Azure blobs and tables actually use those tags under the hood, so it is a great way to implement client-side caching natively. The benefit is that ETags can definitely reduce unnecessary downloads.
If I have the most current content, because the ETag is telling me that, I don't need to go and download the whole content; it will only be downloaded when it changes. You just need to regenerate the ETag when the content changes in the database.
We also have a Cache-Control header on the client side; it works with the CDN, and it basically tells the client how long to wait before requesting the content again. It is great for static content, or content that you know will not change for X amount of seconds, to reduce the number of unnecessary requests to the server, and it is easy to set on blobs just by using the XML at the bottom of the slide.
If files change, how do I refresh the content, and how do I expire content on the CDN? You can point clients to different containers based on the data in the Cache-Control headers, using the URL-flipping techniques on the slide.
Static content generation is about reducing the number of times you process content, and increasing performance. The idea is to generate the content once, store it, and send it to the client from there, to reduce trips to the database. Then you can use all the caching-header techniques we just talked about to deliver the content efficiently.
You can spin up worker roles to update the content, and you can queue messages during the day to process those updates, taking advantage of lower-load periods and making the most of the VM's spare CPU cycles. Content could be many things: full pages, images, PDF files, or just portions of your content data. Blob storage can behave like a web server and deliver the stored content as if it were dynamic. Also, by using storage you significantly reduce cost compared with processing and accessing the data in the database.
Manage your stale data. If clients are not pulling data because of the ETag, there are fewer transactions and less cost.
Update the blob only if the ETag changed.
It is an order of magnitude cheaper.
Let me describe the different patterns. On and Off: most scientific calculations and simulations, which run for a period of time, then the data is analyzed for weeks and no more processing is required for a while. Growing Fast: new web sites, or sites with a history of increasing demand. Unpredictable Bursting: sites affected by sudden news or unpredictable behaviour. And Predictable Bursting: the pizza store every Saturday night. So how do you deal with variable load?
There are two ways to deal with variable load. One is to maintain excess capacity, what we call headroom: you pay a little for the excess capacity, trading that off for faster availability when you need it. You can use asynchronous work to provide a buffer, but in most cases a good headroom will be about 10 to 15%, enough margin to give you time to add capacity. The other way is to add or remove capacity when needed; this usually takes time, which is why the headroom is important, and you can use your current analytics to determine when to add or remove capacity.
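The 10 to 15% headroom guidance above can be turned into a simple sizing check; a minimal sketch where the per-instance capacity figure is hypothetical:

```python
import math

def instances_with_headroom(expected_rps, per_instance_rps, headroom=0.15):
    """Instances needed so expected load uses at most (1 - headroom)
    of total capacity, leaving margin while new instances spin up."""
    usable = per_instance_rps * (1 - headroom)
    return math.ceil(expected_rps / usable)

# 900 req/s expected, each instance handles ~100 req/s, 15% headroom
n = instances_with_headroom(900, 100)
```

With no headroom 900 req/s needs 9 instances; reserving 15% pushes that to 11, which is the small premium you pay for absorbing a spike while extra capacity comes online.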
For the headroom approach, on web roles you need to add additional instances so you get that additional processing when peaks happen, and you need to monitor your load to be prepared to handle additional load before performance degrades. On worker roles, try to buffer into queues when possible to manage peaks; start additional roles if levels of latency are increasing or the queues are not clearing, and likewise, if the workers are idle and the queues are empty, decrease the number of roles. How do you go about this?
You need to use the Service Management and Diagnostics APIs available in Azure. Based on the metrics you obtain (and we will see how to get them in a moment), you can create time-based rules to turn on or off the number of instances required to manage the load. In the case of unpredictable demand, you have to monitor some metrics and react to their behaviour.
What are the metrics we need to consider? We need to monitor the actual work done: requests per second, and messages processed from the queue and how often. Another level is to monitor CPU, the number of messages in the queue, and response time. And also the rate of change in queue length: monitor how much latency builds up in the queue based on the load during the day. Now, how can you get this information?
You can capture information via the Management API; this will give you diagnostics infrastructure logs, event logs, IIS logs, and performance counters. In some cases you may need to average some measures, and depending on what the outcome is, you want to consider evaluating some business rules.
What to do, and when? Are requests taking too long? Do I have too many jobs in the queue? Also ask the question: am I spending too much money? How do I improve the process: should I automate it, and can I use a rules engine? The answers will allow you to decide.
When it is time to take action, add or remove instances: you can use the Service Management API or develop your own automated process. You can also change role sizes. Remember to send notifications in every case (email, instant messaging) to keep people in the loop about what is happening. And most of all, manage the momentum: do not overshoot, because in the end it costs you money.
Well, we have seen that designing for multiple instances gives you a more elastic solution that can scale according to demand. We talked about how caching can improve performance, and about some techniques for implementing it in Windows Azure. And we have also described various options to manage the different loads your environment may have. With this I finish the webinar.
If you have any questions, please feel free to post them on our blog; we will be more than happy to answer them. Thanks, Nancy.