Windows Azure Platform: Articles from the Trenches, Volume One
Editor and copy and paste guru: Eric Nelson and 15 authors smarter than him
22nd June 2010 (v0.9)
Cover art by Andrew Fryer
Developers have been exploring the possibilities opened up by the Windows Azure Platform for
Cloud Computing. This book pulls together great articles from many of those developers who have
been active with the Windows Azure Platform to hopefully help others become successful. There are
twenty articles in this first volume covering everything from getting started to implementing best
practices for elastic applications.
2. The Windows Azure Platform: Articles from the Trenches
TABLE OF CONTENTS
INTRODUCTION
From the Editor
Would you like to become an author for a future edition?
Introduction to the Windows Azure Platform
AE – Acronyms Explained
CHAPTER 1: GETTING STARTED
5 steps to getting started with Windows Azure
Step 1: Creating an Azure account.
Step 2: Provisioning a SQL Azure database
Step 3: Building a Web Application for Azure
Step 4: Packaging the Web Application for Windows Azure
Step 5: Deploying the Web Application to Azure.
The best tools for working with the Windows Azure Platform
Category: The usual suspects
Category: Windows Azure Storage
Category: Windows Azure diagnostics
Category: SQL Azure
Category: General Development
CHAPTER 2: WINDOWS AZURE PLATFORM
Architecting For Azure – Building Highly Scalable Applications
Principles of Azure Architectures
Partition Data
Colocation
Cache
State
Distribute Workloads Effectively
Maximise Resources
Summary
The Windows Azure Platform and Cost-Oriented Architecture
Cost is important
What costs to consider
Conclusion
De-risking Your First Windows Azure Project
Popular Risks
Non-Technical Tactics for Reducing Risk
Technical Tactics for Reducing Risk
Developer Responsibility
Trials & tribulations of working with Azure when there’s more than one of you
Development Environment
Test Environment
Certificates
When things go wrong
Summary
Using a Continuous Integration build to achieve an automated deployment of your latest build
Getting the right “bits”
Packaging for deployment
Deploying
Using Java with the Windows Azure Platform
Accessing Windows Azure Storage from Java
Running Java Code on Windows Azure
AzureRunme
CHAPTER 3: WINDOWS AZURE
Auto-Scaling Windows Azure Compute Instances
Introduction
A Basic Approach
The Scale Agent
Monitoring: Retrieving Diagnostic Information
Rules: Establishing When To Scale
Trust: Authorising For Scale
Scaling – The Service Management API
Summary
Building a Content-Based Router Service on Windows Azure
Bing Maps Tile Servers using Azure Blob Storage
Azure Drive
Guest OS
VHD
CloudDrive
Development Environment
Azure Table Service as a NoSQL database
Master-Detail structures
Dynamic schema
Column names as data
Table names as data
Summary
Queries and Azure Tables
CreateQuery<T>()
Contexts
Querying on PartitionKey and RowKey
Continuation
DataServiceQuery
CloudTableQuery
Tricks for storing time and date fields in Table Storage
Using Worker Roles to Implement a Distributed Cache
Configuring the Cache
Using the Distributed Cache
Logging, diagnostics and health monitoring of Windows Azure Applications
Collecting diagnostic data
Persisting diagnostic data
Analysing the diagnostic data
More information
Service Runtime in Windows Azure
Roles and Instances
Endpoints
Service Upgrades
Service Definition and Service Configuration
RoleEntryPoint
Role
RoleEnvironment
RoleInstance
RoleInstanceEndpoint
LocalResource
CHAPTER 4: SQL AZURE
Connecting to SQL Azure in 5 Minutes
Prerequisite – Get a SQL Azure account
Working with the SQL Azure Portal
Create a database through the Server Administration
Configuring the firewall
Connecting using SQL Server Management Studio
Application credentials
Keep in mind – the target database
CHAPTER 5: WINDOWS AZURE PLATFORM APPFABRIC
Real Time Tracing of Azure Roles from Your Desktop
Custom Trace Listener
Send Message Console Application
Trace Service
Service Host Class
Service
Summary
MEET THE AUTHORS
Eric Nelson
Marcus Tillett
Richard Prodger
Saksham Gautam
Steve Towler
Rob Blackwell
Juliën Hanssens
Simon Munro
Sarang Kulkarni
Steven Nagy
Grace Mollison
Jason Nappi
Josh Tucholski
David Gristwood
Neil Mackenzie
Mark Rendle
INTRODUCTION
FROM THE EDITOR
Hello all,
The Windows Azure Platform is changing the way we architect,
implement, deploy and manage solutions. In early 2010 it went live
and in the first six months we have already seen an impressively
diverse range of solutions developed to take advantage of the
services offered.
This book pulls together great articles from many of those
developers who have been active with the Windows Azure Platform
to hopefully help others be successful. There are twenty articles in
this first volume covering everything from getting started to
implementing best practices for elastic applications. You are not
expected to read it in order from start to finish. Instead I would
encourage you to head straight to the chapters or the individual
articles that look most relevant or interesting.
The book was put together in May and early June 2010, which means
that it pre-dates the 1.2 release of the Windows Azure SDK. The 1.2
release adds some great new features, especially for Visual Studio
2010 and .NET Framework 4.0, in areas such as debugging and IDE
integration. Volume Two of this book will cover those new
features (and more!).
Once you have had a chance to look at the articles, please give us
your feedback at http://bit.ly/azureebook1feedback (it should take
less than one minute). Thank you and happy reading.
Eric Nelson
Developer Evangelist, Microsoft UK
Website: http://www.ericnelson.co.uk
Email: eric.nelson@microsoft.com
Blog: http://geekswithblogs.net/iupdateable
Twitter: http://twitter.com/ericnel
WOULD YOU LIKE TO BECOME AN AUTHOR FOR A FUTURE EDITION?
Developers value the sharing of best practices, knowledge and experiences – knowledge and
experiences such as your own. If you have insight into the Windows Azure Platform then you are a
great candidate for becoming an author involved in the next volume of this book as the Windows
Azure Platform continues to evolve and broaden.
Please email me (eric.nelson@microsoft.com) with your proposed article(s) and if possible a “sample of
your work” such as a link to your blog.
INTRODUCTION TO THE WINDOWS AZURE PLATFORM
The Windows Azure Platform contains three technologies which can be used individually or
together to build solutions which run “in the cloud”. For the first time you are able to run your code
and store your data in Microsoft datacenters and let Microsoft take on some of the responsibility for
keeping your solution running great and able to respond to the changing demands of business.
Solutions can either run entirely on the Windows Azure Platform or as a hybrid, with some of the
solution running on-premise or elsewhere on the Internet. The three key technologies are Windows
Azure, SQL Azure and Windows Azure Platform AppFabric:
Windows Azure
Windows Azure is the cloud services operating system for the Windows Azure Platform.
Windows Azure provides developers with on-demand compute and storage to run their code
and store their data.
Windows Azure supports a consistent development experience through its integration with
Visual Studio 2008 and Visual Studio 2010. Windows Azure is an open platform that supports
both Microsoft and non-Microsoft languages and technologies. Windows Azure welcomes
third-party tools and technologies such as Eclipse, Ruby, PHP, and Python.
SQL Azure
Microsoft SQL Azure delivers the capabilities of Microsoft SQL Server to Windows Azure
applications or applications running outside of the Windows Azure Platform. It can store and
retrieve structured, semi-structured, and unstructured data with the advantage of high
availability through the storage of multiple copies of your data. It enables relational queries,
search, and data synchronization with mobile users, remote offices and business partners.
Windows Azure Platform AppFabric
AppFabric provides secure connectivity as a service to help developers bridge cloud, on-
premise, and hosted deployments. AppFabric comprises Service Bus and Access Control.
From simple eventing scenarios to complex protocol tunneling, AppFabric Service Bus gives
developers the flexibility to choose how their applications communicate; addressing the
challenges presented by firewalls, NATs, dynamic IP, and disparate identity systems.
AppFabric Access Control enables simple, secure authorization for RESTful web services that
federate with a variety of identity providers.
There are many articles, videos and screencasts designed to help you get up to speed with the
Windows Azure Platform and a great place to start is http://bit.ly/startazure. We also have a Getting
Started chapter within this book.
AE – ACRONYMS EXPLAINED
If you are new to the Windows Azure Platform then you may need a little help with some of the
acronyms and industry terms used in this book.
• REST and RESTful – Representational State Transfer. A style of software architecture that
enables clients and servers to interact.
• WCF – Windows Communication Foundation. A technology first shipped in .NET
Framework 3.0 to allow communication to take place between code running in different
“locations”.
• Cloud Computing – running of code and storage of data off-premise. (Also see the 100+
alternative definitions of Cloud Computing, e.g.
http://en.wikipedia.org/wiki/Cloud_computing)
• Elastic Computing – as more processing power is needed or as more data needs to be stored,
elastic computing (in our case the Windows Azure Platform) promises to respond rapidly to
those demands and provision additional compute and storage resources.
• PaaS – Platform as a Service is one approach to Cloud Computing that favors abstraction and
simplicity over flexibility, e.g. the Windows Azure Platform.
• IaaS – Infrastructure as a Service is one approach to Cloud Computing that favors flexibility
over abstraction and simplicity, e.g. Amazon Web Services.
• Codename “Dallas” – a 4th member of the Windows Azure Platform, currently in CTP.
http://www.microsoft.com/WindowsAzure/dallas/
• CTP – Community Technology Preview. In simple terms, not quite as solid as a traditional
Beta.
CHAPTER 1: GETTING STARTED
5 STEPS TO GETTING STARTED WITH WINDOWS AZURE
By Jason Nappi
Getting started with a new technology can be daunting, but generally once you get going things
become familiar and learning accelerates. Therefore, I’d like to focus on providing a few of the basic
steps that I recently went through in the hope that it will both answer some of the basic questions
and knock down some of the barriers to accelerated learning. The following are some of the primary
design considerations for what I think of as a typical business application, and the implications of
building those same types of applications in the Azure cloud.
STEP 1: CREATING AN AZURE ACCOUNT.
The first step, as you might imagine, is to set up an Azure account. Since Windows Azure is a cloud
service, you’ll need to create an account in the cloud, and provision a cloud environment. You can
create an Azure account at the Windows Azure Developer portal. This is a pretty straightforward
registration process that will require you to create a Windows Live ID if you don’t already have one
and will require a credit card.
At the conclusion of the registration process you should have access to Windows Azure, SQL Azure
and AppFabric. At this point you haven’t created any cloud services; you’ve only created an account
under which the services you create can be provisioned and deployed.
STEP 2: PROVISIONING A SQL AZURE DATABASE
This step may not be required by everyone, but most of the applications I’ve built have been
database driven. Given that, whether creating a new application or moving an existing one to the
cloud, I think it’s going to be a fairly common question to ask where the database lives and how you
connect to it. The reasonable answer is that if my application is going to be hosted in the cloud, my
database needs to be in the cloud too.
The Windows Azure Platform provides Windows Azure Storage as well as SQL Azure for storing data.
SQL Azure is most similar to the relational databases of the typical business application, so while
Azure Storage may have scalability and cost advantages, SQL Azure provides the more familiar
paradigm. Naturally I’m inclined towards SQL Azure to get started.
In order to create my cloud database I’ll need to return to the Azure account that I set up in step 1
and navigate to the SQL Azure section of the portal https://sql.azure.com. To create a SQL Azure
server, you’ll need to provide a username and password and the SQL Azure Developer Portal will
create a server using a generated unique name similar to crkvq7vdhu.database.windows.net.
With the SQL Azure server created, you can now create the database. There is also an additional
requirement that you configure firewall rules to allow access. For the sake of simplicity, you
can just grant your local machine’s IP address access to the SQL Azure server.
Lastly, you might be wondering, as I did, whether the newly created SQL Azure database is accessible
via the familiar SQL Server Management Studio Tools. I was able to successfully connect after
downloading SQL Server Management Studio 2008 R2.
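As a rough illustration of what an application needs in order to reach the new database, the sketch below assembles an ADO.NET-style connection string. The server name is the example shown above; the database name, login and password are placeholders, and the helper `build_connection_string` is hypothetical. The `user@server` login form and the `Encrypt=True` setting are assumptions based on the connection strings the SQL Azure portal typically displays, so check them against your own portal.

```python
def build_connection_string(server: str, database: str, user: str, password: str) -> str:
    """Assemble an ADO.NET-style connection string for a SQL Azure server.

    `server` is the fully qualified name, e.g. crkvq7vdhu.database.windows.net.
    """
    # Assumption: logins are written as user@servername, where servername is
    # the first label of the fully qualified server name.
    short_name = server.split(".")[0]
    parts = [
        f"Server=tcp:{server}",
        f"Database={database}",
        f"User ID={user}@{short_name}",
        f"Password={password}",
        "Trusted_Connection=False",
        "Encrypt=True",
    ]
    return ";".join(parts) + ";"


if __name__ == "__main__":
    # Placeholder values; substitute the ones from your own SQL Azure server.
    print(build_connection_string(
        "crkvq7vdhu.database.windows.net", "MyDatabase", "myadmin", "<password>"))
```

The same server name and login are what you would type into the SQL Server Management Studio connection dialog.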
STEP 3: BUILDING A WEB APPLICATION FOR AZURE
Having provisioned your cloud database and proven that you can connect to it with the familiar SQL
Server Management Studio tools, and assuming you’ve created the tables required by your
application, you’re ready to begin building your application. To do so you’ll need to install
the Windows Azure SDK and the Windows Azure Tools for Microsoft Visual Studio 1.1. The good
news is that both support Visual Studio 2008 and Visual Studio 2010.
Once you fire up Visual Studio you’ll notice a new project template for “Windows Azure Cloud
Service”. After choosing the cloud service template you will be prompted to choose from the
cloud service ‘roles’: Web, Worker and WCF Service Roles. Assuming you’ve chosen “ASP.NET Web
Role”, a solution containing two projects, a cloud services project and the familiar ASP.NET Web
project, will be created. The only real difference between a standard ASP.NET web project and the
ASP.NET Web Role project is the existence of a WebRole.cs file. The WebRole.cs file serves as the entry
point for Azure.
When you hit F5 your Azure application starts up and runs inside the Development Fabric. The
Development Fabric simulates the Windows Azure cloud environment, enabling you to run, test and
debug Azure applications on the desktop!
STEP 4: PACKAGING THE WEB APPLICATION FOR WINDOWS AZURE
Packaging up the application for publishing to Azure turns out to be fairly simple. From within
Visual Studio you can right click on the Cloud Services project and choose Publish from the context
menu. This will package the web application into a .cspkg file, and also create the
ServiceConfiguration.cscfg file. These two files are all you need to deploy your application to
Windows Azure.
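To give a feel for what the configuration half of that pair contains, here is a minimal sketch that reads the instance count for each role out of a ServiceConfiguration.cscfg document. The sample XML, the role name `WebRole1`, the setting shown and the helper `instance_counts` are all illustrative assumptions modelled on the files the SDK 1.x tools generate; your own file may contain more roles and settings.

```python
import xml.etree.ElementTree as ET

# XML namespace assumed for service configuration files (as seen in SDK 1.x samples).
NS = "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration"

# Hypothetical example of a generated configuration file.
SAMPLE_CSCFG = f"""<ServiceConfiguration serviceName="MyCloudService" xmlns="{NS}">
  <Role name="WebRole1">
    <Instances count="2" />
    <ConfigurationSettings>
      <Setting name="DiagnosticsConnectionString" value="UseDevelopmentStorage=true" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>"""


def instance_counts(cscfg_xml: str) -> dict:
    """Return {role name: instance count} from a service configuration document."""
    root = ET.fromstring(cscfg_xml)
    counts = {}
    for role in root.findall(f"{{{NS}}}Role"):
        instances = role.find(f"{{{NS}}}Instances")
        counts[role.get("name")] = int(instances.get("count"))
    return counts


if __name__ == "__main__":
    print(instance_counts(SAMPLE_CSCFG))
```

Because the instance count lives in the configuration file rather than in the package, it can be changed without rebuilding the .cspkg.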
STEP 5: DEPLOYING THE WEB APPLICATION TO AZURE.
Now that you’ve packaged your ASP.NET Web Role, you’ll need to return to the Windows Azure
account you created in Step 1 and create your Windows Azure service. Under the Windows Azure
tab choose “new service”, then “Hosted Service”, and provide a name and description for your new cloud
service.
Once the Service is created there’ll be two hosted service locations, staging and production. Under
each will be a ‘Deploy’ button. Choose Deploy under Staging. This will bring up a screen asking for
the two files created in Step 4. Provide both files, and deploy. After deploying the package and the
configuration you’ll be provided with a unique URL for accessing your application. You’ll also see
that you now have the ability to ‘Run’ the service.
The application won’t be accessible via the URL until you Run it, so press Run, and wait for it, wait for
it, wait for it… It takes a while to provision the Windows Azure infrastructure for your application, but
once you get the green light you should be good to go.
These are just a few of the baby steps I’ve taken to become familiar with Windows Azure. With
these steps I’ve been able to demonstrate that developing for Windows Azure is largely the same
development experience that I’m accustomed to. However, one of the more intriguing
considerations when building for Windows Azure is the potential use of Windows Azure Storage as a
data store instead of the more conventional relational database provided by SQL Azure.
THE BEST TOOLS FOR WORKING WITH THE WINDOWS AZURE PLATFORM
By Sarang Kulkarni
“A platform is known by the tooling available around it!” Much clichéd, but it still holds true. Windows
Azure, though a fairly nascent cloud platform, is aptly supported by some fantastic tooling which
makes development fun and a developer’s life easy.
Let us get the usual suspects out of the way first to make way for some more interesting kids on the
block, many of which I cannot do without.
CATEGORY: THE USUAL SUSPECTS
Microsoft Visual Studio 2010®
Visual Studio 2010 (VS2010) is a stable development platform for Windows Azure. Though there are
very few changes specific to Azure when compared with VS2008, the overall development
experience is definitely superior. Windows Azure VMs support .NET Framework 4.0 from OS Version
1.2, so it makes sense to use VS2010 to take advantage of the new features of .NET 4.0 in
the cloud. As always, the Express edition is free.
Microsoft SQL Server Management Studio® 2008 R2
The R2 release is recommended for working with SQL Azure. The biggest advantage is the
comfort of a SQL IDE we have grown up with. I don’t think I need to wax poetic about this one; this
is bread and butter. Again, the Express edition is free and recommended, as it serves most needs.
Download it from:
http://www.microsoft.com/downloads/details.aspx?familyid=56AD557C-03E6-4369-9C1D-E81B33D8026B&displaylang=en
User Accounts and Local Security Policy Control Panel applets
I know there’s nothing specific to Azure here, but it comes in very handy to have a user with the
permissions laid out at http://msdn.microsoft.com/en-us/library/dd573355.aspx to avoid any
surprises related to user rights while running in the fabric.
CATEGORY: WINDOWS AZURE STORAGE
What: Cerebrata - Cloud Storage Studio
Why: Cerebrata Cloud Storage Studio (CSS) is a WPF-based client for managing Azure Storage, as well
as hosted applications. CSS started as a commendable effort by a small firm to provide intuitive
visual access to Azure Storage, putting the Storage APIs to good use. It now stands as a one-stop
solution for managing everything in Azure Storage, as well as a lot of things in hosted
applications.
Figure 1: Cloud Storage Studio - Connect to Azure Account
You can design a table schema in CSS, perform CRUD operations on existing tables,
download/upload table contents to/from disk and filter table contents. Basic querying is
also supported, using the WCF Data Services (formerly ADO.NET Data Services) query syntax.
LINQ query support would have been a welcome add-on.
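For readers who have not met the WCF Data Services query syntax, the sketch below builds the kind of URL such a query resolves to against the Table service. It is only an illustration: the account name, table name and filter are made up, the helper `table_query_url` is hypothetical, and a real request must also carry the signed authorization headers that tools like CSS add for you.

```python
from urllib.parse import quote


def table_query_url(account: str, table: str, filter_expr: str, top: int = 0) -> str:
    """Build a Table service query URL using the $filter (and optional $top) options."""
    # The filter expression uses operators such as eq, gt, lt, and, or.
    url = (f"https://{account}.table.core.windows.net/"
           f"{table}()?$filter={quote(filter_expr)}")
    if top > 0:
        url += f"&$top={top}"
    return url


if __name__ == "__main__":
    # Hypothetical account and table names.
    print(table_query_url("myaccount", "Customers", "PartitionKey eq 'Brisbane'", top=10))
```

Pasting the filter expression on its own (`PartitionKey eq 'Brisbane'`) is what a query box that accepts this syntax would typically expect.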
Blob storage is a forte of CSS and all possible operations on Blobs and Containers are available. You
can create containers, configure access policies, list blobs in a container replete with the folder
structure, upload/download page/block blobs, rename, copy and move blobs, create and view blob
snapshots (very useful) and create a signed URL for a blob. MIME type configuration support is icing on
the already nice cake. My only grudge is the very basic breadcrumb while navigating the container
structure.
CSS also features a simple yet effective service management UI. The design closely resembles that of
the actual Azure developer portal. The same features are offered, plus a few more. The regular
service management operations are available: connecting to hosted services; viewing, deploying and
deleting services; swapping deployment slots; managing API certificates; and managing affinity groups.
A very useful feature we find here is a nifty little checkbox at the bottom of the create service deployment
dialog which reads “Automatically run the deployment after creation” – a nice touch.
Figure 2: Cloud Storage Studio - Deploy a Service
It costs a totally worthwhile $60 per license.
Notable alternatives are:
• Cerebrata’s own CSS/e
https://onlinedemo.cerebrata.com/cerebrata.cloudstorage/default.aspx which is a
Silverlight application providing very basic but useful Storage Service administration
• The open source Azure Storage Explorer http://azurestorageexplorer.codeplex.com/
• Finally, the far from perfect yet still useful open source alternative, the Azure MMC Snap-in
http://code.msdn.microsoft.com/windowsazuremmc. Azure MMC is in its second version,
covers almost all the same bases as Cloud Storage Studio and deserves a mention.
Figure 3: Windows Azure MMC
What: LINQPad http://www.linqpad.net/
Why: It would not be an overstatement to call LINQPad, by Joseph Albahari, the best querying
scratchpad available for LINQ. LINQPad can query a varied set of data sources. Of particular interest
to this discussion are SQL Azure, WCF Data Services (think codename “Dallas”) and Windows Azure
Table Storage. Yes, Table storage! LINQPad steps in where Cloud Storage Studio stops being
adequate: the querying capabilities are superior and the interface more powerful.
Figure 4: LINQPad - Sample Query on the WADPerformanceCounters table
As usual, some of the best tools come free and LINQPad surely fits the definition. There is also a pro
version available with some bells and whistles like auto-complete, Visual Studio integration etc.
CATEGORY: WINDOWS AZURE DIAGNOSTICS
What: Cerebrata – Azure Diagnostics Manager
http://www.cerebrata.com/Products/AzureDiagnosticsManager/Default.aspx
Why: Azure diagnostics has taken some time to reach the final form we see it in today. There are
few tools which provide the comfort of an Event Viewer or a comprehensive management
dashboard for working with the diagnostic data. Azure Diagnostics Manager (in public beta at the
time of writing) attempts to achieve just that.
The feature set is fairly comprehensive covering the following:
• You can either connect to an Azure storage account to read the diagnostics information,
find the deployments from there and connect to the listed deployments, or choose to
connect directly to a subscription and get a list of hosted services to monitor.
• The Dashboard provides a bird’s eye view of all the diagnostic information collected. One
may choose to view Event Viewer, Trace Logs, Infrastructure Logs, Performance Counters, IIS
Logs, IIS Failed Request Logs, Crash Dumps and On Demand Transfer.
• If you have only deployed a service and are collecting none of these, fret not. Azure
Diagnostics Manager also provides access to the diagnostic monitor inside your Roles as well
as individual role instances through the Remote Diagnostics API. With this you can
enable/disable any of the diagnostic information being collected or you can alter the
verbosity/frequency.
Figure 5: Azure Diagnostics Manager - Performance Counter Graphs
CATEGORY: SQL AZURE
What: SQL Azure migration wizard http://sqlazuremw.codeplex.com/
Why: As most of us working with cloud solutions might have already noticed, the largest chunk of
the work coming to System Integrators is the migration of existing applications to the cloud. One of
the key aspects of this is database migration. The SQL Azure Migration Wizard helps simplify database
migration. With it we can analyze scripts for SQL Azure compliance,
generate scripts and migrate databases, both schema and data. Migration is supported from SQL
Server to SQL Azure, SQL Azure to SQL Server and SQL Azure to SQL Azure.
Even in version 3.2.2 it still has its share of quirks, but it is vastly improved and great for the
mundane tasks in DB migration.
CATEGORY: GENERAL DEVELOPMENT
What: Fiddler http://www.fiddler2.com/fiddler2/
Why: Fiddler is a web debugging proxy. It allows us to inspect all incoming and outgoing HTTP(S)
traffic on a machine. This is particularly helpful while working with Azure Storage, the Azure Service
Management API, the Remote Diagnostics Manager API and anything REST. Looking at the HTTP traffic
gives an insight into how the requests and responses are constructed, what responses are received and
a host of other information that every web service developer/consumer will find handy.
Figure 6: Fiddler – Statistics
Fiddler’s scripting engine can be used to filter requests and/or responses in or out, and also to issue
preconfigured responses. Fiddler can also target specific processes to capture traffic only from those
processes.
Fiddler provides an API which can be used in a .NET application to programmatically track network
traffic and use almost all of Fiddler’s features. This has enabled some nifty Fiddler extensions like
Watcher, a passive security audit tool http://websecuritytool.codeplex.com/ ; Chad Sowald’s
Request to Code http://www.chadsowald.com/software/fiddler-extension-request-to-code , which
generates the code required to issue captured HTTP requests; and the JSON Viewer
http://jsonviewer.codeplex.com/ , which visualizes JSON objects.
CHAPTER 2: WINDOWS AZURE PLATFORM
ARCHITECTING FOR AZURE – BUILDING HIGHLY SCALABLE APPLICATIONS
By Steven Nagy
Two key reasons organisations move to the cloud are to reduce cost and to leverage economies of
scale. Unfortunately not every type of application is suited to the cloud and, more often than not,
those that are suited to the cloud are not architected for scalability. Further, the Windows Azure
Platform has a pricing model that, if not considered during your architecture phase, can negate the
cost benefits of moving to the cloud to begin with. This article will address the key things to consider
when architecting highly scalable applications that are cost-optimised for the Azure platform.
PRINCIPLES OF AZURE ARCHITECTURES
The Windows Azure Platform already provides elasticity, redundancy, and abstractions from the
distributed platform on which it is run. This gives us a flying head start when designing systems for
the cloud, but there are still key measures we need to take to ensure our application doesn’t
become its own worst enemy. Here we define five key tenets to keep in mind throughout the design
and implementation phases of your project.
PARTITION DATA
Data partitioning is not a new concept [1]. Traditionally it has helped us break up massive databases
into smaller, more manageable pieces, and to improve query performance by splitting unrelated data
into different partitions. In scalable applications it is important for those same reasons, but it also
allows us to scale more effectively; imagine serving 500 requests per minute on a single database
versus 50 requests per minute on each of 10 databases.
Furthermore, storage is cheap. Consider SQL Azure pricing versus Azure Table Storage [2] for 1 GB of
storage: $10 and $0.15 per month respectively. Both are at least 3 times redundant. However, not
only is Azure Table Storage cheaper, it has inbuilt partitioning mechanisms that allow you to allocate
every single entity (row) of data to a horizontal partition (or shard [3]) based on the partition key you
provide. In Table Storage, each partition can be served by a different storage node, which means queries
and requests can scale extremely efficiently. If you don’t have complex relational queries, this is the
ideal choice. Denormalising your data can help immensely by removing those relationships and
making the data easier to partition. This is essentially the premise of the ‘NoSQL’ movement [4].
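The idea of allocating every entity to a partition by key can be sketched with a toy in-memory model. This is not the Table Storage API; the class `PartitionedTable` and the customer and order data are invented purely to show why a lookup or query scoped to one partition key never has to touch the other partitions.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy in-memory model of a table partitioned on PartitionKey."""

    def __init__(self):
        # One dict of rows per partition; each partition could live on its own node.
        self._partitions = defaultdict(dict)

    def insert(self, partition_key: str, row_key: str, entity: dict) -> None:
        self._partitions[partition_key][row_key] = entity

    def get(self, partition_key: str, row_key: str) -> dict:
        # A point lookup touches exactly one partition.
        return self._partitions[partition_key][row_key]

    def query_partition(self, partition_key: str) -> list:
        # A query scoped to one partition never has to scan the others.
        return list(self._partitions.get(partition_key, {}).values())

    def partition_count(self) -> int:
        return len(self._partitions)


if __name__ == "__main__":
    orders = PartitionedTable()
    # Denormalised: the customer name is copied onto each order row.
    orders.insert("customer-1", "order-001", {"customer": "Ann", "total": 20})
    orders.insert("customer-1", "order-002", {"customer": "Ann", "total": 35})
    orders.insert("customer-2", "order-001", {"customer": "Bob", "total": 5})
    print(orders.query_partition("customer-1"))
```

Choosing the customer as the partition key here means all of one customer's orders sit together, while different customers can be spread across nodes.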
[1] http://msdn.microsoft.com/en-us/library/ms190787%28v=SQL.100%29.aspx
[2] http://www.microsoft.com/windowsazure/pricing/
[3] http://en.wikipedia.org/wiki/Shard_%28database_architecture%29
[4] http://en.wikipedia.org/wiki/Nosql
You should also consider data duplication for further performance increases. Consider a search
function for customers by age demographic or by city; by having two copies of the data in different
partitions, your query and retrieval time is highly efficient. The flip side to this approach is the added
complexity of managing multiple copies of data.
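A minimal sketch of that duplication, using plain dictionaries in place of two tables, might look like the following. The ten-year age bands, the field names and the helpers `age_band` and `index_customer` are assumptions chosen for illustration.

```python
def age_band(age: int) -> str:
    """Map an age to a coarse ten-year demographic band."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def index_customer(by_city: dict, by_age: dict, customer: dict) -> None:
    """Write the same customer into two stores, each partitioned for one query."""
    by_city.setdefault(customer["city"], {})[customer["id"]] = customer
    by_age.setdefault(age_band(customer["age"]), {})[customer["id"]] = customer


if __name__ == "__main__":
    by_city, by_age = {}, {}
    index_customer(by_city, by_age, {"id": "c1", "city": "Brisbane", "age": 34})
    index_customer(by_city, by_age, {"id": "c2", "city": "London", "age": 38})
    # Each search reads a single partition of the copy built for it.
    print(sorted(by_city["Brisbane"]))
    print(sorted(by_age[age_band(35)]))
```

The added complexity shows up on the write path: every insert, update or delete now has to be applied to both copies, and the application has to cope if one write succeeds and the other fails.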
Partitioning support in Azure can be summarised as follows:
• Table entities are horizontally partitioned on partition key
• Blobs are partitioned based on their container
• Queues are partitioned on a per-queue basis
• SQL Azure supports no partitioning
Vertical partitioning is not supported by default; however, it makes sense to store smaller amounts of
data together when the additional fields are not needed on the majority of requests.
COLOCATION
SQL Azure, Azure Storage, Azure Compute roles, and the AppFabric all have bandwidth costs for data
moving in and out of the data centre. It makes sense to keep this in mind when building our
applications. Azure already lets us choose our data centres and, more importantly, we can co-locate
components of our system via Affinity Groups so that network traversal is minimal and faster.
Luckily this is a deployment consideration and not so important in up-front design.
CACHE
A more important consideration is the various opportunities to utilise caching mechanisms. There
are many ways that a cache can be harnessed to minimise transactions: for end user HTTP requests,
for underlying data stores, or for memoization[5] purposes. When almost everything in the platform is
accessible via a REST interface, it pays to invest effort into caching. Some cache concepts to consider
are:
Client side timed cache – content that expires after a certain amount of time, so that client
browsers serve a local copy instead of requesting the page again
Entity Tags[6] (ETags) – allow you to specify a ‘version’ in an HTTP header field; the server can
indicate the version has not changed, in which case no other data is exchanged, otherwise
it returns all the data for that request
ASP.NET page level cache
Distributed Cache[7] – has multiple nodes that either all share the same content (shared
everything) or have unique sections of the cache (shared nothing); shared everything
distributed caches work well in Azure because of the throwaway nature of commodity
hardware and ease of scale
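The first two concepts in the list can be illustrated with a small sketch. It assumes a toy request handler rather than any real web framework; `make_etag`, `respond` and the 60 second lifetime are hypothetical choices made for the example. The handler attaches a `Cache-Control` lifetime and an `ETag` to each response, and answers with 304 and an empty body when the client already holds the current version.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a version tag from the content itself (quoted, as ETags are)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match=None, max_age=60):
    """Return (status, headers, payload) for a GET request.

    if_none_match is the ETag the client sent back, if any.
    """
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": f"max-age={max_age}"}
    if if_none_match == etag:
        # Client copy is current: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body


page_v1 = b"<html>catalogue v1</html>"
page_v2 = b"<html>catalogue v2</html>"

first_status, first_headers, first_payload = respond(page_v1)
# The client revalidates with the ETag it was given.
again_status, _, again_payload = respond(page_v1, first_headers["ETag"])
# The content has changed on the server, so the old ETag no longer matches.
changed_status, _, changed_payload = respond(page_v2, first_headers["ETag"])
```

Within the `max-age` window a browser would not contact the server at all; after it expires, the ETag check keeps the exchange down to headers unless the content has really changed.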
[5] http://en.wikipedia.org/wiki/Memoization
[6] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
[7] http://msdn.microsoft.com/en-us/magazine/dd942840.aspx
STATE
State has often been cast as the enemy of concurrent programming and the same applies at higher
levels of abstraction as well, such as multiple compute instances. Mutable state requires locking and
tracking in concurrent environments, which adds overhead and complexity to applications.
Therefore reducing, or even removing, state is an ideology worth pursuing.
Sometimes state is specific to a single user, such as session state. Load balancers in the Azure data
centres are round-robin, therefore as soon as you have more than one web front end you can no
longer store session state in process (the default); if session state is critical to your application, look to
move it to SQL Azure or Table Storage instead. However, session state is typically abused and is
often not actually required in the situations where it is used. As an alternative to sessions,
consider claims-based security, such that any page request is accompanied by a set of claims. The
AppFabric Access Control Service can assist with this.
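To make the idea of a request that carries its own claims concrete, here is a minimal sketch. It is not the Access Control Service token format or protocol; the shared secret, `issue_token` and `read_claims` are hypothetical, and a real system would use a standard token format with an expiry. The point it illustrates is that any web instance holding the key can validate the request without consulting in-process session state.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret shared by every web role instance (assumption for the sketch).
SECRET = b"demo-shared-secret"


def issue_token(claims: dict, secret: bytes = SECRET) -> str:
    """Serialise the claims and append an HMAC so they cannot be altered."""
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def read_claims(token: str, secret: bytes = SECRET):
    """Return the claims if the signature is valid, otherwise None."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


token = issue_token({"user": "ann", "role": "admin"})
claims_seen_by_any_instance = read_claims(token)

# Flip the last character of the signature to simulate tampering.
tampered = token[:-1] + ("0" if token[-1] != "0" else "1")
claims_from_tampered = read_claims(tampered)
```

Because the token is self-describing, the round-robin load balancer can send consecutive requests to different instances without any of them needing to share memory.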
DISTRIBUTE WORKLOADS EFFECTIVELY
Typically when multiple sources need to access a resource there is a level of contention. Locks and
leases need to be taken and other threads are blocked until contention is resolved. As with state,
this problem exists in all forms of concurrent programming, and is as important in multi-instance
work sharing scenarios. Worker roles need to pick up items for processing, but when there are
multiple instances of the same worker role, how do we ensure that each instance does not pick up
the same work item?
The ‘Asynchronous Work Queue Pattern’ is one such solution. By providing a robust, redundant
queuing mechanism that hands each work item to only one worker at a time, the workers are ignorant of
leases and locks and can focus on the job of processing work items. Such a queue will be reusable for
many different work types, and the Windows Azure Storage Queue service is an ideal candidate.
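The behaviour that makes such a queue safe for multiple workers can be modelled in a few lines. This is a simplified in-memory model, not the Storage Queue API; `WorkQueue` and its method names are hypothetical, and the clock is passed in explicitly to keep the example deterministic. A retrieved message becomes invisible to other workers for a timeout and is only removed when the worker deletes it, so an item held by a crashed worker reappears later. Processing should therefore be idempotent.

```python
import itertools


class WorkQueue:
    """Toy model of a queue with a visibility timeout."""

    def __init__(self, visibility_timeout=30.0):
        self._timeout = visibility_timeout
        self._messages = {}              # message id -> [body, time it becomes visible]
        self._ids = itertools.count(1)

    def put(self, body):
        self._messages[next(self._ids)] = [body, 0.0]

    def get(self, now):
        """Hand out the oldest visible message and hide it from other workers."""
        for message_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout
                return message_id, entry[0]
        return None

    def delete(self, message_id):
        """Called by the worker once the item has been processed successfully."""
        self._messages.pop(message_id, None)


queue = WorkQueue(visibility_timeout=30.0)
queue.put("resize image 1")
queue.put("resize image 2")

worker_a = queue.get(now=0.0)     # worker A takes the first item
worker_b = queue.get(now=1.0)     # worker B gets a different item
nothing_left = queue.get(now=2.0)

queue.delete(worker_a[0])         # A finishes; B crashes without deleting
redelivered = queue.get(now=40.0) # B's item is visible again after the timeout
```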
There are other messaging architectures that allow us to decouple our components. AppFabric
allows a ‘NetEventRelayBinding’ for Publish/Subscribe scenarios, for example.
MAXIMISE RESOURCES
One could argue that if your CPU is not at 100% it is being underutilised. In Azure you pay for the
core regardless of usage, so it makes sense to get the most bang for your buck.
When using worker roles, multi-threaded architectures are often forgotten. Since adding another
instance means an additional hourly cost, first ensure you are getting the most out of your current
instances. If your worker (or web role for that matter) has lots of IO work, it makes sense to use
multiple threads.
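As a rough illustration of using threads for IO-heavy work inside a single instance, the sketch below uses a thread pool. It is written in Python rather than the .NET code a worker role would actually run, and `fetch` merely sleeps to stand in for a storage or network call; both are assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item):
    """Stand-in for an IO-bound call such as reading a blob or calling a service."""
    time.sleep(0.05)      # the thread is idle here, waiting on 'IO'
    return item * 2


def process_all(items, max_workers=8):
    """Overlap the waiting time of many IO calls on one instance."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map preserves input order even though calls run concurrently.
        return list(pool.map(fetch, items))


results = process_all(range(16))
```

Run sequentially, the sixteen calls would spend most of their time waiting; with eight threads the waits overlap, so the same instance gets through the work in a fraction of the time without any additional hourly cost.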
Auto-scaling resources is worth investigating also. Typically an IT department will maintain enough
servers to cope with their peak periods; consider instead starting at trough capacity, and use auto-
scaling functionality to add instances dynamically. When load starts to taper off, start scaling down,
cutting costs as you do.
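The scaling decision itself can be a very small function. The sketch below is an assumption about how one might size a worker pool from queue length; `target_instances`, the per-instance capacity and the minimum and maximum bounds are hypothetical values, and applying the result would be done through the platform's management interface, which is not shown.

```python
import math


def target_instances(queue_length, per_instance_capacity,
                     min_instances=2, max_instances=20):
    """Return how many instances should be running for the current backlog.

    per_instance_capacity: items one instance can clear per scaling interval.
    """
    needed = math.ceil(queue_length / per_instance_capacity)
    # Never drop below trough capacity, never exceed the agreed cost ceiling.
    return max(min_instances, min(max_instances, needed))


trough = target_instances(queue_length=0, per_instance_capacity=100)
busy = target_instances(queue_length=950, per_instance_capacity=100)
peak = target_instances(queue_length=5000, per_instance_capacity=100)
```

The upper bound matters as much as the lower one: it caps the bill if load spikes unexpectedly.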
Currently you can utilise the content delivery network (CDN) to push blobs out to localised edges. This
will help improve latency for your customers. Also consider what could qualify for blob storage;
essentially anything static is a contender:
PDF, Word documents
Videos
Website images
Website CSS and JavaScript libraries
Any static HTML website pages
Silverlight files (XAP)
Blob storage currently allows blobs to be stored in the root container. This feature was specifically
included so that Silverlight applications running from blob storage could place a cross domain policy
file at the root of the URL namespace (a requirement for cross domain policy files).
SUMMARY
While not exhaustive, this article gave you a brief overview of some key principles to keep in mind
when architecting applications to run on the Windows Azure Platform. By following these guidelines
you should be able to achieve core objectives of scalability and cost recovery in the cloud.
THE WINDOWS AZURE PLATFORM AND COST-ORIENTED ARCHITECTURE
By Marcus Tillett
COST IS IMPORTANT
Cost-oriented development is nothing new. A low cost approach to building an application or
product is desirable but the methodology used to achieve this is not always very sophisticated.
When considering a cloud platform such as Azure, the cost implications of the chosen architecture
can be significant and require a more sophisticated approach. While a traditional on premise or
hosted architecture may not consider cost as a significant factor, cost is an area that receives
significantly more focus for Azure. There is a range of costs that need to be considered; these costs
need to be considered in the context of Azure and of the end to end development and application
lifecycle management processes.
WHAT COSTS TO CONSIDER
The development process can be a significant cost consideration for Azure. There is a continuum of
development strategies for Azure; from, at one extreme, using the Azure environment for
development, to the other extreme, developing without any reference to Azure. There are cost
implications and significant other pros and cons across this continuum. As an example, consider the
use of software factories. With a software factory that uses a strict assembly process, the cost of using
the production platform may be prohibitive due to the expense of both the platform and training
required. These concerns would drive a cost-oriented architecture where all Azure specific
components are abstracted from the developer or potentially replaced with non-Azure components.
While this may be an extreme example, it does highlight one of several areas to be considered.
Summary
Cost is much more of an architectural consideration for Azure than for a traditional on premise or
hosted solution.
Cost implications of the chosen architecture can be significant.
Costs should be considered in the context of the end to end development and application lifecycle
management processes.
Model costs for the chosen architecture but most importantly test the model.
Another significant topic is the methodology applied to the migration of an existing application or
the consideration for setting up data required by a new application. Migration and set up need to
include both the application and the data. The time, processes and procedures needed to transfer
large volumes of data or complex data, in particular, may be a significant undertaking. With the
potential complexity of managing changes to a live data source, the total business cost of the chosen
approach can be a critical factor.
The cost implication of the platform itself is, perhaps, the obvious area to necessitate a cost-oriented
architecture. It is natural to be drawn to, for instance, the dramatic price difference for data storage
between SQL Azure and Windows Azure storage. While this may be critical to some applications, it is
better on balance to construct a solid architecture first, as this will provide a better long term approach
than initially focusing on cost. This should be supported by modelling the costs of all the components
of the application. However, it is even more important to test this model for the most cost critical
aspects of the application, thereby providing an understanding of how the application design and
the charging mechanisms of Azure impact the cost model. With this information the architecture can
be reviewed for significant cost savings. For any aspects that are cost critical, monitoring should be
included in the final application and used to tune the system while ensuring that the evolution of
Azure and the application are analysed for significant cost implications. Indeed monitoring the whole
system as a means to verify costs and SLA is another architectural consideration.
As a way to augment the full cost modelling process, there are some scenarios where the cost of the
platform suggests a cost-oriented architecture. One of these is multi-tenanting of an application
where there are high tenant numbers. A basic on premise or hosted server model with a pair of
servers can enable the creation of a separate IIS web site and SQL Server database for each tenant.
This model supports tens or perhaps hundreds of tenants for nearly the same cost as a single tenant.
Translated to Azure, the same architecture might consist of a Windows Azure Web Role and a 1GB SQL
Azure database per tenant. This would equate to an approximate monthly cost of US $100 per tenant, but the
cost of this Azure architecture scales linearly with tenant numbers. This is not to state that Azure is
not suitable for multi-tenanted applications, but that where cost is a critical factor for the
application a different architectural approach may be required.
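The multi-tenant comparison can be turned into a tiny cost model. Only the US $100 per tenant figure comes from the text; the fixed monthly cost of a server pair (2,000 here) and the capacity of 100 tenants per pair are hypothetical numbers chosen purely to show the shape of the two curves.

```python
def azure_monthly_cost(tenants, cost_per_tenant=100.0):
    """One web role plus one 1GB database per tenant: cost grows linearly."""
    return tenants * cost_per_tenant


def shared_server_monthly_cost(tenants, fixed_cost_per_pair, tenants_per_pair=100):
    """A server pair hosts many tenants: cost grows in steps, one pair at a time."""
    pairs = -(-tenants // tenants_per_pair)   # ceiling division
    return pairs * fixed_cost_per_pair


# Hypothetical fixed cost for a hosted server pair (assumption, not from the text).
PAIR_COST = 2000.0

single_tenant = (azure_monthly_cost(1), shared_server_monthly_cost(1, PAIR_COST))
fifty_tenants = (azure_monthly_cost(50), shared_server_monthly_cost(50, PAIR_COST))
```

With these assumed figures the per-tenant Azure design is far cheaper for one tenant and considerably more expensive for fifty, which is why a high tenant count may call for a design in which tenants share roles and databases. As the article stresses, a model like this should then be tested against real usage.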
CONCLUSION
Whether the considerations described here could be termed cost-driven[8] or cost-oriented
architecture[9][10], the terminology is less important than the realisation that cost is much more of an
architectural consideration for Azure than for a traditional on premise or hosted solution.
[8] Lessons Learned: Building Multi-Tenant Applications with the Windows Azure Platform
http://microsoftpdc.com/Sessions/SVC33
[9] Thinking of... Delivering Solutions on the Windows Azure Platform?
http://www.amazon.co.uk/Thinking-Delivering-Solutions-Platform-Questions/dp/0956155634/
[10] Windows Azure Platform for Enterprises http://msdn.microsoft.com/en-us/magazine/ee309870.aspx
DE-RISKING YOUR FIRST WINDOWS AZURE PROJECT
By Simon Munro
Developer enthusiasm for building solutions based on Azure is not always shared by business. While
it is great (and perhaps obvious to us) that the cloud is ‘the way of the future’, some individuals,
organizations and vendors are ready for the change while others are not. Not all vendors have
technologies for the cloud and many businesses, products, industries and jobs will go as the cloud
wave washes them out to sea. Vendors are scrambling for attention and pushing their biased
marketing oriented opinions through the biggest dinosaurs of all – the print media, that culturally
could not even cope with the changes brought on by the Internet. Most anti-cloud and vendor
bashing opinion plays on fear and its business cousin risk, where the urge is to maintain the status
quo in our (currently) risk-averse environment. It is unsurprising then that the people that we need
to make decisions about cloud computing in our own organizations are confused, wary and reluctant
to make a commitment to our latest idea of running our solution on Azure. The term ‘the cloud’ has
become synonymous with ‘the web’ and is indistinct from ‘cloud computing’ platforms that we are
interested in – the unfortunate side effect being that the behaviour of Google, Facebook, Apple and
other web-consumer facing properties that willy-nilly change terms of service and sell personal data
for profit casts a shadow over business oriented cloud computing services.
While the dust may settle at some point in future, if we want to build a solution on Azure any time
soon, we will have to take responsibility for helping business understand the issues in order to gain
their support. While we may prefer to deal only with technical issues, the current reality is that in
most environments we have to proactively discuss the perceived risks and demonstrate that we, as
well as Microsoft and the Windows Azure platform, are actively managing and reducing risk.
POPULAR RISKS
Risks to data are by far the most publicised because once data is in databases that are outside of an
organization’s locked-down data centre a degree of control and authority over the data is lost.
Unlike students that go and live in co-ed dorms, data does not get drunk and put pictures of itself up
on Facebook when it leaves home, but the suspicion still remains that off premise data is a high risk.
While the risk to data may increase, the actual risk, in most cases, is greatly exaggerated and
manageable.
Process related risks are also well known, centred on the involvement of other parties in the
operational aspects of the solution. No longer can business dictate service levels or even have the
confidence in an external supplier of services that they may have had in their own internal IT. As
with data, there are real issues here that have fairly complex contractual ramifications as customers
attempt to reduce vendor lock-in, guarantee service levels and maintain operational, security,
performance and other standards.
COVERT RISKS
While mainstream CIO information sources popularise some risks by extensive coverage, there are
many risks that are just as real but less well known, often due to their more technical nature.
The most obvious is the lack of skills and experience in creating secure, reliable and performant
cloud computing solutions. This is also related to the problem of development engineering costs that
could be higher than simply throwing hardware at performance bottlenecks.
Even Microsoft, as our trusted provider of platforms and tools, still has risks embedded within Azure.
The lack of on-premise alternatives to cloud technologies such as Azure tables and queues makes
the commitment to the platform quite high (a kind of vendor lock-in) and the tooling is still
immature and unable to easily support accepted engineering practices such as continuous
integration (see ‘Using a CI build to achieve an automated deployment of your latest build’ by Grace
Mollison).
NON-TECHNICAL TACTICS FOR REDUCING RISK
While ultimate responsibility for managing risk falls to project managers and other people within the
organization, the identification of risk still remains the responsibility of everybody on the team. By
downloading this book you have more knowledge of cloud computing than many of your co-
workers, so before getting into the technical aspects, you will need to shoulder additional
responsibility and deal with some aspects of reducing risk that do not involve code.
CHOOSE THE CORRECT APPLICATION
Choose something simple that is better suited to cloud computing, such as one that is public facing
and may have demand peaks. Build on those successes before tackling applications that contain
sensitive data, integrate with a lot of other systems, are a migration of an existing legacy system or
contain a lot of traditional database storage and reporting.
ENGAGE EARLY
Even if your project is a low profile skunk works development, you need to engage with legal,
compliance, operations, finance, audit and other parts of the business sooner than usual. Normally
we would not worry about throwing up a new website onto our existing data centre, but if you
surprise people with a rogue cloud computing application it may get shot down.
UNDERSTAND THE PRICING AND OPERATIONAL MODEL
As much as it may look simple on the surface, digging deeper into the pricing, billing, SLAs and
related aspects of cloud computing platforms can become complicated, with broad reaching impacts
on legal positions, compliance and interdepartmental feuds. You have to at least put the Azure
prices in a spreadsheet with your estimated requirements and put an annotated printout of the SLA
in your project sponsor’s hands.
UNDERSTAND THE IMPLICATIONS
While it may be unnecessary to do a full threat model, you need to understand the possible
financial, reputational and other risks if your application is compromised or the data gets lost.
Understanding the effects of loss should influence your approach to what data is stored on the
cloud, for how long and whether it is moved to on-premise storage.
FAMILIARISE YOURSELF WITH ON-PREMISE RISKS
Because cloud computing is seen to have security risks, the focus on security often means that the
solution is more secure than the on-premise counterparts. Whenever defending the risks of cloud
computing make sure that you compare them to the existing everyday risks of the existing on-
premise platform. Not all solutions, networks and other infrastructure can actually deliver the
availability and security that they promise.
UNDERSTAND THE APPETITE FOR RISK
Culturally, startups can absorb cloud computing risks as part of their overall risk exposure, compared
with more risk-averse organizations such as banks that are, at least this year, less likely to absorb
additional risk. More mature organizations have processes and committees for managing risks and,
although it may ultimately be the project sponsor’s responsibility, you need to get a feel for the
ability of the organization to take on risk before you pitch your big idea.
TECHNICAL TACTICS FOR REDUCING RISK
HOW EXTREME?
Microsoft has made it quite simple to take a good ‘ol ASP.NET web application with an underlying
SQL database and throw it up onto the Azure cloud with minimal changes. On the other hand,
building a well architected solution that has been optimised for a cloud computing environment is
more difficult, involved and risky. If your system is being built within a risk averse environment and
does not need to be built for the cloud, forgo Azure storage, worker processes, federated identity
management and other cloud specific technologies and build a simple solution with web roles and a
SQL database. Azure will support you well whichever approach you choose, but you need to figure out
how much of the fancy new stuff you really need and make those decisions early.
DEFINE THE APPROACH TO DATA
When it comes to cloud computing risks, data is the most sensitive and active topic and it needs to
be addressed early on in the solution design. Fortunately SQL Azure addresses many of the concerns
and risks around the NoSQL-like Azure tables by providing a familiar database platform if such
familiarity is required, but ultimately Azure storage, caching and other technologies need to be
considered in any good Azure architecture. Whatever the bias for storage in the Azure cloud, there
is still the issue that the data is in the cloud and it needs to be dealt with in your architecture. There
may be a requirement to move or copy data from Azure to an on premise database for reporting,
integration with other systems or even just the feeling that the data is safer.
MANAGE THE ENGINEERING COST
Unless you have built a reasonable sized application on Azure and deployed it in a live environment
there are going to be unforeseen technical challenges that will present themselves. By reading this
book you are clearly on the right track and trying to learn from the experiences of others, but you
need to do a lot more than just read or learn on the job. You need to install the tools, write code,
deploy, put it under load, scale up, scale down, debug, diagnose and try out a lot of unfamiliar
patterns and technologies just to reduce the impact of unforeseen quirks.
IMPLEMENT WITH GOOD ENGINEERING PRACTICES
The future of your first Azure application is fairly unsure – cast your mind out two years and you
cannot be sure that your architectural choices were correct, technical components have been added
or abandoned, regulations have changed or the attitudes of your organization towards cloud
computing have altered. The concerns raised by the software craftsmanship movement of
maintainability, testability, extensibility are amplified in such an environment which is years from
settling down. The Azure combination of a well established platform in the .NET ecosystem and
some new technologies, approaches and thinking thrown in means that we have both the need and
the frameworks to craft solutions properly to reduce the risk that we are exposed to. Testability,
inversion of control, loose coupling and other software craftsmanship techniques are well
supported, understood and debated on the .NET platform and are therefore (reasonably) portable
onto Azure. You need to hone these skills as single layered, monolithic architectures that seem easy
at first and are encouraged by Microsoft marketing and tooling will result in an approach with high
and unnecessary risk in an already risky space.
DEVELOPER RESPONSIBILITY
While technologists may be excited at the technical opportunities of cloud computing, business and
other decision makers are probably more wary of the cloud than any other (recent) computing
technology shift. They are reading conflicting messages from vendor marketers and self-proclaimed
cloud experts while their own staff are both protecting existing jobs and whispering discord in the
passageways. So while risk management and selling of architectures may not be amongst the most
exercised developer skills, cloud computing requires that we take cloud computing to the business
and take some responsibility for allaying fears.
TRIALS & TRIBULATIONS OF WORKING WITH AZURE WHEN THERE’S MORE THAN ONE
OF YOU
By Grace Mollison
I had enormous fun working on an Azure project, See the Difference, that took 7 weeks from start of
development to handing over to the client.
The technology stack used was: Windows Azure hosting, Windows Azure Storage, SQL Azure,
ASP.NET MVC, N2CMS, Spark View Engine, Castle Windsor, xVal, PostSharp.
There was one bugbear: the Azure development experience is NOT designed for a team of
developers, and I needed to get that sorted out. So where did I start?
With a list of course. Here were the big ticket items:
The ability to set up three environments: Development, Testing and UAT, with Testing and UAT
accessible by all members of the team
Shared access to the hosted environment
Automated deployments to the cloud as part of a CI build. After all, what self-respecting
development team doesn’t have a continuous integration build?
DEVELOPMENT ENVIRONMENT
For the development environment we stuck to Visual Studio 2008 SP1. Visual Studio 2010 was in
beta 2/RC when we undertook the development, but with all the potential unknowns with Azure that
was a step too far. The Azure developer tools were installed on each developer workstation and the
Azure SDK on the build server. There was an upgrade to the Azure SDK during the development cycle
which the development team said was needed, which meant manually updating the various machines
that constituted the environment (alas, no WSUS). Fortunately this only happened once
during the development cycle.
In addition to Visual Studio we also supplemented the development environments with a few extra
tools that provided a more complete development experience.
TEST ENVIRONMENT
The Test environment proved to be more challenging. The most pragmatic way to sort it out was to
provision another development workstation running the development fabric. But (yes I know
there’s always a “but”) the development fabric runs against the local loopback address. To get round
this an SSH tunnel had to be set up between the target machine and the client machines that needed
to access it. Alas this proved to be slightly less than user friendly, and the fact that the random allocation
of ports for the local storage fabric had to be resolved after each new deployment made it basically
unworkable. The differences between the development fabric and the Azure fabric were also impacting
the team deliverables, as we ended up seeing differences in behaviour or could only test certain
functionality in the staging environment. We resorted to using Azure Staging as our Test
environment.
I was anticipating an easy ride from here on but... yes, it’s another of those “buts”.
CERTIFICATES
The team members needed either to use their own self-signed certificates or to use a certificate I
generated, which is then uploaded onto Azure. As the team was small and fluid the decision went
with using one I generated. This turned out to be a good call, as we did have problems with
certificate connections apparently timing out after some time for some team members for no
obvious reason. Because there was only one certificate to worry about it was relatively painless to
resolve the problems around its use. It is bad practice to share certificates in this way but
pragmatism was the order of the day. For a larger team with a longer development cycle I would
advocate each developer using a personal certificate which can then be easily revoked.
One thing we quickly learnt was that in the early days of development, suspending and then deleting
the existing deployment was the safest approach to deploying a new package. The small team meant it
was easy to communicate the change of URL this caused.
WHEN THINGS GO WRONG
It’s a fairly nerve-racking experience when things go wrong, as often you can do nothing but wait for
Azure to barf and throw a Dr Watson, and there’s no real feedback when Azure tries to spin up the
roles.
Alas, as soon as we got to UAT we had to give up our staging environment and minimise
changes to the Staging URL, as both the client and a third party needed to know the URL. The loss of
this environment for system testing meant we were forced to press my personal Azure account into
service as the Staging environment.
We did get the automated deployment in place but it’s a tale too long to describe in this article.
SUMMARY
The Windows Azure Platform may not be quite ready for team development out of the box, but
once you understand what needs to be addressed the barriers to team development are easily
overcome. You can, with a small amount of work up front, treat development for the Windows
Azure Platform as you would any other application developed using your familiar team
development tools.
USING A CONTINUOUS INTEGRATION BUILD TO ACHIEVE AN AUTOMATED DEPLOYMENT
OF YOUR LATEST BUILD
By Grace Mollison
This article assumes familiarity with Team Foundation Build and MSBuild concepts such as tasks and
properties.
Setting up a Continuous Integration (CI) build to automatically push a successfully built package
directly to Azure cannot be achieved straight out of the box; it requires some additional work. This
article outlines an approach taken whilst delivering the See the Difference project using the
Windows Azure Platform.
GETTING THE RIGHT “BITS”
The first thing that was done was to collate and configure the components that would be needed to
allow the build server to access the target Windows Azure portal via a command line.
Doing this requires the Azure Service Management API, and using the API requires an X.509
certificate. I created a self-signed one using the makecert tool, which is part of the Windows SDK. An
example of how to do this is shown below:
"c:\Program Files\Microsoft SDKs\Windows\v6.0A\bin\makecert" -r -pe -a sha1 -n "CN=Windows Azure Authentication Certificate" -ss My -len 2048 -sp "Microsoft Enhanced RSA and AES Cryptographic Provider" -sy 24 MySelfSignedCert.cer
The blog post Creating and using Self Signed Certificates for use with Azure Service Management API
explains in detail how to configure the certificate on the target Azure portal and the machine that
needs to communicate with the portal.
I downloaded the Windows Azure Service Management PowerShell CmdLets and also the Windows
Azure Service Management API Tool which are both handy for remotely accessing the Azure portal
via the Service Management API. At this stage I had no idea which one I would be using. I tried them
both as part of a build and found that I preferred using the Service Management API tool, csmanage
(despite being a big fan of PowerShell). The blog post referred to above illustrates the use of the
X.509 certificate, the API and PowerShell to deploy to the Azure staging environment.
PACKAGING FOR DEPLOYMENT
Next I looked at packaging the application ready for deployment. There are two key things when
packaging the application from the command line:
1. Obtain the role types and names, as these will be needed to construct the package
2. Make sure the location of the service definition file is known
The ServiceDefinition.csdef file contains the role types and names, which are needed to construct the
package using the Windows Azure command line tool cspack. Below is a snippet from a
ServiceDefinition.csdef file illustrating a simple example with one web role. The number of instances
does not matter to cspack:
<ServiceDefinition name="SeeTheDifference.Cloud"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="SeeTheDifference.Web" enableNativeCodeExecution="true">
    <InputEndpoints>
If cspack is not run from the correct place the package will not be constructed correctly, which is why
the location of the ServiceDefinition.csdef file is so important.
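Step 1 above, obtaining the role types and names, can be scripted rather than read by eye. The sketch below is one possible way to do it with Python's standard XML parser; it is not part of the build described in this article. The sample document reuses the names from the snippet above but is closed off minimally so that it parses, which is an assumption: a real ServiceDefinition.csdef contains more elements.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the snippet above.
NS = "{http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition}"


def list_roles(csdef_xml: str):
    """Return [(role type, role name), ...] from a service definition document."""
    root = ET.fromstring(csdef_xml)
    roles = []
    for child in root:
        role_type = child.tag.replace(NS, "")     # strip the XML namespace prefix
        if role_type in ("WebRole", "WorkerRole"):
            roles.append((role_type, child.get("name")))
    return roles


# Minimal, hypothetical completion of the snippet so that it is well-formed.
SAMPLE_CSDEF = """<ServiceDefinition name="SeeTheDifference.Cloud"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="SeeTheDifference.Web" enableNativeCodeExecution="true">
    <InputEndpoints />
  </WebRole>
</ServiceDefinition>"""

roles = list_roles(SAMPLE_CSDEF)
```

The resulting list is exactly the information needed to build the `/role:` arguments passed to cspack.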
DEPLOYING
At this stage I was able to package the application and deploy to the Azure portal via MSBuild. We
had concerns with this approach, as problems with the actual package could affect the
deployment. In particular we were concerned about what to do after handover to the client, when a
little more caution would be called for. A change of plan was decided upon.
The new plan was to push the package to blob storage, and then the client would be able to carry
out the deployment at their convenience.
To push the package to blob storage, a C# console application I called LoadBlob was written that
could be called from the MSBuild script. This application pushed the package to a pre-determined
container.
It was decided that storing the configuration (.cscfg) file in blob storage was also a good idea, as it
would reduce the risk of non-production configuration settings being used. During testing I was
unable to get the Service Management API to use the stored configuration file. It was only able to
use one stored on the local system, but as the end to end deployment process we were
implementing actually required a pause for breath before the push to Azure Staging or Production,
this issue did not affect the implementation of the CI build process.
Finally, after testing all the constituent parts, they were incorporated as part of the CI build.
Below is a snippet from a TFSbuild.proj file where I overrode the target AfteDropBuild.
The AfterDropBuild task is called after dropping the built binaries and I used it to insert some
commands to allow the build to use cspack ( equivalent to zipping the dlls and configuration files )
to package the cloud service package which is then pushed up to blob storage ready for deploying
to staging or Production.
<PropertyGroup>
<PathToAzureTools>c:Program FilesWindows Azure
SDKv1.0bincspack.exe</PathToAzureTools>
<cPkgCmd>"$(PathToAzureTools)" SeeTheDifference.Cloud.csxServiceDefinition.csdef
/role:SeeTheDifference.Web;seeTheDifference.Cloud.csxrolesSeeTheDifference.Webapproot;See
TheDifference.Web.dll</cPkgCmd>
<LoadblobPath>c:TOOLSAzureDeployment</LoadblobPath>
<LoadBlobCmd>$(LoadblobPath)Loadblob.exe </LoadBlobCmd>
</PropertyGroup>
33
34. The Windows Azure Platform: Articles from the Trenches
<Target Name="AfterDropBuild" DependsOnTargets="DeriveDropLocationUri" Condition=" '$(IsDesktopBuild)'!='true' ">
  <Message Text=" cspack creating a package for deployment"/>
  <Exec Command="$(cPkgCmd) /out:c:\Drops\SD_Deploy\$(BuildNumber).cspkg"
        WorkingDirectory="c:\Drops\test\$(BuildNumber)\Release\SeeTheDifference" />
  <!-- Upload the package to the Azure deployment container, which is set via config file settings.
       The target container will be cleared before uploading. -->
  <Message Text=" Copying '$(BuildNumber)'.cspkg to deployment container in Azure " />
  <Exec Command="$(LoadBlobCmd) -upload $(BuildNumber).cspkg"
        WorkingDirectory="c:\Drops\SD_Deploy" />
</Target>
The screenshots below show the uploaded cspkg in blob storage:
The deployment could then be completed by using a user-friendly tool like Cerebrata Cloud Storage
Studio.
USING JAVA WITH THE WINDOWS AZURE PLATFORM
By Rob Blackwell
With a name like Windows Azure, you could be forgiven for thinking that Microsoft’s cloud
computing offering is a Microsoft-only technology. In fact it has a lot to offer Java developers
through its use of open standards and RESTful APIs.
ACCESSING WINDOWS AZURE STORAGE FROM JAVA
WindowsAzure4J is an open source library that can be used to access Windows Azure Storage from
Java applications, running on Windows Azure or elsewhere. Download the JAR file from
http://www.windowsazure4j.org/. You’ll also need to grab some other dependencies:
commons-collections-3.2.1.jar
commons-logging-1.1.1.jar
dom4j-1.6.1.jar
httpclient-4.0-beta2.jar
httpcore-4.0.jar
httpcore-nio-4.0.jar
httpmime-4.0-beta2.jar
jaxen-1.1.1.jar
log4j-1.2.9.jar
To get started, you’ll need an account name and account key from the Windows Azure portal. Paste
these into the sample code provided with WindowsAzure4J to use Blobs, Queues or Tables. If you are
an Eclipse user, you can also install the Windows Azure Tools for Eclipse from
http://www.windowsazure4e.org/.
The Windows Azure Storage Explorer running in Eclipse.
RUNNING JAVA CODE ON WINDOWS AZURE
If you want to host a Java application in Windows Azure, there are a number of considerations.
The first thing to note is that even if your Java application is a Web application you probably won’t
want to use an Azure Web Role. The principal difference between web roles and worker roles is
whether Internet Information Services (IIS) is included. Most Java developers will want to use a
Java-specific web server or framework, so it’s usually best to go with a worker role and include your
choice of web server within your deployment package.
You’ll also need to bootstrap Java from a small .NET program that will essentially invoke the Java
runtime through a Process.Start call.
Both web roles and worker roles are provisioned behind a load-balancer so either is suitable for
hosting web applications. In a worker role you just have to do some additional plumbing to connect
up your web server to the appropriate load-balanced Input End Point. So for example, the
public-facing port 80 of yourapp.cloudapp.net might get mapped to, say, port 5100 in your worker role.
The following code allows you to determine this port at runtime:
RoleEnvironment.CurrentRoleInstance.InstanceEndpoints["Http"].IPEndpoint.Port
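As a rough illustration of the bootstrapping step described above, the following C# sketch builds the command line for the Java runtime and launches it through Process.Start. The class name JavaBootstrapper, its method names and the example paths are hypothetical and chosen only for illustration; in a real worker role the port argument would be the value read from the endpoint lookup shown above.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: not part of the Windows Azure SDK or of AzureRunMe.
public static class JavaBootstrapper
{
    // Builds the start info for a Java process that is told which port to listen on.
    // The port is passed as the last command-line argument to the Java main class.
    public static ProcessStartInfo BuildStartInfo(
        string javaExePath, string classPath, string mainClass, int port)
    {
        if (port < 1 || port > 65535)
            throw new ArgumentOutOfRangeException("port");

        return new ProcessStartInfo
        {
            FileName = javaExePath,
            Arguments = string.Format("-cp \"{0}\" {1} {2}", classPath, mainClass, port),
            UseShellExecute = false,          // required in order to redirect output
            RedirectStandardOutput = true,
            RedirectStandardError = true
        };
    }

    // Starts the JVM, forwards its output to the trace listeners and
    // blocks until the process exits. Returns the process exit code.
    public static int Run(ProcessStartInfo startInfo)
    {
        using (Process process = Process.Start(startInfo))
        {
            process.OutputDataReceived += (s, e) => { if (e.Data != null) Trace.WriteLine(e.Data); };
            process.ErrorDataReceived += (s, e) => { if (e.Data != null) Trace.WriteLine(e.Data); };
            process.BeginOutputReadLine();
            process.BeginErrorReadLine();
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A worker role's Run method could call BuildStartInfo with the bundled JRE path and the port discovered at runtime, then call Run and let the role recycle if the JVM exits.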
Fortunately, both the Tomcat Solution Accelerator and AzureRunMe handle all of these
technicalities for you.
The Tomcat Solution Accelerator is a good choice if you have a traditional Java-based web
application. It supports Java Servlet and Java Server Pages applications, possibly packaged as a WAR
file. It can be downloaded from http://code.msdn.microsoft.com/winazuretomcat. The accelerator
walks you through the process of creating an Azure cloud services package file that contains your
application as well as the Tomcat server and Java Runtime. It automatically handles the necessary
configuration. Just upload the resulting cspkg file to Windows Azure, wait for it to deploy, then bring
up your web browser and browse to http://yourapp.cloudapp.net
AZURERUNME
AzureRunMe (http://azurerunme.codeplex.com/) doesn’t assume any particular web server or
framework. In fact you could just run a straightforward command line application with no visible
user interface. That said, I’ve used it successfully with both Restlet (http://www.restlet.org/) and
Jetty (http://jetty.codehaus.org/jetty/ ).
Imagine that you were going to run your application from a USB drive and that you weren’t allowed
to install any software onto the machine. You’d have to include the Java Runtime Environment (JRE),
all the library JAR files and any data, all in subdirectories of the USB stick. You’d probably create a
.BAT file at the top level to run everything, like this:
cd MyApp
..\jre\bin\java -cp MyApp.jar;lib\* Start %1
AzureRunMe takes a similar approach: put all these files together in a single ZIP file and upload it to
Blob storage. Then download the AzureRunMe cspkg file and use it to bootstrap your Java code.
Notice that the batch file takes a parameter, %1. This is the port that you should use if you want to
bring up a web server; the load balancer will direct all HTTP traffic to your application on this port.
AzureRunMe comes with a Trace listener that uses the Service Bus to relay standard output and any
log4j messages back to a command window on your desktop machine. It makes it easy to see trace
messages, watch your application’s progress and see any exception messages.
AzureRunMe Trace Listener showing log messages relayed via the AppFabric Service Bus.
For more information about Interoperability on the Microsoft platform see
http://www.interoperabilitybridges.com/
CHAPTER 3: WINDOWS AZURE
AUTO-SCALING WINDOWS AZURE COMPUTE INSTANCES
By Steven Nagy
INTRODUCTION
There are many reasons applications need to scale. Some applications have on/off periods of batch
processing (for example overnight render farms), some have predictable peak loads (for example
share market applications peak during open and close of the market) and some might have
unpredictable peak periods (for example your website gets linked by Slashdot).
In the case of predictable peak loads we can easily log in to the Windows Azure portal and adjust our
configuration file to increase the number of instances of our web and worker roles. However, when
application load peaks unexpectedly, we want our applications to respond immediately. For
applications with global reach, this might be when we least expect it. Without appropriate
monitoring techniques we may not even know the extent to which we are failing to serve requests.
On the flip side, we are paying for every CPU core hour we use. Thus we want to be able to scale
down instances that are underutilised.
We need to know how to auto-scale; our applications need to become smart.
A BASIC APPROACH
There are a number of jigsaw pieces that need to fit together to build the auto-scaling picture. The
first piece is monitoring, which lets us pull information from the roles that need to auto-scale. The
next piece is about establishing rules and measuring against thresholds to determine when to scale
up and scale down. The third piece establishes trust between the service that is doing the
monitoring (referred to from here on as the ‘Scale Agent’), and the roles that are being monitored.
Finally, the Scale Agent needs to instruct the Windows Azure Portal to add or remove instances of
those roles as it deems necessary.
[Figure: the Scale Agent at the centre of the four pieces: Monitoring, Rules, Trust and Instruct]
THE SCALE AGENT
The Scale Agent is responsible for monitoring your application, applying rules and instructing the API
to scale your roles, and can be hosted in different ways. One option is hosting the agent as another
process on your existing Azure roles, but a role can have many identical instances, so which instance
would it run on? The agent will also take some CPU resources; could that affect its ability to
assess the other work running on the same role? It makes more sense to move the Scale Agent to a
separate location that doesn’t interfere with the standard workload, where its own workload won’t
pollute the statistics.
The agent can be hosted as another worker role, separate to the main work being done by the
application. This worker role would never need to scale, and could be geo-located and co-located
near the compute instances that it needs to monitor. This removes external bandwidth costs and
allows for faster processing/assessment.
You could also host the agent off site completely, perhaps in your own data centre, as a Windows
service. This means you have more control over the agent, but the agent will be slightly slower
communicating to the instances, getting performance counter logs, and issuing scale commands.
A dedicated worker role is usually the best option but also the hardest to configure for trust as we’ll
see further on.
MONITORING: RETRIEVING DIAGNOSTIC INFORMATION
Before we can make decisions about scaling, we need to know some simple statistics about the
services we want to scale. These statistics in turn let us make informed decisions.
Diagnostic helper processes will put performance counter information into table and blob storage,
so this will require an Azure Storage project. There are lots of counters to choose from, but we
usually want to monitor memory usage, CPU usage, and number of requests per second, and if any
one of those exceeds an upper threshold then we want to scale up.
The role that needs to auto-scale will be responsible for gathering its own performance information
and dumping it into a storage table. This is done via configuration classes, available in the
Microsoft.WindowsAzure.Diagnostics namespace:
var perfConfig = new PerformanceCounterConfiguration();
perfConfig.CounterSpecifier = @"\Processor(0)\% Processor Time";
perfConfig.SampleRate = TimeSpan.FromSeconds(5);
We create a configuration item for a performance counter we want to track; in this example we
want information about CPU utilisation. The counter will be sampled every 5 seconds.
var config = DiagnosticMonitor.GetDefaultInitialConfiguration();
config.PerformanceCounters.DataSources.Add(perfConfig);
config.PerformanceCounters.ScheduledTransferPeriod =
TimeSpan.FromMinutes(1);
DiagnosticMonitor.Start("DiagnosticsConnectionString", config);
We then add the performance counter to the list of items we want the DiagnosticMonitor to track
for us. The DiagnosticMonitor runs in a separate process on the virtual machine instance so it won’t
interfere with our normal application code. Every minute new performance counter information will
be written back to a storage account as specified in the ‘DiagnosticsConnectionString’, into a table
called ‘WADPerformanceCountersTable’. We can verify that the counter information made it into the
table using third-party tools.
You can see that each entity in the table has a property called ‘CounterValue’, which contains
our CPU utilisation.
I won’t go into the code required to view an entity in table storage; this is very well documented
already [11]. Your Scale Agent will retrieve these values by polling the table occasionally and keeping
track of the utilisation, scaling when needed.
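To make the "keeping track of the utilisation" step a little more concrete, here is a minimal sketch of how a Scale Agent might aggregate the counter values it has already read from the table. The CounterSample and UtilisationTracker types are hypothetical, and the code assumes the rows have been loaded into memory by whatever table storage client you use; only the property name CounterValue comes from the table described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory shape for one row read from WADPerformanceCountersTable.
public class CounterSample
{
    public string RoleInstance { get; set; }
    public DateTime TimestampUtc { get; set; }
    public double CounterValue { get; set; }
}

// Hypothetical helper used by the Scale Agent after each poll of the table.
public static class UtilisationTracker
{
    // Averages each instance over the time window first and then averages the
    // instances, so an instance that reported more samples does not dominate.
    // Returns null when no samples fall inside the window.
    public static double? AverageUtilisation(
        IEnumerable<CounterSample> samples, DateTime nowUtc, TimeSpan window)
    {
        List<double> perInstance = samples
            .Where(s => s.TimestampUtc > nowUtc - window && s.TimestampUtc <= nowUtc)
            .GroupBy(s => s.RoleInstance)
            .Select(g => g.Average(s => s.CounterValue))
            .ToList();

        return perInstance.Count == 0 ? (double?)null : perInstance.Average();
    }
}
```

Averaging per instance before averaging across instances is a design choice, not a requirement; a weighted or peak-based aggregate may suit other workloads better.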
[11] http://blogs.msdn.com/jnak/archive/2010/01/06/walkthrough-windows-azure-table-storage-nov-2009-and-later.aspx
RULES: ESTABLISHING WHEN TO SCALE
The Scale Agent now knows what levels your various role instances are at, based on the performance
counter information. However, deciding when to scale up or down is difficult and can easily become
an exercise in advanced mathematics. Although the rules are different for every application, here are
some common issues to consider:
• You usually need a certain amount of head room, in case you get a sudden spike in load
  before your Scale Agent can spin up more instances.
• Immediately after scaling up, your original instances might still be over the threshold.
  Prevent your agent from scaling up again until enough time has passed that you
  can be positive that more scale is needed.
• Aggregate your usage from all instances. If a single instance is spiking but the rest are under
  normal load, you don’t really need to scale.
• If you do need more instances, scale up based on how many instances you currently have.
  For example, if you only have 5 instances, you might want to add 2 more (40% increase)
  before checking again. If you have 50 you may only want to add 10 (20% increase).
• Try to predict load based on patterns of behaviour. For instance, if over the last 15 minutes
  you’ve been steadily climbing by 5% utilisation per minute, you can predict that you will
  probably go over your threshold in X number of minutes. Why wait until you are overloaded
  and losing connections before scaling? Analysing these kinds of patterns can let you scale up
  “just in time”.
• Predictive patterns can get very complicated. If at 4pm every day you seem to have
  additional load, prepare in advance for scale rather than waiting for auto-scale to kick in.
• Keep in mind that long-running requests can give misleading readings. If all web threads are
  used for an instance but all those threads are held up in IO requests, you will still have low
  CPU utilisation, so consider a range of performance counters specific to your type of
  application and architecture.
• Hard limits: if your average is 3 instances, would you want your application to be allowed to
  auto-scale up to 500 instances? That’s probably not a credit card bill you want to receive, so
  consider imposing some hard limits to scale, or provide some reasonable alerting (SMS,
  email, etc.) so that if your app DOES scale to 500, you can find out immediately and hop
  online to see why.
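Several of the considerations above (a cool-down period, a step proportional to the current size, and hard limits) can be combined in a small rule class. The sketch below is only one possible arrangement: the ScaleRule type is hypothetical and every threshold in it is an illustrative assumption rather than a recommendation. It expects an average utilisation that has already been aggregated across all instances.

```csharp
using System;

// Hypothetical rule class; all default values are illustrative assumptions.
public class ScaleRule
{
    public double UpperThreshold = 75.0;   // average CPU % above which we scale up
    public double LowerThreshold = 25.0;   // average CPU % below which we scale down
    public int MinInstances = 2;           // never go below this
    public int MaxInstances = 20;          // hard limit to protect the bill
    public TimeSpan CoolDown = TimeSpan.FromMinutes(10);

    // Returns the instance count the agent should request next.
    public int TargetInstances(double averageCpu, int current,
                               DateTime lastScaleUtc, DateTime nowUtc)
    {
        // Give the previous change time to take effect before acting again.
        if (nowUtc - lastScaleUtc < CoolDown)
            return current;

        int target = current;
        if (averageCpu > UpperThreshold)
        {
            // Small deployments grow by 40%, larger ones by 20% (rounded up),
            // which gives +2 for 5 instances and +10 for 50 instances.
            int percent = current <= 10 ? 40 : 20;
            int step = (current * percent + 99) / 100;
            target = current + Math.Max(1, step);
        }
        else if (averageCpu < LowerThreshold)
        {
            // Scale down cautiously, one instance at a time.
            target = current - 1;
        }

        // Apply the hard limits last so that no rule can bypass them.
        return Math.Min(MaxInstances, Math.Max(MinInstances, target));
    }
}
```

Predictive rules, such as extrapolating a steady climb in utilisation, would sit on top of this and are deliberately left out to keep the sketch short.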
TRUST: AUTHORISING FOR SCALE
There is a rich management API that can be used to control your Windows Azure projects. However,
in order to issue commands there needs to be trust between the Scale Agent and the API of the
account hosting the roles. This trust is established via X.509 certificates.
Generating certificates is also well documented. Once created, we need to provide our certificate in
3 places:
• The Windows Azure Account, for the Service Management API to check requests against
• The virtual machine issuing commands (in our case, where the Scale Agent is hosted)
• The service configuration and definition for our Scale Agent project
In the Windows Azure portal for the account you wish to manage, there is an ‘Account’ tab where
you can upload DER encoded certificates with a .CER extension:
You must also upload the certificate in the Personal Information Exchange format with a .PFX
extension and the matching password to your service project so that the certificate becomes
available to any virtual machine instance provisioned from that entire project. This can be found
under the Certificates section of your service deployment:
Click on ‘Manage’ and upload the .PFX version of your certificate. It is important to note that this is
not installing the certificate to the role instances under this service. Instead it is making the
certificate available to any role that requests it. To make that request we have to complete the third
step and tell our Scale Agent role that it will require that certificate.
While it is possible to enter the required XML manually, it is much easier to use the property pages
instead. For the role that needs the certificate (i.e. your Scale Agent role), find it in your Cloud
Service project, right-click and select Properties. In the property pages, find the Certificates tab
on the left.
Select ‘Add Certificate’ from the top and enter the details. The important part here is finding your
certificate under the right Store Location and Name. This screen presumes the certificate is installed
locally as it uses local machine stores to search for it. If you don’t have it installed locally, you can
just paste in the thumbprint manually.
That wraps up all 3 parts of the certificate process. When your role is deployed to Windows Azure, it
will ask for the certificate with that thumbprint to be installed into the virtual machine.
SCALING – THE SERVICE MANAGEMENT API
We know we need to scale and we have established trust. All we need to do now is issue the
command: scale!
All API calls are RESTful, but there is no API that exists solely for scaling up and down. Instead this is
done through the service configuration file, which is maintained separately from the service
deployment. You can at any time go and change the configuration for your deployment through the
portal, and the API is just an extension of this functionality.
The steps required are:
1. Request the configuration file for a service deployment
2. Find the XML element for the instance count on the role you are scaling
3. Make the change
4. Post the configuration file back to the service API
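Steps 2 and 3 above amount to a small XML edit. The following sketch shows one way to do it with LINQ to XML, assuming the configuration file has already been retrieved as a string; the ServiceConfigEditor class and its method name are hypothetical, and the HTTP calls for requesting and posting the file (steps 1 and 4) are deliberately left out.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper covering only the XML edit, not the REST calls.
public static class ServiceConfigEditor
{
    // XML namespace used by ServiceConfiguration.cscfg files.
    private static readonly XNamespace Ns =
        "http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration";

    // Returns a copy of the configuration XML with the instance count
    // of the named role changed to newCount.
    public static string SetInstanceCount(string configXml, string roleName, int newCount)
    {
        if (newCount < 1)
            throw new ArgumentOutOfRangeException("newCount");

        XDocument doc = XDocument.Parse(configXml);

        // Find the <Role name="..."> element for the role being scaled.
        XElement role = doc.Root
            .Elements(Ns + "Role")
            .FirstOrDefault(r => (string)r.Attribute("name") == roleName);
        if (role == null)
            throw new ArgumentException("Role not found: " + roleName, "roleName");

        XElement instances = role.Element(Ns + "Instances");
        if (instances == null)
            throw new InvalidOperationException("Role has no Instances element.");

        instances.SetAttributeValue("count", newCount);
        return doc.ToString();
    }
}
```

The string returned here is what the Scale Agent would send back in step 4, in whatever encoding the Service Management API expects.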
If you don’t want to manually manipulate the REST API yourself, Microsoft has posted code samples
to assist you, including samples on scale [12] and the Service Management API [13].
[12] http://code.msdn.microsoft.com/azurescale
[13] http://code.msdn.microsoft.com/windowsazuresamples
SUMMARY
This short article provides you with the theory to scale your applications reactively. Scheduled
scale up/down can be automated with the same technique, so that instead of only scaling
reactively you can also scale proactively.
While this article has presented just one way of scaling automatically, there are other derivatives
and approaches you could follow. For example, the Scale Agent could pull diagnostic information
from the roles via the Diagnostic Manager classes, rather than the roles pushing that information.
The open source framework Lokad.Cloud [14] takes another approach by allowing roles to auto-scale
themselves. Find the approach that’s right for you and capitalise on economies of scale today!
[14] http://code.google.com/p/lokad-cloud/
BUILDING A CONTENT-BASED ROUTER SERVICE ON WINDOWS AZURE
By Josh Tucholski
Some applications, depending on their nature, require priority processing based on request content.
It is typical in these scenarios to develop an application layer to route requests from the client to a
specific business component for further processing. Implementing this in Windows Azure is not
straightforward due to its built-in load balancer. The Windows Azure load balancer only exposes a
single external endpoint that clients interact with; therefore it is necessary to know the unique IP
address of the instance that will be performing the work. IP addresses are discoverable via the
Windows Azure API when marked as internal (configured through the web role’s properties).
While this tutorial may seem more of an exercise on WCF than on Windows Azure, it is important to
understand how to perform inter-role communication without the use of queues.
In order to filter requests by content, an internal LoadBalancer class is created. This class ensures
requests are routed to live endpoints and not dead nodes. The LoadBalancer will need to account for
endpoint failure and guarantee graceful recovery by refreshing its routing table and passing requests
to other nodes capable of processing. Following is the class definition for the LoadBalancer to detect
endpoints and recover from unexpected failures that occur.
public class LoadBalancer
{
    public LoadBalancer()
    {
        if (IsRoutingTableOutOfDate())
        {
            RefreshRoutingTable();
        }
    }

    private bool IsRoutingTableOutOfDate()
    {
        // Retrieve all of the instances of the Worker Role
        var roleInstances = RoleEnvironment.Roles["WorkerName"].Instances;
        // Check the current number of instances against the LoadBalancer's record
        if (roleInstances.Count() != CurrentRouters.Count())
        {
            return true;
        }
        foreach (RoleInstance roleInstance in roleInstances)
        {
            var endpoint = roleInstance.InstanceEndpoints["WorkerEndpoint"];
            var ipAddress = endpoint.IPEndpoint;
            if (!IsEndpointRegistered(ipAddress))
            {
                return true;
            }
        }
        return false;
    }

    private void RefreshRoutingTable()
    {
        var currentInstances = RoleEnvironment.Roles["WorkerName"].Instances;
        RemoveStaleEndpoints(currentInstances);
        AddMissingEndpoints(currentInstances);
    }

    private void AddMissingEndpoints(ReadOnlyCollection<RoleInstance> currentInstances)
    {
        foreach (var instance in currentInstances)
        {
            if (!IsEndpointRegistered(instance.InstanceEndpoints["WorkerEndpoint"].IPEndpoint))
            {
                // add to the collection of endpoints the LoadBalancer is aware of
            }
        }
    }

    private void RemoveStaleEndpoints(ReadOnlyCollection<RoleInstance> currentInstances)
    {
        // reverse-loop so we can remove from the collection as we iterate
        for (int index = CurrentRouters.Count() - 1; index >= 0; index--)
        {
            bool found = false;
            foreach (var instance in currentInstances)
            {
                // if the instance's IP address matches the router at this index, set found to true
            }
            if (!found)
            {
                // remove from the collection of endpoints the LoadBalancer is aware of
            }
        }
    }

    // IPEndPoint is the System.Net type returned by RoleInstanceEndpoint.IPEndpoint
    private bool IsEndpointRegistered(IPEndPoint ipEndpoint)
    {
        foreach (var routerEndpoint in CurrentRouters)
        {
            if (routerEndpoint.IpAddress == ipEndpoint.ToString())
            {
                return true;
            }
        }
        return false;
    }

    public string GetWorkerIPAddressForContent(string contentId)
    {
        // Custom logic to determine an IP address from one of the CurrentRouters
        // that the load balancer is aware of
    }