These are the slides from my talk about the AppScale project at the SBonRails meetup. They cover AppScale as well as Google App Engine and the research projects that have come out of it, including Neptune, a Ruby DSL focused on computation-heavy workloads.
3. Overview
• Google App Engine
• AppScale - now with 50% Ruby!
• Research Directions
• Neptune - A Ruby DSL for the cloud
Thursday, March 10, 2011
4. Google App Engine
• A web framework introduced in 2008
• Python and Java supported
• Offers a Platform-as-a-Service: use Google’s APIs to achieve scale
• Upload your app to Google
6. Data Model
• Not relational - semi-structured schema
• Compare to models in Rails
• Exposes a get / put / delete / query interface
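A minimal Ruby sketch of what a get / put / delete / query interface over semi-structured entities can look like. The `SketchDatastore` class, its method signatures, and the sample entities are all hypothetical and kept in memory; this is not the App Engine Datastore API, only an illustration of the shape of the interface.

```ruby
# Hypothetical in-memory stand-in for a get / put / delete / query datastore.
# Entities are plain hashes, so two entities of the same kind do not need to
# share the same set of properties (semi-structured, unlike a relational table).
class SketchDatastore
  def initialize
    @entities = {} # key => { kind: ..., properties: ... }
  end

  # Stores (or overwrites) the entity under the given key.
  def put(kind, key, properties)
    @entities[key] = { kind: kind, properties: properties.dup }
    key
  end

  # Returns the entity's properties, or nil if the key is unknown.
  def get(key)
    entity = @entities[key]
    entity && entity[:properties]
  end

  # Returns true if an entity was removed, false otherwise.
  def delete(key)
    !@entities.delete(key).nil?
  end

  # Returns the properties of every entity of the given kind whose
  # properties match all of the supplied equality filters.
  def query(kind, filters = {})
    @entities.values
             .select { |entity| entity[:kind] == kind }
             .map { |entity| entity[:properties] }
             .select { |props| filters.all? { |name, value| props[name] == value } }
  end
end

store = SketchDatastore.new
store.put("Talk", "talk-1", { title: "AppScale", venue: "SBonRails" })
store.put("Talk", "talk-2", { title: "Neptune" }) # no :venue property at all
store.query("Talk", { venue: "SBonRails" })       # matches only talk-1
```

Compared with a Rails model, there is no migration or fixed column list here: each entity simply carries whatever properties it was stored with.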
7. Storing Data
• Datastore API - Persistent storage
• Memcache API - Transient storage
• User can set expiration times
• Blobstore API - Store large files
• need to enable billing to use it
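To make the "transient storage with expiration times" idea concrete, here is a small Ruby sketch of a cache whose entries expire. `SketchCache` and its injectable clock are assumptions made for illustration; the real Memcache API has its own calls and semantics (for example, it may also evict entries under memory pressure).

```ruby
# Hypothetical transient cache: values disappear once their expiration
# time has passed. The clock is injectable so the behavior can be shown
# without actually waiting.
class SketchCache
  def initialize(clock = -> { Time.now.to_f })
    @clock = clock
    @items = {} # key => [value, expires_at or nil]
  end

  # Stores a value; ttl_seconds of nil means "no explicit expiration".
  def set(key, value, ttl_seconds = nil)
    expires_at = ttl_seconds && (@clock.call + ttl_seconds)
    @items[key] = [value, expires_at]
    true
  end

  # Returns the cached value, or nil if it is missing or has expired.
  def get(key)
    return nil unless @items.key?(key)

    value, expires_at = @items[key]
    if expires_at && @clock.call >= expires_at
      @items.delete(key)
      return nil
    end
    value
  end
end

now = 0.0
cache = SketchCache.new(-> { now })
cache.set("session", "alice", 30)
cache.get("session") # still cached
now = 31.0
cache.get("session") # expired, so nil
```

Because cached values can vanish, callers should always be prepared to fall back to the Datastore.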
8. Be Social!
• Mail API - Send and receive e-mail
• XMPP API - Send and receive IMs
• Channel API - Create persistent connections via XMPP
• Use for chat rooms, games, etc.
9. Background Tasks
• Cron API - Access a URL periodically
• Descriptive language: “every 5 minutes”, “every 1st Sun of Jan, Mar, Dec”, etc.
• Uses a separate cron.yaml file
• Taskqueue API - Within your app, fire off tasks to be done later
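As an illustration, a cron.yaml file along these lines schedules two periodic URL fetches. The descriptions and URLs are made up for the example; the handlers behind them would be ordinary request handlers in your app.

```yaml
cron:
- description: refresh cached feed
  url: /tasks/refresh
  schedule: every 5 minutes
- description: weekly report
  url: /tasks/report
  schedule: every monday 09:00
```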
10. Dealing with Users
• Users API: Uses Google Accounts
• Don’t write that ‘forgot password’ page ever again!
• Authorization via app.yaml:
• anyone, must log in, or admin only
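A sketch of how those three access levels can be expressed in the handlers section of app.yaml. The URL patterns and the `main.py` script name are placeholders chosen for the example.

```yaml
handlers:
- url: /admin/.*
  script: main.py
  login: admin       # admin only
- url: /profile
  script: main.py
  login: required    # must log in
- url: /.*
  script: main.py    # no login setting: anyone
```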
11. When Services Fail
• Originally: failures throw exceptions
• Just catch them all!
• Capabilities API: Check if a service is available
• Datastore, Memcache, and so on
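The two approaches can be combined: check availability up front, and still catch the failure if the service goes away mid-request. The Ruby sketch below uses a hypothetical `SketchCapabilities` class and `ServiceUnavailable` exception; neither is the real Capabilities API, they only show the control flow.

```ruby
# Hypothetical error raised when a backing service fails mid-request.
class ServiceUnavailable < StandardError; end

# Hypothetical capabilities check: reports whether a named service is
# currently enabled. Unknown services are treated as unavailable.
class SketchCapabilities
  def initialize(enabled)
    @enabled = enabled
  end

  def enabled?(service)
    @enabled.fetch(service, false)
  end
end

# Asks first, then still guards against a failure during the call itself.
# `datastore` is any callable that takes a key and returns a value.
def read_greeting(capabilities, datastore)
  return "Datastore is down for maintenance" unless capabilities.enabled?(:datastore)

  datastore.call("greeting")
rescue ServiceUnavailable
  "Datastore failed mid-request"
end
```

Checking first lets the app degrade gracefully (for example, serve a read-only page) instead of discovering the outage through an exception.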
12. Deploying Your App
• Develop locally on SDK
• Stub implementations of most APIs
• Then deploy to Google
13. How to Scale
• Limitations on the programming model:
• No filesystem interaction
• 30 second limit per web request
• Language libraries must be on whitelist
• Sandboxed execution
14. Enter AppScale
• App Engine is easy to use
• but we really want to tinker with the internals!
• Need an open platform to experiment on
• test API implementations
• add new APIs
15. Enter AppScale
• Lots of NoSQL DBs out there
• Hard to compare DBs
• Configuration and deployment can be complex
• Need one-button deployment
16. Storing Data
• Datastore API - AppServers send requests to PBServer, a database-agnostic layer
• Named for data format: Protocol Buffers
• Memcache API - memcached
• Blobstore API - Custom server
17. Be Social!
• Mail API - sendmail (disabled by default)
• XMPP API - ejabberd
• Channel API - strophejs
18. Background Tasks
• Cron API - Uses Vixie Cron
• Taskqueue - Separate thread fetches web page
• Both make a single attempt
• Will replace with distributed, fault-tolerant versions
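The "separate thread, single attempt" behavior can be sketched in Ruby as follows. `SingleAttemptTaskQueue` is a hypothetical class written for this example, not AppScale's code; the fetcher is passed in as a block so the sketch runs without any network access.

```ruby
# Hypothetical task queue: one background thread takes queued URLs and
# hands each to a fetcher exactly once. A failed fetch is recorded but
# never retried, which is the limitation noted above.
class SingleAttemptTaskQueue
  def initialize(&fetcher)
    @fetcher = fetcher
    @tasks = Queue.new
    @results = Queue.new
    @worker = Thread.new { run }
  end

  # Enqueues a URL to be fetched later by the background thread.
  def add(url)
    @tasks << url
  end

  # Stops the worker after the queued tasks are processed and returns
  # [url, status] pairs in processing order.
  def shutdown
    @tasks << :stop
    @worker.join
    results = []
    results << @results.pop until @results.empty?
    results
  end

  private

  def run
    loop do
      url = @tasks.pop
      break if url == :stop

      begin
        @fetcher.call(url)
        @results << [url, :ok]
      rescue StandardError
        @results << [url, :failed] # single attempt: no retry
      end
    end
  end
end
```

A fault-tolerant replacement would need, at minimum, retries and a queue that survives the failure of the node holding it.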
19. Dealing with Users
• Users API: Defers users to AppLoadBalancer
• Password reset via command-line tools
• Authorization: no major changes here
20. Deploying Your App
• Develop locally on SDK
• Stub implementations of most APIs
• Then deploy to AppScale!
• Use your own cluster or via Amazon
• Command-line tools mirror Amazon’s
21. Deploying Your App
• run-instances: Start AppScale
• describe-instances: View cloud metadata
• upload-app: Deploy an App Engine app
• remove-app: Un-deploy an App Engine app
• terminate-instances: Stop AppScale
22. Deployment Models
• Cloud deployment: Amazon EC2 or Eucalyptus (the open source implementation of the EC2 APIs)
• Just specify how many machines you need
• Non-cloud deployment via Xen or KVM
24. AppController
• The brains of the outfit
• Runs on every node
• Handles configuration and deployment of all services (including other AppControllers)
• Written in Ruby
25. Load balancer
• Routes users to their app via nginx
• haproxy makes sure app servers are live
• Can’t assume the user has DNS:
• Thus we wrote the AppLoadBalancer
• Rails app that routes users to apps
• Performs authentication as well
27. App Server
• We modified the App Engine SDK
• Easier for Python (source included)
• Harder for Java (had to decompile)
• Removed non-scalable API implementations
• Goal: Use open source whenever possible
29. Database Options
• Open source / open APIs / proprietary
• Master / slave vs. peer-to-peer
• Differences in query languages
• Data model (key/val, semi-structured)
• In-memory or persistent
• Data consistency model
• Interfaces - REST / Thrift / libraries
30. In AppScale:
• BigTable clones:
• Master / slave relationship
• Master stores metadata
• Slaves store data
• Fault-tolerant to slave failure
• Partially tolerant to master failure
31. In AppScale:
• Variably consistent DBs
• Voldemort and Cassandra
• Both are peer-to-peer: no SPOF
• Voldemort: Specify consistency per table
• Cassandra: Specify consistency per request
32. In AppScale:
• Relational:
• Not NoSQL but used like NoSQL
• Document-oriented:
• Targets append-heavy workloads
33. In AppScale:
• Key-value datastores:
• MemcacheDB: like memcached but persistent and replicated
• Scalaris: in-memory, no persistence
• SimpleDB: semi-structured but used as key-value (will update this in the future)
34. Research Ideas
• Placement support
• Monitoring
• Shared memory
• Cost modeling
• Hybrid cloud
• Active Cloud DB
• Disaster Recovery
• Neptune
37. Shared memory
• Since AppServer + DB are co-located, reduce message overhead
• no serialization
• Leverage CoLoRs to do so across languages
• AS is in Python or Java, DBS is Python
• Can be orders-of-magnitude faster
38. Cost modeling
• Can we reproduce Google’s cost model?
• We can reproduce memory, network bandwidth in / out, size and types of data
• Can’t reproduce CPU - it’s based on Google’s load, which we can’t capture
• varies based on placement and time of day
40. Database Agnostic Transactions
• Want to support disparate DBs with ACID
• Leverage ZooKeeper for versioning
• And PBServer as the DB-agnostic layer
• Needs strong consistency from DB itself
• And row-level atomicity on updates
41. Active Cloud DB
• Need a common interface to DBs
• But not just for Java / Python
• Named after Rails’ ActiveRecord
• Exposes REST interface for DB
• Included in AppScale 1.3
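To show why a REST interface opens the database up to languages beyond Java and Python, here is a Ruby sketch that maps get / put / delete onto HTTP verbs. The base URL, the `/resources/<key>` path layout, and the JSON body are assumptions made for the example and are not taken from Active Cloud DB itself. The requests are only constructed, not sent.

```ruby
require "net/http"
require "uri"
require "json"
require "erb"

# Hypothetical endpoint; substitute the address of a real deployment.
SKETCH_BASE_URL = "http://appscale.example.invalid/resources"

# Builds (but does not send) the HTTP request for a datastore operation.
def rest_request(operation, key, value = nil)
  uri = URI("#{SKETCH_BASE_URL}/#{ERB::Util.url_encode(key)}")
  case operation
  when :get
    Net::HTTP::Get.new(uri)
  when :delete
    Net::HTTP::Delete.new(uri)
  when :put
    request = Net::HTTP::Put.new(uri)
    request["Content-Type"] = "application/json"
    request.body = JSON.generate({ "value" => value })
    request
  else
    raise ArgumentError, "unknown operation: #{operation}"
  end
end

request = rest_request(:put, "talk-1", "AppScale")
puts "#{request.method} #{request.path}"
```

Any language with an HTTP client can issue the same three requests, which is the point of putting REST in front of the database.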
42. Disaster Recovery
• People are using App Engine as a production-level environment
• Need a way to automatically back up data
• Can leverage this data for data analytics
• Need to also seamlessly switch to AppScale version if App Engine version goes down
43. Neptune
• Need a simple way to run compute-intensive jobs
• We have the code from the ‘net
• We have the resources - the cloud
• But the average user does not have the know-how
• Our solution: create a domain-specific language for configuring cloud apps
• Based on Ruby
46. Extensibility
• Experts can add support for other computational jobs
• Biochemists can run simulations via DFSP and dwSSA
• Embarrassingly parallel Monte Carlo simulations
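"Embarrassingly parallel" means each batch of trials is independent, so batches can be farmed out and the results simply added up. The Ruby sketch below estimates pi this way; it is a generic stand-in, not DFSP or dwSSA, and the function names and seeds are made up for the example. Threads stand in for separate machines (in MRI they show the structure rather than a real CPU speedup).

```ruby
# Runs one independent batch of trials: counts random points in the unit
# square that land inside the quarter circle of radius 1.
def count_hits(trials, seed)
  rng = Random.new(seed)
  trials.times.count do
    x = rng.rand
    y = rng.rand
    x * x + y * y <= 1.0
  end
end

# Splits the work into independent batches, runs each in its own thread,
# and combines the counts. No batch needs to talk to any other batch.
def estimate_pi(workers:, trials_per_worker:)
  threads = Array.new(workers) do |i|
    Thread.new { count_hits(trials_per_worker, i + 1) } # distinct seed per batch
  end
  hits = threads.sum(&:value)
  4.0 * hits / (workers * trials_per_worker)
end

puts estimate_pi(workers: 4, trials_per_worker: 50_000)
```

Because the only coordination is the final sum, adding more workers requires no change to the simulation code itself.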
47. Compiling Code
• You may not have the binaries, so compile from source!
• Auto-generates makefiles for beginners
neptune :type => "compile",
        :code => "/home/appscale/mpi_nqueens"
48. Installing Neptune
• Just use good old ‘gem’:
• gem install neptune
• Current version is 0.0.4, fully compatible with AppScale 1.5
• More info at our web page:
• http://neptune-lang.org
49. Wrapping It Up
• Thanks to the AppScale team, especially:
• Co-lead Navraj Chohan and advisor Professor Chandra Krintz
• Check us out on the web:
• http://appscale.cs.ucsb.edu
• http://code.google.com/p/appscale