The document summarizes an App Engine update presentation given by David Chandler, a Google Developer Advocate. The presentation covered new App Engine features including improved SLAs, paid support options, security audits, backends, pull queues, the High Replication Datastore, query planner improvements, and XG transactions. It also provided examples of App Engine customers and common app types, and tips for optimizing performance including using memcache and content caching.
Google App Engine Update: What's New and How to Optimize Performance
1. Google App Engine
…is an expression of how Google
thinks about infrastructure.
DevNexus App Engine Update
Mar 21, 2012
David Chandler
Developer Advocate
drfibonacci@google.com
Friday, March 23, 2012
2. Agenda
App Engine intro
‣ 60 second demo
What's new in App Engine?
‣ SLA, premier accounts, security audits
‣ Backends, pull queues
‣ HRD, query planner, XG transactions
‣ Cloud SQL, Files API
Load testing recipe
Minimize response time, cost
Case study (Mike Lawrence)
Wrap up / Questions?
3. Cloud in a box
Easy to build
‣ Local SDK - Java, Python, Go
‣ Lots of APIs
Easy to maintain
‣ Admin Console
‣ Site Reliability Engineers
‣ No clunky knobs/dials
Easy to scale
‣ Google scale infrastructure
‣ No limit on how much data you can put in the datastore
16. Chillingo Crystal
Gaming meets Social
Games shown: Zombie Dash, Angry Birds LITE, Underground, Meltdown, Cogs, Mission Deep Sea, Speed Forge Extreme, Guerilla Bob, Ravensword: The Fallen King, Angry Birds
18. Feature overview
Compute
‣ Frontends
‣ Backends
‣ Task Queues
‣ Cron
Network
‣ URL Fetch
‣ XMPP
‣ Channel API
‣ Mail API
Storage
‣ Datastore
‣ Memcache
‣ Namespaces
‣ Blobstore
‣ Cloud SQL
‣ Static content
Services
‣ Images API
‣ App Identity
‣ Users API
‣ MapReduce API
‣ Pipeline API
‣ Prospective Search API
20. URL Fetch Network
HTTP/1.1
‣ GET, POST, PUT, HEAD, and DELETE
Maximum request deadlines
‣ 60s (default 5s): user facing (online) requests
‣ 10 minutes (new): cron, task queue (offline) requests
Execution options
‣ Synchronous
‣ Async
21. Talking to the outside world Network
XMPP / Jabber
‣ Send/receive messages
‣ Send chat invites
‣ Request presence, status
‣ Set status
Channel API
‣ Server push
Mail API
‣ Send and receive
Use cases
‣ Chat bot
‣ Realtime notifications
‣ Chat user interface
23. Prospective search Services
Register your queries
‣ Define interesting search criteria
Create your data
‣ Users and systems interact with your app
Receive callbacks
‣ App Engine tells you about data which matches searches
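The register / create / callback flow above inverts an ordinary search: the queries are stored first, and each new piece of data is tested against them. A toy sketch of that idea in plain Python follows. It does not use the Prospective Search API; the `ToyMatcher` class, its method names, and the predicate-based subscriptions are all hypothetical, chosen only for illustration.

```python
class ToyMatcher:
    """Toy illustration of prospective search: store queries, then match new data.

    Hypothetical stand-in, not the App Engine Prospective Search API.
    """

    def __init__(self):
        self._subscriptions = {}

    def subscribe(self, sub_id, predicate):
        # Register a query up front; here a query is just a Python predicate.
        self._subscriptions[sub_id] = predicate

    def match(self, document):
        # Return the ids of every registered query that the new document satisfies.
        return sorted(
            sub_id
            for sub_id, predicate in self._subscriptions.items()
            if predicate(document)
        )


matcher = ToyMatcher()
matcher.subscribe("cheap", lambda doc: doc["price"] < 10)
matcher.subscribe("books", lambda doc: doc["category"] == "book")
matched = matcher.match({"price": 5, "category": "book"})
```

In a real service the match result would drive the callback to the application rather than being returned to the caller.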
24. Identity Services
Users API
‣ Require users to login and identify
- Email address
- Unique, stable id
‣ Servlet filter for dynamic requests
App Identity
‣ Who am I?
appIdentityService.getServiceAccountName()
‣ Assert your identity to applications
- GAE handles key security, key rotation for you
- Other Google APIs
appIdentityService.getAccessToken()
- On other systems
appIdentityService.signForApp()
25. MapReduce & Pipeline API Services
code.google.com/p/appengine-pipeline/
26. High Replication Datastore (HRD) Storage
Schemaless
Object datastore
‣ Entities & Entity Groups
Query engine
‣ Standard & custom indexes
Atomic transactions
Features
‣ Highly scalable
- More distributed as more data is stored
‣ Highly reliable
- Synchronous, multi-datacenter replication
‣ Managed
www.google.com/events/io/2011/sessions/more-9s-please-under-the-covers-of-the-high-replication-datastore.html
27. Advanced query planner New! Storage
Datastore never requires an exploding index
‣ code.google.com/appengine/docs/python/datastore/queries.html#Big_Entities_and_Exploding_Indexes
dev_appserver
‣ Understands query planner changes
Zigzag merge-join queries
‣ Will continue scanning for up to 30 sec
‣ No more "needs index" errors (although some queries may time out)
‣ www.google.com/events/io/2009/sessions/BuildingScalableComplexApps.html
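To make the zigzag merge-join idea concrete, here is a minimal sketch over in-memory data. It assumes each index scan is a sorted list of unique entity keys and finds the keys present in all of them by leapfrogging between scans. The function name and the list-based representation are assumptions for illustration; the datastore's real implementation scans index rows and is not shown in the slides.

```python
from bisect import bisect_left


def zigzag_merge_join(*scans):
    """Intersect sorted lists of unique keys by jumping ("zigzagging") between them."""
    if not scans or any(not scan for scan in scans):
        return []
    positions = [0] * len(scans)
    matches = []
    candidate = scans[0][0]
    agreed = 0  # how many consecutive scans contain the current candidate
    i = 0
    while True:
        scan = scans[i]
        # Seek forward in this scan to the first key >= candidate.
        pos = bisect_left(scan, candidate, positions[i])
        if pos == len(scan):
            return matches  # one scan is exhausted, so no further matches exist
        positions[i] = pos
        if scan[pos] == candidate:
            agreed += 1
            if agreed >= len(scans):
                matches.append(candidate)
                # Step past the match and take the next key as the new candidate.
                positions[i] = pos + 1
                if positions[i] == len(scan):
                    return matches
                candidate = scan[positions[i]]
                agreed = 1
        else:
            # This scan skipped ahead: its key becomes the new candidate.
            candidate = scan[pos]
            agreed = 1
        i = (i + 1) % len(scans)


color_red = [1, 3, 5, 7, 9]    # keys matching a first equality filter
size_large = [2, 3, 4, 7, 10]  # keys matching a second equality filter
both = zigzag_merge_join(color_red, size_large)
```

How many seeks the loop needs depends on how the matching keys are interleaved, which is one way to read the slide's remark that scanning may continue for a long time on some data.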
30. Query planner performance - cost trade-offs
Option #1 - Write to more indexes at write time
‣ Entity writes are expensive
‣ Possible queries must be known in advance
- Others will fail with Needs Index error
‣ Predictable / fast query time [for supported queries]
Option #2 - Scan multiple indexes at read time
‣ Entity writes much cheaper
‣ More queries possible with fewer indexes
‣ Ad hoc queries
‣ Query response time depends on shape of data
- Specific cases can be optimized by providing additional indexes
31. Using the new query planner Storage
Upgrade to 1.5.4+
Turn off auto index generation
‣ Python: dev_appserver.py --require_indexes
‣ Java: <datastore-indexes autoGenerate="false">
Remove unnecessary custom index definitions
‣ Python:
- index.yaml
‣ Java:
- WEB-INF/datastore-indexes.xml
- WEB-INF/appengine-generated/datastore-indexes-auto.xml
Test your app
Deploy to production
‣ Judiciously run: appcfg vacuum_indexes
32. XG Transactions New! Storage
In a single transaction…
‣ Access 1..5 entity groups - read and/or write
‣ Does not provide transactional ancestor-less query
‣ Does allow 1..5 separate ancestor queries within transaction
Concurrency exceptions possible
‣ 1st access of each entity group
‣ For reads as well as writes
Availability
‣ Only in HRD apps
‣ App Engine 1.5.5 release
Non-ancestor (transaction-less) queries may see…
‣ Partially committed non-XG transaction (same as before)
‣ Partially committed XG transactions (more likely)
34. Cloud SQL (labs) Storage
New! Cloud SQL support in Google Plugin for Eclipse 2.5
35. Cloud SQL (labs) Storage
Developer console
‣ Easy to use
Fully managed
‣ Site Reliability Engineers (SREs)
High availability
‣ Synchronous replication to multiple data centers
Integrated with Google App Engine
‣ Java: JDBC, Python: DB-API
‣ Use with High Replication Datastore
Migration
‣ MySQL Import / export
36. Google Cloud Storage / Files API Storage
Google Cloud Storage
‣ Buckets in the cloud
‣ REST API
‣ Command line
‣ Web UI tool
‣ Enable in APIs console
Use from App Engine
‣ Files API (new)
‣ No more OAuth hassle
‣ http://code.google.com/appengine/docs/java/googlestorage/overview.html
38. Appstats
“I used to be blind,
but now I can see :-)”
--An early Appstats user
39. Load testing - Guidelines
Test traffic must be representative of real user traffic
Ramp over a period of several minutes
[Figure callouts: "Don't do this!" and "Nice! This is much better"]
40. Load testing - basic recipe
Send steady 5 qps traffic to your site
‣ Uncover basic problems, e.g. entity contention on global data
Monitor for 15-20 minutes and expect to see:
‣ Stable request latency <1000ms (no longer an issue)
‣ Zero quota denials
‣ Low error rate
‣ Task queues are keeping up
‣ No elevated datastore contention
Slowly ramp up; increase traffic by 2x
‣ Uncover next application bottleneck, if any
Repeat until serving desired qps
‣ Plan your next feature
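The recipe above amounts to a doubling ramp with a monitoring hold at each level. A small sketch of that schedule, assuming a hypothetical helper named `ramp_schedule` and a 15-minute hold taken from the low end of the slide's 15-20 minute range:

```python
def ramp_schedule(target_qps, start_qps=5, factor=2, hold_minutes=15):
    """Return (qps, hold_minutes) stages: start low, multiply by `factor`, stop at target."""
    if target_qps <= 0 or start_qps <= 0 or factor <= 1:
        raise ValueError("target_qps and start_qps must be positive and factor must exceed 1")
    stages = []
    qps = start_qps
    while qps < target_qps:
        stages.append((qps, hold_minutes))
        qps *= factor
    # The final stage runs at exactly the desired rate.
    stages.append((target_qps, hold_minutes))
    return stages


stages = ramp_schedule(target_qps=100)
```

Each stage would be held only as long as the checks in the recipe (latency, quota denials, error rate, task queues, contention) stay healthy; the helper only plans the rates and does not generate traffic.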
41. Memcache
[Diagram labels: Client, App Engine Frontend, App Engine Backend; latencies shown: 2 ms and 20 ms]
42. Enable content caching
Free lunch possibility
‣ And your users will be happier too
Use memcache
‣ Only if you care about your users
Static resource files
‣ Do not consume instance hours
HTTP/1.1 caching
Cache-Control: public, s-maxage=...
‣ Respected by browsers, ISPs and App Engine
‣ Saves bandwidth
Replace dynamic content with
‣ Static content
‣ Blobstore (custom HTTP headers)
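The "Use memcache" advice is usually applied as a read-through (cache-aside) pattern: check the cache, fall back to the slow source on a miss, then store the result. The sketch below shows the pattern in plain Python. `InMemoryCache` is a hypothetical stand-in for a memcache client so the example runs anywhere; it is not the App Engine memcache API, and `cached_fetch` and its parameters are likewise illustrative names.

```python
import time


class InMemoryCache:
    """Hypothetical stand-in for a memcache client: values expire after a TTL."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)


def cached_fetch(cache, key, load, ttl_seconds=60):
    """Return the cached value for `key`, calling `load(key)` only on a miss."""
    value = cache.get(key)
    if value is None:
        value = load(key)
        cache.set(key, value, ttl_seconds)
    return value


load_calls = []


def slow_load(key):
    # Stands in for a datastore query or other expensive work.
    load_calls.append(key)
    return key.upper()


cache = InMemoryCache()
first = cached_fetch(cache, "greeting", slow_load)
second = cached_fetch(cache, "greeting", slow_load)
```

Because a miss is signalled by `None`, this sketch never caches a `None` result; a production version would need a sentinel if that matters.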
43. Best practices in client-side code
Avoid this! Clients all checking in at a specific time of day
Spread out load:
- Randomize client check-in times
And this! Clients retry failures after a fixed interval
Avoid a Denial-of-Service (DoS) on your app after a failure:
- Use exponential back-off: 1s, 2s, 4s, 8s, 16s, …
- Use a fuzz factor: randomize retry times
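Both client-side recommendations can be sketched in a few lines. The example below assumes a 1-second base delay (matching the 1s, 2s, 4s, … sequence on the slide) and a fuzz factor expressed as a fraction of the nominal delay; the function names, the default fuzz of 0.5, and the one-hour check-in window are illustrative assumptions, not values from the talk.

```python
import random


def backoff_delays(max_retries=5, base_seconds=1.0, fuzz=0.5, rng=random):
    """Yield retry delays that double each attempt, scaled by a random fuzz factor.

    Each delay is base_seconds * 2**attempt multiplied by a factor drawn
    uniformly from [1 - fuzz, 1 + fuzz], so clients do not retry in lockstep.
    """
    for attempt in range(max_retries):
        nominal = base_seconds * (2 ** attempt)
        yield nominal * rng.uniform(1.0 - fuzz, 1.0 + fuzz)


def checkin_offset_seconds(window_seconds=3600, rng=random):
    """Pick a random offset so clients spread their check-ins across a window."""
    return rng.uniform(0, window_seconds)


# With fuzz disabled the delays are exactly the doubling sequence.
plain = list(backoff_delays(max_retries=5, fuzz=0.0))
fuzzed = list(backoff_delays(max_retries=5, fuzz=0.5))
```

A client would sleep for each yielded delay between attempts and give up once the generator is exhausted.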
44. Best practices in server-side code
Reduce instance hours
‣ Reduce request latency
‣ Use Memcache
‣ Datastore/Memcache batch get/put
‣ Async URL Fetch, Datastore, Memcache
- Parallelize RPCs
‣ Enable HTTP session async writes
Spin up fewer instances
‣ Task queue rate and max_concurrent_requests
‣ X-AppEngine-FailFast
code.google.com/appengine/articles/managing-resources.html
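"Parallelize RPCs" means issuing independent calls together so the request waits for the slowest one rather than the sum of all of them. The sketch below illustrates the effect with Python's standard-library thread pool. It is a generic stand-in, not the App Engine async URL Fetch, Datastore, or Memcache calls named on the slide; `fetch_all` and `max_workers` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetch, keys, max_workers=8):
    """Call fetch(key) for every key concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `keys` even when calls finish out of order.
        return list(pool.map(fetch, keys))


def fake_rpc(key):
    # Stands in for a remote call such as a URL fetch or a datastore get.
    return "value-for-%s" % key


results = fetch_all(fake_rpc, ["a", "b", "c"])
```

With real remote calls that each take tens of milliseconds, the wall-clock time approaches that of the slowest single call, which is the latency reduction the slide is after.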
45. Best practices in server-side code
Enable concurrent requests
‣ Java appengine-web.xml
‣ Python 2.7
Discounted reserved instance hours
‣ $0.08/hr → $0.05/hr
Scheduler knobs
‣ Max Idle Instances
‣ Min Pending Latency
46. Manage storage costs
Discard stuff you no longer need
‣ Entities
‣ Indexes
‣ Blobs
‣ Tasks
Drop unwanted indexes
‣ Vacuum custom indexes
‣ New query planner
Explicitly mark properties as unindexed
‣ To migrate existing entities, put() again with:
Python: foo = db.StringProperty(indexed=False)
Java: entity.setUnindexedProperty("foo", "bar")
47. Manage 'datastore ops' costs
Number of write ops in SDK
Fewer indexes & indexed properties → fewer write ops
Take advantage of query planner improvements
Replace queries with
‣ keys-only queries (cheaper)
‣ fetch-by-key (optimal)
‣ datastore cursors (for pagination)
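A rough worked example of why fewer indexed properties mean fewer write ops. The formula below is an assumption that does not appear in the slides: it follows the billing model as documented around the time of this talk, in which putting a new entity cost 2 writes, plus 2 writes per indexed property value, plus 1 write per composite index value. Check the current billing documentation before relying on the numbers.

```python
def new_entity_write_ops(indexed_property_values, composite_index_values=0):
    """Estimate write ops for putting one new entity.

    Assumed cost model (not from the slides): 2 writes for the entity,
    2 per indexed property value, 1 per composite index value.
    """
    return 2 + 2 * indexed_property_values + composite_index_values


all_indexed = new_entity_write_ops(indexed_property_values=10, composite_index_values=2)
mostly_unindexed = new_entity_write_ops(indexed_property_values=3)
```

Under that assumption, marking seven of ten properties as unindexed and dropping two composite indexes cuts the estimate from 24 write ops to 8 per new entity.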
48. Enable warming requests
Request logs
/_ah/warmup
Default behavior
‣ Enabled for Java apps / Disabled for Python apps
‣ All jar files are indexed in memory
‣ Initializes your application and filters
- Servlets marked in <load-on-startup>
- ServletContextListener filters
Custom warmup servlet
‣ Override built-in _ah_warmup servlet
<servlet>
<servlet-name>_ah_warmup</servlet-name>
<servlet-class>foo.MyWarmupServlet</servlet-class>
</servlet>
49. Only upload what you need
Python
static_files, static_dir
skip_files
Java
<static-files>
<resource-files>
Diagnostics
appcfg.sh --retain_upload_dir update …