Caching up is hard to do: Improving your Web Services' Performance
1. Caching Up is Hard To Do
Chad McCallum
ASP.NET MVP
iQmetrix Software
www.rtigger.com - @ChadEmm
Improving the performance of your web services
2. Here’s your API
Client → The Internet → Web Server → Database
• Client makes a request
• Bounces around the internet to web server
• Web server runs application, queries DB
• Database returns data from query
• Application serializes response and writes to client
• Bounces back through internet to client
• Client receives data
3. An example request
Client → The Internet → Web Server → Database
• Client makes a request, bounces around the internet to web server: 270ms
• Web server runs application, queries DB; database returns data from query; application serializes response and writes to client: 598ms
• Bounces back through internet to client, client receives data: 3474ms
• Total: 4342ms
• Response size: 528kb
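As a quick check on where the time goes, the three timings above can be summed and compared. This is only a sketch of the arithmetic: the stage names are assumed labels, not the slide's own.

```python
# Timings from the example request, in milliseconds.
# The stage names are assumed labels, not taken from the slide.
stages = {
    "request_to_server": 270,
    "application": 598,
    "response_to_client": 3474,
}

total_ms = sum(stages.values())  # 270 + 598 + 3474 = 4342


def share(stage: str) -> float:
    """Fraction of the total round trip spent in one stage."""
    return stages[stage] / total_ms


for name, ms in stages.items():
    print(f"{name}: {ms}ms ({share(name):.0%})")
```

Sending the response back accounts for roughly 80% of the total, which is why the later slides on sending less data show the largest gains.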
4. Start at the data
• Optimize the database server
• Logical Design – efficient queries, application-specific schema, constraints,
normalization
• Physical Design – indexes, table & system settings, partitions, denormalization
• Hardware – SSDs, RAM, CPU, Network
5. Between your App and the Data
• Reduce the complexity of your calls – get only the data you need
• Reduce the number of calls – return all the required data in one query
• Make calls async – perform multiple queries at the same time*
• The fastest query is the one you never make
• Cache the result of common queries in an application-level or shared cache
Application Time: 598ms → 315ms (47% less)
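The "make calls async" point can be sketched as follows. This is a minimal illustration, not the presenter's code: `fetch_products` and `fetch_order_count` are hypothetical queries, and `asyncio.sleep` stands in for database latency.

```python
import asyncio
import time


async def fetch_products() -> list[str]:
    # Hypothetical query; asyncio.sleep stands in for database latency.
    await asyncio.sleep(0.05)
    return ["widget", "gadget"]


async def fetch_order_count() -> int:
    # Second hypothetical query, independent of the first.
    await asyncio.sleep(0.05)
    return 3


async def load_page_data() -> tuple[list[str], int]:
    # Both queries are in flight at once, so the wait is roughly the
    # slower of the two rather than their sum.
    products, order_count = await asyncio.gather(
        fetch_products(), fetch_order_count()
    )
    return products, order_count


if __name__ == "__main__":
    start = time.perf_counter()
    print(asyncio.run(load_page_data()))
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

The asterisk on that bullet still applies: under high traffic, running several queries per request at once multiplies the load on the database.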
6. Caching Data
• Great for static or relatively unchanged data
• Product Catalogs
• Order History
• Not so great for volatile data
• Store Quantity
• Messages
• Comes with a memory price
• Shared Cache when working with a web farm
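A minimal sketch of an application-level cache for query results, assuming a fixed time-to-live is acceptable for the data. `TtlCache` and `load_catalog` are hypothetical names for illustration; in a web farm, a shared cache would take the place of the in-process dictionary.

```python
import time
from typing import Any, Callable


class TtlCache:
    """In-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the query entirely
        value = loader()  # cache miss or expired: run the real query
        self._entries[key] = (now + self._ttl, value)
        return value


calls = 0


def load_catalog() -> list[str]:
    # Hypothetical stand-in for the expensive database query.
    global calls
    calls += 1
    return ["widget", "gadget"]


cache = TtlCache(ttl_seconds=60)
cache.get_or_load("catalog", load_catalog)
cache.get_or_load("catalog", load_catalog)  # served from memory
```

A short time-to-live bounds how stale the cached data can get, at the cost of more trips to the database.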
7. Inside your App
• Standard “MVC” Flow
• Request comes into Web Server over network connection
• Framework parses request URL and other variables to determine which
controller and method to execute, checking against routes
• Framework creates instance of controller class and passes copy of request
object to appropriate method
• Method executes, returning an object to be sent in response
• Framework serializes response object into preferred type as requested by
client
• Web server writes response back to client over network connection
8. Inside your App
• The most we can reasonably do is optimize our controller’s method
• “Reasonably” meaning not doing crazy things to the underlying framework
code / dependencies
• The fastest method is one you don’t execute
• Cache the serialized result of common API calls
Application Time: 598ms → 296ms (51% less)
9. Caching Responses
• Great for endpoints that don’t take parameters
• Get
• Not so great for endpoints that do take parameters
• Get By ID
• Reports with date ranges
• Get with filters
• Cache all supported serialization formats
• Same cache concerns – memory usage, shared cache in farm setup
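Caching serialized output per format might look like the sketch below. It assumes a parameterless endpoint and two supported media types; `get_response` and the serializer names are hypothetical.

```python
import json
from typing import Callable
from xml.etree import ElementTree


def to_json(items: list[str]) -> bytes:
    return json.dumps(items).encode("utf-8")


def to_xml(items: list[str]) -> bytes:
    root = ElementTree.Element("items")
    for item in items:
        ElementTree.SubElement(root, "item").text = item
    return ElementTree.tostring(root)


# Serializers keyed by media type; only these two are supported here.
SERIALIZERS: dict[str, Callable[[list[str]], bytes]] = {
    "application/json": to_json,
    "application/xml": to_xml,
}

# One cached body per (endpoint, media type) pair.
_response_cache: dict[tuple[str, str], bytes] = {}


def get_response(endpoint: str, media_type: str,
                 load: Callable[[], list[str]]) -> bytes:
    key = (endpoint, media_type)
    if key not in _response_cache:
        # Miss: run the controller logic and serialize once.
        _response_cache[key] = SERIALIZERS[media_type](load())
    return _response_cache[key]
```

Because the format is part of the cache key, each format is stored separately, which is where the extra memory cost comes from.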
10. From Server to Client
• We can’t really change the topology of a client’s network connection
• We can send less data
• HTTP Compression
Response Time: 3474ms → 1083ms (69% less)
Response Size: 528kb → 129kb (76% less)
11. HTTP Compression
• Trading response size for server CPU cycles
• Output can be cached (and often is) by web server to avoid re-compressing the same thing
• Client requests compression using Accept-Encoding header
Application Time: 598ms → 624ms (4% more)
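The negotiation can be sketched with the standard `gzip` module. This is a simplified illustration, not what a web server does internally: `accepts_gzip` only handles the common header forms, and `encode_body` is a hypothetical helper.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """Simplified check: True if gzip is listed and not disabled with q=0."""
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() == "gzip":
            return params.replace(" ", "").lower() not in ("q=0", "q=0.0")
    return False


def encode_body(body: bytes,
                accept_encoding: str) -> tuple[bytes, dict[str, str]]:
    if accepts_gzip(accept_encoding):
        # Spend CPU cycles here to send fewer bytes over the network.
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    return body, {}


payload = b'{"name": "widget", "price": 9.99}' * 1000
compressed, headers = encode_body(payload, "gzip, deflate")
print(len(payload), "->", len(compressed), headers)
```

Repetitive JSON like this compresses very well; real responses will vary.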
12. Paging
• Don’t send everything!
• Only returning 20 items
• Page objects using OData Queries in WebAPI
• Returning IEnumerable<T> will page in-memory
• Returning IQueryable<T> will (attempt to) page at the database layer
Response Time: 3474ms → 7ms (99.8% less)
Response Size: 528kb → 10kb (98.1% less)
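The difference between paging in memory and paging at the database layer can be shown outside WebAPI. The sketch below uses Python's built-in `sqlite3` with a made-up `products` table; it illustrates the idea only and says nothing about how OData translates queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(i, f"product-{i}") for i in range(1, 101)],
)


def page_in_memory(page: int, page_size: int = 20) -> list[tuple[int, str]]:
    # Fetches every row, then slices: the database still does all the work.
    rows = conn.execute(
        "SELECT id, name FROM products ORDER BY id"
    ).fetchall()
    start = (page - 1) * page_size
    return rows[start:start + page_size]


def page_in_database(page: int, page_size: int = 20) -> list[tuple[int, str]]:
    # LIMIT/OFFSET lets the database return only the requested page.
    return conn.execute(
        "SELECT id, name FROM products ORDER BY id LIMIT ? OFFSET ?",
        (page_size, (page - 1) * page_size),
    ).fetchall()
```

Both functions return the same rows; only the second avoids reading the whole table for each request.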
13. Conditional Headers
• Server can send an ETag header, a Last-Modified header, or both with the response
• ETag = identifier for a specific version of a resource
• Last-Modified = the last time this resource was modified
• Clients can include that data in subsequent requests
• If-None-Match: “etag value”
• If-Modified-Since: (http date)
• Server can respond with a simple “304 Not Modified” response
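The ETag half of this exchange could be sketched as below. It assumes a content-hash ETag and a simplified comparison that ignores weak validators; `make_etag` and `respond` are hypothetical helpers.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # A strong ETag derived from the content; the quotes are part of the value.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes,
            request_headers: dict[str, str]) -> tuple[int, dict[str, str], bytes]:
    etag = make_etag(body)
    header = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in header.split(",")]
    if "*" in candidates or etag in candidates:
        # The client already has this version: send headers only.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


status, headers, body = respond(b'{"id": 1}', {})
print(status, headers)
status, _, body = respond(b'{"id": 1}', {"If-None-Match": headers["ETag"]})
print(status, len(body))
```

This version still builds the full body to compute the ETag, so it saves bandwidth but not application time.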
14. Conditional Headers
• Avoid database calls to validate requests
• Cache last modified times & etag values
• May have to modify client code to retain and send Last-Modified and ETag values
• Most browsers will automatically include If-Modified-Since, but some do not include If-None-Match
• Non-browser code (SDKs, WebClient, HttpClient)
Response Time: 3474ms → <1ms (99.9% less)
Response Size: 528kb → 0.3kb (99.9% less)
Application Time: 598ms → 323ms (down to 54% of the original)
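Validating a request against cached last-modified times, without touching the database, might look like this sketch. The `last_modified` dictionary is a hypothetical in-memory record; date handling uses the standard `email.utils` helpers.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional

# Hypothetical in-memory record of when each resource last changed,
# updated by whatever code writes to the database.
last_modified: dict[str, datetime] = {
    "/products": datetime(2014, 4, 1, 12, 0, 0, tzinfo=timezone.utc),
}


def last_modified_header(path: str) -> str:
    # HTTP dates are always expressed in GMT.
    return format_datetime(last_modified[path], usegmt=True)


def is_not_modified(path: str, if_modified_since: Optional[str]) -> bool:
    if if_modified_since is None or path not in last_modified:
        return False
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return False  # unparseable date: fall through to a full response
    if since.tzinfo is None:
        return False  # no usable timezone: treat as not validated
    return last_modified[path] <= since
```

When `is_not_modified` returns True, the server can answer with 304 Not Modified straight from memory.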
15. Client-Side Caching
• Most browsers have a local cache – tell your clients to use it!
• Expires header tells client how long it can reuse a response
• Expires: Thu, 03 Apr 2014 03:19:37 GMT
• Cache-Control: max-age=## (where ## is seconds) header does the
same, but applies to more than just the client cache…
• In either case it’s up to the client whether it uses the cache or not
• Most browsers cache aggressively
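Both headers can be produced from a single lifetime value. A minimal sketch, with `caching_headers` as a hypothetical helper and the current time passed in so the result is reproducible:

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(max_age_seconds: int, now: datetime) -> dict[str, str]:
    expires_at = now + timedelta(seconds=max_age_seconds)
    return {
        # Absolute point in time, formatted as an HTTP date.
        "Expires": format_datetime(expires_at, usegmt=True),
        # Relative lifetime in seconds.
        "Cache-Control": f"max-age={max_age_seconds}",
    }


now = datetime(2014, 4, 3, 2, 19, 37, tzinfo=timezone.utc)
print(caching_headers(3600, now))
# Expires is 'Thu, 03 Apr 2014 03:19:37 GMT', one hour after `now`
```

Sending both is common; a client that understands max-age uses it in preference to Expires.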
16. Intermediate Caching
• Cache-Control header specifies who can cache, what they can
cache, and how things can be cached
• Public / Private – whether a response can be reused for all requests, or is
specific to a certain user
• max-age – the longest a response can be cached in seconds (overrides Expires
header)
• must-revalidate – if the response expires, must revalidate it with the server
before using it again
• no-cache – must check with the server first before returning a cached
response
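How an intermediate cache might read these directives is sketched below. The parser is deliberately simplified (it does not handle commas inside quoted values), and the function names are hypothetical.

```python
from typing import Optional


def parse_cache_control(value: str) -> dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: dict[str, Optional[str]] = {}
    for part in value.split(","):
        name, sep, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') if sep else None
    return directives


def shared_cache_may_store(value: str) -> bool:
    d = parse_cache_control(value)
    # "private" limits reuse to a single user's cache. "no-store", which
    # the slide does not cover, forbids storing the response at all.
    return "private" not in d and "no-store" not in d


def freshness_seconds(value: str) -> Optional[int]:
    d = parse_cache_control(value)
    if "no-cache" in d:
        return 0  # must check with the server before every reuse
    max_age = d.get("max-age")
    return int(max_age) if max_age and max_age.isdigit() else None


print(parse_cache_control("public, max-age=3600, must-revalidate"))
```

A result of `None` from `freshness_seconds` means the header gives no explicit lifetime, so the cache would fall back to Expires or its own heuristics.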
17. Client-Side Caching
• Great for static or relatively static data
• Static HTML, JS, CSS files, or read-only lists of data that rarely change
• Not so great for dynamic or mission-critical data
• Hard to force clients to get latest version of data when they don’t even talk to
the server
• If you have to update before Expires or Max-Age runs out, you’ve got a
problem
Response Time: 4342ms → 100ms (97.7% less)
18. Review
• Optimize your database for your application
• Cache on the server
• Common database calls
• Serialized results
• Send less data
• HTTP Compression
• Paging
• Conditional Headers / 304 Not Modified
• Cache on the client
• Expires and Cache-Control headers
- Before animations, show “standard” endpoint code
- Note on “bounces around to web server” – that’s about 8 hops from my home network to our Azure instance in East Asia
Mention the cool new in memory OLTP / tables and compiled stored procedures in SQL Server 2014
You can return multiple result sets in one query – it takes some manual tweaking of the EDMX file and/or extra code in Entity Framework, but it is possible
Keep in mind high traffic + async can result in database overload
After last point, show call to cached-db endpoint
Shared cache like memcached or a faster? database call (i.e. nosql, in memory table, etc)
Show applicationhost.config transform, make request with Accept-Encoding: gzip header