Published in: Technology

  1. Tuning Couchbase
     Tim Smith, Engineer
  2. Introduction
     I am:
     ● tim@couchbase.com
     ● http://github.com/couchtim
     ● Support Engineer
     ● Sales Engineer
  3. Introduction
     You are:
     ● Using Membase
     ● Using CouchDB
     ● Using it in production
     ● 100ms response
     ● 2ms response
     ● Smarter than I am
  4. Simple, Fast, Elastic*
     Membase
     ● 5-minute cluster setup
     ● Memcached API
     ● Memcached-fast responses (working set)
     ● Saturate network with minimal CPU
     ● Pleasant admin UI
     ● Rebalance on the fly
  5. Simple, Fast*, Elastic
     CouchDB
     ● Webophilic: JSON, RESTy HTTP, JavaScript
     ● Append-only, crash-only, MVCC (hide the fancy bits)
     ● Bidirectional replication, /_changes
     ● Replication & app-level sharding scale out
     ● Your data. Everywhere.
  6. Simple*, Fast, Elastic
     Couchbase Server 2.0
     ● Auto-sharding, clustered elasticity
     ● Caching, predictable low-latency
     ● Scatter-gather, incremental map/reduce
     ● Rich hands-on-your-data features
  7. API Combo
     Couchbase Server 2.0
     ● Both memcached binary protocol
     ● And CouchDB HTTP protocol
     ● SDK provides consistent interface
     ● Optionally synchronous persistence
  8. Lots to Learn…
     We welcome your discoveries, ingenious solutions and feedback.
     But we’re not starting from scratch! Best practices from Membase and CouchDB still apply.
     ● Hardware and system resources
     ● Client and API usage
     ● Data modeling
  9. Hardware
     Membase:
     ● RAM, RAM, RAM!
     ● Fast disk (throughput) helpful for write-intensive applications, and disk-heavy ops (rebalance)
     ● Network bandwidth may become an issue
     ● Adding more nodes can help with all three
  10. Hardware: RAM
     Proper cluster sizing is #1
     ● Couchbase.org wiki: Sizing Guidelines
     ● Main variables include total number of items, size of working set, replicas and per-item overhead
     ● Under-provisioning reduces elasticity
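The sizing variables on slide 10 can be combined into a back-of-the-envelope RAM estimate. The sketch below is only an illustration: the function name, the 150-byte per-item overhead, the 0.7 headroom factor, and the choice to keep replicas as resident as active data are all assumptions, not figures from the talk or from the Sizing Guidelines. Substitute the numbers the wiki gives for your release.

```python
def estimate_cluster_ram_bytes(
    num_items,
    avg_key_bytes,
    avg_value_bytes,
    working_set_fraction,
    num_replicas,
    per_item_overhead_bytes=150,  # ASSUMPTION: placeholder, check the sizing guide
    headroom=0.7,                 # ASSUMPTION: usable fraction of RAM before ejection
):
    """Rough total RAM needed across the whole cluster, in bytes."""
    copies = 1 + num_replicas  # one active copy plus each replica

    # Keys and per-item metadata stay in RAM for every item, resident or not.
    metadata = num_items * (avg_key_bytes + per_item_overhead_bytes) * copies

    # Only the working set's values need to stay resident.
    # ASSUMPTION: replica values are kept as resident as active ones.
    resident_values = num_items * avg_value_bytes * working_set_fraction * copies

    return int((metadata + resident_values) / headroom)
```

Dividing the result by the RAM quota per node gives a first guess at node count; adding a replica or growing the working set shows up directly in the total.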
  11. Hardware: Disk
     Fast disk not too important… until it is
     ● Rebalance can move a lot of data around
     ● Especially when disk > RAM
     ● Warm-up time after node restart
     ● Under-provisioning reduces elasticity
     ● More nodes spread out the I/O
     ● SSD, RAID, the usual stuff
  12. Hardware
     CouchDB:
     ● CPU usage can be significant with view updates, replication filters, formatting via /_list and /_show
     ● Fast disk helpful for many applications, and disk-heavy ops (compaction)
     ● Separate data and view on different filesystems to improve I/O
     ● RAM can’t hurt
  13. Hardware: Cloud
     Cloud hosting brings variability
     ● Disk bandwidth can occasionally drop
     ● Even identical instances may perform differently
     ● Large instances more reliable
     ● More instances provide redundancy
     ● Best bang-for-the-buck still an open question
  14. Configuration
     Membase client
     ● Use a Membase-aware smart client (spymemcached for Java, Enyim for C#)
     ● Or, run moxi on the client host
     ● Minimizes network hops, preserves bandwidth
     ● Value compression (often automatic)
  15. Configuration
     CouchDB client
     ● Caching, ETag / If-None-Match
     ● Compression, Accept-Encoding: gzip
     ● Keep-Alive
     ● By the way, Couchbase Single Server has some killer performance increases (coming soon to an Apache CouchDB release)
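The three client-side tips on slide 15 fit together in a few lines. This is a minimal sketch using only the Python standard library; the helper names and the shape of the `cache` dict are made up for illustration, and whether the server actually answers with gzip depends on its version and configuration.

```python
import gzip
import http.client  # used in the usage note below


def conditional_headers(cached_etag=None):
    """Ask for gzip, and revalidate instead of re-downloading when we hold an ETag."""
    headers = {"Accept-Encoding": "gzip"}
    if cached_etag:
        headers["If-None-Match"] = cached_etag
    return headers


def decode_body(raw, content_encoding):
    """Undo gzip only if the server says it applied it."""
    if content_encoding == "gzip":
        return gzip.decompress(raw)
    return raw


def fetch(conn, path, cache):
    """GET `path` over an already-open connection; `cache` maps path -> (etag, body)."""
    etag, body = cache.get(path, (None, None))
    conn.request("GET", path, headers=conditional_headers(etag))
    resp = conn.getresponse()
    raw = resp.read()  # always drain the response so the connection can be reused
    if resp.status == 304:
        return body  # unchanged on the server: serve the cached copy
    body = decode_body(raw, resp.getheader("Content-Encoding"))
    new_etag = resp.getheader("ETag")
    if resp.status == 200 and new_etag:
        cache[path] = (new_etag, body)
    return body
```

Keep-alive comes from reusing one connection object for many requests, for example `conn = http.client.HTTPConnection("localhost", 5984)` followed by repeated `fetch(conn, "/db/doc", cache)` calls (host and path are placeholders).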
  16. API Usage
     Memcached API
     ● Binary protocol
     ● Multi-get and multi-set
     ● Incr, decr, append, prepend
     ● TTL expiration, get-and-touch
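To make "binary protocol" and "multi-get" concrete, here is a sketch of how a multi-get can be framed on the wire: one quiet GETKQ request per key, closed by a NOOP, so the server answers only for hits and the NOOP reply marks the end. The 24-byte header layout and opcodes follow the memcached binary protocol; the helper names are invented, and the vbucket field is left at 0 for simplicity, whereas a smart client would fill it in per key. In practice the client library does all of this for you.

```python
import struct

REQUEST_MAGIC = 0x80
OP_NOOP = 0x0A
OP_GETKQ = 0x0D  # "get with key, quiet": no reply is sent on a miss

# magic, opcode, key length, extras length, data type,
# vbucket id, total body length, opaque, CAS
HEADER = struct.Struct(">BBHBBHIIQ")


def request(opcode, key=b"", opaque=0):
    """Frame a request that has no extras and no value, only an optional key."""
    header = HEADER.pack(REQUEST_MAGIC, opcode, len(key), 0, 0, 0, len(key), opaque, 0)
    return header + key


def multi_get(keys):
    """Pipeline one GETKQ per key, then a NOOP whose reply ends the batch."""
    packets = [request(OP_GETKQ, key, opaque=i) for i, key in enumerate(keys)]
    packets.append(request(OP_NOOP, opaque=len(keys)))
    return b"".join(packets)
```

The whole batch goes out in one write, which is where the saving over many individual gets comes from.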
  17. API Usage
     CouchDB API
     ● HEAD vs. GET, ?limit=1
     ● ?startkey, not ?skip
     ● Use built-in reduce functions: _sum, _count, _stats; write views in Erlang
     ● Keep view index size in mind: emit just what you need
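The "?startkey, not ?skip" advice usually turns into a paging pattern like the one sketched here: ask for one row more than the page size and use that extra row's key and id as the start of the next page. `startkey`, `startkey_docid` and `limit` are standard view query parameters; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode


def page_query(page_size, next_key=None, next_docid=None):
    """Build the query string for one page of a view."""
    params = {"limit": page_size + 1}  # one extra row tells us where the next page starts
    if next_key is not None:
        params["startkey"] = json.dumps(next_key)  # view keys are JSON-encoded
        if next_docid is not None:
            params["startkey_docid"] = next_docid  # disambiguates rows sharing a key
    return urlencode(params)


def split_page(rows, page_size):
    """Split fetched rows into (page, bookmark); bookmark is None on the last page."""
    page = rows[:page_size]
    extra = rows[page_size:page_size + 1]
    if extra:
        return page, (extra[0]["key"], extra[0]["id"])
    return page, None
```

Unlike `?skip`, which makes the server walk past every skipped row, each request here starts directly at the bookmark.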
  18. API Usage
     CouchDB API
     ● Use ?group_level to aggregate over structured keys
     ● Emit null, and use ?include_docs to get more data (faster view generation)
     ● Emit more data, so ?include_docs isn’t needed (avoid random I/O on query)
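What `?group_level` does to array keys is easiest to see on a tiny in-memory model. The function below is not CouchDB code, just a simulation of a `_sum` reduce grouped on the first N key elements; the `[year, month, day]` keys are a made-up example.

```python
from collections import defaultdict


def reduce_by_group_level(rows, group_level):
    """Sum values of (key, value) rows, grouping on the first `group_level` key elements."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[tuple(key[:group_level])] += value
    # Python's sort stands in for view collation; fine for same-typed keys like these.
    return [(list(key), total) for key, total in sorted(totals.items())]


# Hypothetical view rows keyed by [year, month, day]
rows = [
    ([2011, 7, 1], 3),
    ([2011, 7, 2], 4),
    ([2011, 8, 1], 5),
]
per_month = reduce_by_group_level(rows, 2)  # one row per [year, month]
per_year = reduce_by_group_level(rows, 1)   # one row per [year]
```

One view with a well-chosen structured key can therefore answer per-day, per-month and per-year questions without extra indexes.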
  19. Modeling: Doc Size
     Bundle related info into one document
     ● Fewer items → less caching overhead
     ● Reduce number of requests clients make
     ● Promotes server-side processing with _show functions
     ● More context available for flexible maps
  20. Modeling: Doc Size
     Break up serial items to separate docs
     ● E.g., comments, events, other “feeds”
     ● Each entry is self-described
     ● Avoids write contention on a container
     ● Avoid read/write of container contents just to make a small addition
     ● May be gathered with map/reduce view
  21. Modeling: Key Size
     Use short key values
     ● At the clustering layer, all keys are kept in RAM, tracked for replicas, etc.
     ● 255 bytes max length, but prefer short keys
     ● At CouchDB layer, id is likewise used in many places, and short ids are more efficient
     ● Semantic keys
  22. Modeling: Indexes
     Consider other index types
     ● Full-text integration
     ● Geo-spatial (can be used for non-spatial data, too)
     ● Hadoop connector with Couchbase Server (via TAP)
  23. Modeling: K/V Tricks
     Non-obvious models in key/value space
     ● Example: level of indirection to “remove” a bunch of keys without knowing their keys:
     ● Define a master key, e.g. obj_rev: 3
     ● Define subordinate attribute keys with the master value in the key name, e.g. obj_foo-3, obj_bar-3
     ● Increment obj_rev, and rely on TTL to reap stale attribute items
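The indirection trick on slide 23 can be sketched against a stand-in store. `FakeCache` below is a toy in-memory class with a manual clock, written only so the example runs on its own; it is not a real client, and the helper names and the one-hour TTL are arbitrary. The key naming (`obj_rev`, `obj_foo-3`) follows the slide.

```python
class FakeCache:
    """Toy stand-in for a memcached-style client, with a fake clock for TTLs."""

    def __init__(self):
        self._items = {}
        self.now = 0  # seconds; advance by hand to simulate time passing

    def set(self, key, value, ttl=None):
        expires = None if ttl is None else self.now + ttl
        self._items[key] = (value, expires)

    def get(self, key):
        value, expires = self._items.get(key, (None, None))
        if expires is not None and expires <= self.now:
            del self._items[key]  # reaped on access once the TTL has passed
            return None
        return value

    def incr(self, key):
        value = self.get(key) + 1  # assumes the counter already exists
        self.set(key, value)
        return value


def attr_key(cache, obj, attr):
    """Attribute keys embed the current master revision, e.g. obj_foo-3."""
    return f"{obj}_{attr}-{cache.get(obj + '_rev')}"


def set_attr(cache, obj, attr, value, ttl=3600):
    cache.set(attr_key(cache, obj, attr), value, ttl)


def get_attr(cache, obj, attr):
    return cache.get(attr_key(cache, obj, attr))


def invalidate(cache, obj):
    """'Remove' every attribute at once: old keys become unreachable, TTL reaps them."""
    cache.incr(obj + "_rev")
```

After `invalidate`, readers compute `obj_foo-4` and miss, while `obj_foo-3` lingers unreferenced until its TTL expires. The cost is one extra read of the master key per lookup.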
  24. Diagnostic Stats
     Monitoring Couchbase Server
     ● Ops/sec
     ● RAM usage vs. high/low water marks
     ● Growth of RAM usage (mem_used)
     ● Growth of metadata usage (ep_overhead)
  25. Diagnostic Stats
     Monitoring Couchbase Server
     ● RAM ejections for active/replica data (*eject*)
     ● Cache miss ratio (get_hits vs. ep_bg_fetched)
     ● Disk write queue size (ep_queue_size + flusher_todo)
     ● Disk space available
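Two of the stats on slide 25 are derived values, so a small helper can compute them from a stats snapshot. The sketch assumes the snapshot arrives as a dict of strings, as a memcached-style `stats` call typically returns. The write-queue sum comes straight from the slide; the exact ratio used for cache misses (background fetches divided by get hits) is an assumption, since the slide only says to compare the two.

```python
def disk_write_queue(stats):
    """Items waiting to be persisted: ep_queue_size + flusher_todo."""
    return int(stats["ep_queue_size"]) + int(stats["flusher_todo"])


def cache_miss_ratio(stats):
    """Fraction of successful gets that had to be fetched from disk.

    ASSUMPTION: ep_bg_fetched / get_hits is one reasonable reading of
    "get_hits vs. ep_bg_fetched"; adjust to match your monitoring convention.
    """
    hits = int(stats["get_hits"])
    bg_fetched = int(stats["ep_bg_fetched"])
    return bg_fetched / hits if hits else 0.0
```

Both values matter more as trends than as single readings: a write queue that keeps growing, or a miss ratio creeping up, suggests the cluster is running short of disk throughput or RAM.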
  26. Diagnostic Stats
     Error condition stats
     ● Disk write errors (*failed*)
     ● Uptime resets
     ● Out of memory conditions (*oom*)
     ● Swap usage
  27. What’d I Miss
     Before questions, I want assertions. That is, you’re smarter than I am
     ● …and you’ve got more experience
     ● What’s the most important tip you know?
     ● What mistakes did I make?
  28. Thank You
     Tim Smith
     tim@couchbase.com
     http://github.com/couchtim
     @couchtim