Realtime web2012


  1. 1. Building Real-Time Web • http://tinyurl.com/realtime2012 • Timothy Fitz (http://TimothyFitz.com), CTO, Canvas
  2. 2. What is “Realtime web”?
  3. 3. What does “Realtime” look like?
  4. 4. What does “Realtime” look like?
  5. 5. What does “Realtime” look like?
  6. 6. Realtime web: “Push, not pull.”
  7. 7. 3 hard problems • Talking to the browser • High concurrency • Scaling up
  8. 8. Talking to the browser • Short Polling • Long Polling • WebSocket • Flash Socket
  9. 9. Short Polling
  10. 10. Long Polling
  11. 11. Flash Socket
  12. 12. WebSocket
  13. 13. High Concurrency • Blocking I/O – One thread (or process) per connection – Tops out at 200 to 1k connections • Non-blocking I/O – One process, one thread – 10k to 100k connections
  14. 14. Django
  15. 15. Django Apache
  16. 16. There is no Apache for realtime
  17. 17. Non-blocking I/O Servers • Python – Twisted – Tornado – gevent • Not Python – Node.js – Erlang something
  18. 18. Twisted • Pro – Can talk every protocol ever – Oldest and most widely used in production • Con – Overkill for web-only tasks – Not simple
  19. 19. Tornado • Pro – Simple – Does HTTP stuff simply • Con – Might not interface with what you need • Confusing – You can run Tornado (HTTP layer) on top of Twisted (networking layer)
  20. 20. gevent • Pro – Coroutines are a better model than callbacks – As such, very easy to write complicated logic • Con – Least well documented – Least consensus on best practices – New, uncertain about production readiness
  21. 21. Node.js • Pro – Best documentation by far – Socket.IO abstracts away browser communication • Con – Can’t share logic with your Django app – New, but has fairly large install base
  22. 22. Erlang • Pro – Hands down best for complex realtime tasks – Forces you to think about concurrency/scale – Abstracts away the network – Old and reliable • Con – Forces you to think about concurrency/scale – Can’t share logic with your Django app – High spin-up cost (functional, concurrency driven)
  23. 23. Scaling up! • Just one • Frontend nodes x Backend nodes • More architecture decisions!
  24. 24. Just one • Everything in memory • Django nodes talk directly to box • Spare for availability • Failover = realtime data loss – Make realtime 100% redundant
  25. 25. Probably good enough! – WARNING: NAPKIN MATH – 10k daily visits * 10.0 min avg visit / 1,440 min per day ≈ 70 average concurrent users – One box can easily be built out to handle 3-5k concurrent users ≈ roughly 450k-700k daily visits
  26. 26. Frontend nodes x Backend nodes • Frontend handles users / connections • Backend handles channels
  27. 27. More architecture decisions! • In memory backend – Redis Pub/Sub – ZeroMQ – Roll your own • Persisted to Disk: – ActiveMQ – RabbitMQ – Amazon SQS
  28. 28. Redis Pub/Sub • Simplest to set up • Simplest model • SUBSCRIBE channel_name • PUBLISH channel_name “Hello World!”
  29. 29. ZeroMQ • Publish/Subscribe semantics • Request/Response • Push/Pull (round robin) • Extremely fast
  30. 30. Roll your own • Same language as your frontend – (Twisted/Node/Whatever) • Only do this if you have per-channel business logic – You probably don’t. • Erlang maps really really well to this domain.
  31. 31. Full Stack Services • REST APIs to push to the browser • http://pusher.com • http://beaconpush.com
  32. 32. Canvas • Amazon ELB • Nginx + Twisted • Redis
  33. 33. Final Recommendations • Need Python? Twisted • Don’t? Node.js/Socket.IO • Need scale/reliability? Redis backend. • Complex? Going big? Erlang all the way.
  34. 34. Questions?
  35. 35. Further Reading • IMVU IMQ talk: http://www.slideshare.net/JonWatte/message-queuing-on-a-large-scale-imvus-stateful-realtime-message-queue • Twilio talk on gevent + ZeroMQ (given by Jeff Lindsay, highly recommended): http://www.twilio.com/conference/video/distributed-systems-with-gevent-and-zeromq • Last.fm scaling Erlang/Mochiweb to 1 million concurrent connections on one machine: http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1 • The original Comet blog post: http://infrequently.org/2006/03/comet-low-latency-data-for-the-browser/ • Django + Socket.IO + gevent: http://codysoyland.com/2011/feb/6/evented-django-part-one-socketio-and-gevent/
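
The napkin math on slide 25 can be reproduced in a few lines. This is only a sketch: the function names are made up for illustration, and it assumes visits are spread evenly over the day, which real traffic never is, so peak load will be higher than the average it reports.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440


def average_concurrent_users(daily_visits, avg_visit_minutes):
    """Average number of users connected at any instant.

    Assumes visits are spread evenly across the day (no peaks).
    """
    return daily_visits * avg_visit_minutes / MINUTES_PER_DAY


def daily_visits_supported(concurrent_capacity, avg_visit_minutes):
    """Invert the estimate: daily visits one box could absorb on average."""
    return concurrent_capacity * MINUTES_PER_DAY / avg_visit_minutes


small_site = average_concurrent_users(10_000, 10.0)
low_end = daily_visits_supported(3_000, 10.0)
high_end = daily_visits_supported(5_000, 10.0)

print(f"10k visits/day at 10 min each: about {small_site:.0f} concurrent users")
print(f"3k-5k concurrent users: about {low_end:,.0f} to {high_end:,.0f} visits/day")
```

The exact results (about 69 concurrent users, and 432k to 720k daily visits) line up with the rounded figures on the slide.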

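The SUBSCRIBE/PUBLISH model from slides 26 and 28 is small enough to sketch in memory. `ChannelHub` below is a hypothetical stand-in written for illustration, not Redis and not code from the talk; it only mimics the semantics that matter here: a message goes to whoever is subscribed at that moment, nothing is stored, and publishing reports how many subscribers received it.

```python
from collections import defaultdict


class ChannelHub:
    """Hypothetical in-memory pub/sub backend (illustration only)."""

    def __init__(self):
        # channel name -> list of callbacks, one per subscribed connection
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def unsubscribe(self, channel, callback):
        if callback in self._subscribers.get(channel, []):
            self._subscribers[channel].remove(callback)

    def publish(self, channel, message):
        """Deliver to current subscribers only; return how many got it."""
        receivers = list(self._subscribers.get(channel, []))
        for callback in receivers:
            callback(message)
        return len(receivers)


hub = ChannelHub()
inbox = []  # stands in for one frontend connection's outgoing buffer
hub.subscribe("channel_name", inbox.append)

delivered = hub.publish("channel_name", "Hello World!")
missed = hub.publish("other_channel", "nobody is listening")

print(inbox, delivered, missed)
```

In the frontend x backend layout, each frontend node would hold the user connections and register one callback per connection, while the hub plays the backend role of owning the channels.
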
Editor's notes

  • Also known as Comet (in response to AJAX). And before that, under the umbrella of “DHTML” (throwback to the late 90s!)
  • Latency often doesn’t matter at all (3-5s wouldn’t be noticed, for popular hashtags 1 minute wouldn’t make a difference)
  • Chat (which is pubsub on steroids). Presence (the fact that you’re connected is important). Latency matters some, but you wouldn’t notice 1s of lag.
  • Gaming, networked simulated physics / simulated spaces. Latency is critical in both directions (~200ms matters)
  • Also a dozen other methods, and aggregate methods that have built-in fall back semantics.
  • Supported absolutely everywhere. Incredibly efficient. Incredibly easy to implement, hard to get wrong. Right for infrequent realtime, or when tied to an existing expensive operation (most common example: short polling PayPal or another payment gateway for success confirmation).
  • Works everywhere (desktop and mobile). Supports most use cases (Twitter, etc).
  • Requires Flash support (user has it, no Flashblock, desktop only for the most part). Bidirectional and binary. Bidirectional really only matters for realtime interactive apps (games, virtual spaces; motion is one of the few places where 200ms latency matters). Flash is dying, but if your app already requires it (or if your UI is already in Flash, hello vidya game) then this might be the best solution.
  • Works on Chrome, FF, Safari, iOS mobile, IE10 previews. Coming to Android mobile soon. Bidirectional, but UTF-8 (probably doesn’t matter). Very new (the RFC hit “Proposed Standard” in Dec 2011, which means the spec is solidified. “Internet Standard” is the next step, reserved for specs with two independent interoperable implementations, and is very close). Great, but you’ll probably have to support fallback for a while.
  • Super simplifying; lots of options exist, including hybrids. Often you run one non-blocking process per core (if you have to scale to multiple machines, reusing the multi-process strategy is trivial).
  • Okay this is kind of a lie, there are hacky ways but you lose most of what makes Django, Django: sessions, users, auth, ORM, and most 3rd party libraries
  • There is no consensus. There are some good Python options. There are a LOT of options I’m not even mentioning; almost every language has two or three non-blocking I/O webservers. Python might be important, especially if you have logic you want to reuse between your Django application and your non-blocking I/O app.
  • Can have two for redundancy
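
The notes on long polling and on running one non-blocking process per core can be illustrated with a single-threaded event loop that parks many waiting clients at once. The sketch below uses Python's standard `asyncio`, which postdates this talk, as a stand-in for the Twisted, Tornado, or gevent loops named in the slides. `LongPollChannel` and `demo` are hypothetical names, and no real HTTP is involved: each "client" is just a coroutine waiting for the next message.

```python
import asyncio


class LongPollChannel:
    """Hypothetical helper: parks waiting requests until a message arrives."""

    def __init__(self):
        self._waiters = set()

    async def wait_for_message(self, timeout):
        """Suspend without blocking the thread until publish() or timeout."""
        future = asyncio.get_running_loop().create_future()
        self._waiters.add(future)
        try:
            return await asyncio.wait_for(future, timeout)
        except asyncio.TimeoutError:
            return None  # a real client would reconnect and poll again
        finally:
            self._waiters.discard(future)

    def publish(self, message):
        """Wake every parked request; return how many were waiting."""
        waiters, self._waiters = self._waiters, set()
        for future in waiters:
            if not future.done():
                future.set_result(message)
        return len(waiters)


async def demo(n_clients=1000):
    channel = LongPollChannel()
    clients = [
        asyncio.ensure_future(channel.wait_for_message(timeout=5.0))
        for _ in range(n_clients)
    ]
    await asyncio.sleep(0)  # yield once so every client parks itself
    delivered = channel.publish("Hello World!")
    results = await asyncio.gather(*clients)
    return delivered, results


delivered, results = asyncio.run(demo())
print(delivered, results[0])
```

All 1,000 simulated clients wait inside one process and one thread, which is the property that lets a non-blocking server hold far more open connections than a thread-per-connection design.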