SlideShare une entreprise Scribd logo
1  sur  31
Twitter Streaming
API Architecture
John Kalucki
@jkalucki
Infrastructure
Heading
 •   Immediate User Experience

 •   More event types

 •   Full fidelity

 •   Easier integrations
REST?
•   Downsides

    •   Latency

    •   Complexity

    •   Expense

•   Prevents

    •   At-Scale Integrations

    •   Features

    •   Fidelity
Needy
•   Authenticate

•   Rate Limit

•   Query vast caches

•   Query deep data stores

•   Render. Render. Render.

•   All just to say: “No new tweets. Try later.”
Policy
•   Prove Relationship via Auth Token

•   Terms of Use

•   No resyndication

•   Protect users, content, ecosystem
Streaming Solution
•   Gather events

•   Route to all servers

•   Present to decision logic:
    Exactly once per server

•   Move ‘em out:
    Send just once* to clients
Win, Huge
•   Low latency

•   Little duplicated computation

•   No wasted bandwidth

•   New event types without new endpoints
Win, Huge
•   Low latency

•   Little duplicated computation

•   No wasted bandwidth

•   New event types without new endpoints
Properties
•   At Least Once

•   Roughly Sorted (K-Sorted) by Time

•   Middleware - No rendering

•   More at 2:30pm talk - Thinking In Streams
Properties
•   At Least Once

•   Roughly Sorted (K-Sorted) by Time

•   Middleware - No rendering

•   More at 2:30pm talk - Thinking In Streams
Gather
All Events
Push
Events
Latency
<created_at> - client arrival
        Policy
200ms



100ms



0 ms

                 1 hour period
<created_at> - client arrival
        Policy
200ms



100ms



0 ms

                 1 hour period
User Streams
 chirpstream/2b/user.json
Future
•   User Streams refinement and launch

•   More data types

•   Better query support

•   Large scale integration support
Twitter Streaming API Architecture

Contenu connexe

En vedette

REST и HATEOAS
REST и HATEOASREST и HATEOAS
REST и HATEOASArtem Bey
 
Securing RESTful services with Spring HATEOAS & Hdiv
Securing RESTful services with Spring HATEOAS & HdivSecuring RESTful services with Spring HATEOAS & Hdiv
Securing RESTful services with Spring HATEOAS & HdivHdiv Security
 
Modernizing the Legacy - How Dish is Adapting its SOA Services for a Cloud Fi...
Modernizing the Legacy - How Dish is Adapting its SOA Services for a Cloud Fi...Modernizing the Legacy - How Dish is Adapting its SOA Services for a Cloud Fi...
Modernizing the Legacy - How Dish is Adapting its SOA Services for a Cloud Fi...VMware Tanzu
 
Hypermedia api (HATEOAS)
Hypermedia api (HATEOAS)Hypermedia api (HATEOAS)
Hypermedia api (HATEOAS)MitinPavel
 
HCLT Whitepaper: Legacy Modernization
HCLT Whitepaper: Legacy Modernization HCLT Whitepaper: Legacy Modernization
HCLT Whitepaper: Legacy Modernization HCL Technologies
 
BPM for SOA+ESB+API and cloud
BPM for SOA+ESB+API and cloud BPM for SOA+ESB+API and cloud
BPM for SOA+ESB+API and cloud Alexander SAMARIN
 
How to become a Product Samurai - Chris Lukassen
How to become a Product Samurai - Chris LukassenHow to become a Product Samurai - Chris Lukassen
How to become a Product Samurai - Chris LukassenAvisi B.V.
 
Legacy to industry leader: a modernization case study
Legacy to industry leader: a modernization case studyLegacy to industry leader: a modernization case study
Legacy to industry leader: a modernization case studyOSSCube
 
Twilio Signal 2016 API Architecture
Twilio Signal 2016 API ArchitectureTwilio Signal 2016 API Architecture
Twilio Signal 2016 API ArchitectureTwilio Inc
 
Transform Your Business with API-led Connectivity
Transform Your Business with API-led ConnectivityTransform Your Business with API-led Connectivity
Transform Your Business with API-led ConnectivityMuleSoft
 
LeaseWeb API Architecture @ APINL Meetup
LeaseWeb API Architecture @ APINL MeetupLeaseWeb API Architecture @ APINL Meetup
LeaseWeb API Architecture @ APINL MeetupRolph Haspers
 
Why should C-Level care about APIs? It's the new economy, stupid.
Why should C-Level care about APIs? It's the new economy, stupid.Why should C-Level care about APIs? It's the new economy, stupid.
Why should C-Level care about APIs? It's the new economy, stupid.Fabernovel
 
SOA Pattern : Legacy Wrappers
SOA Pattern : Legacy Wrappers SOA Pattern : Legacy Wrappers
SOA Pattern : Legacy Wrappers WSO2
 
Bold Predictions for the 2016 API Economy
Bold Predictions for the 2016 API EconomyBold Predictions for the 2016 API Economy
Bold Predictions for the 2016 API EconomyNeha Sampat
 
Updating Legacy Systems: Making the Financial Case for a Modernization Project
Updating Legacy Systems: Making the Financial Case for a Modernization Project Updating Legacy Systems: Making the Financial Case for a Modernization Project
Updating Legacy Systems: Making the Financial Case for a Modernization Project ILM Professional Services
 
IO State In Distributed API Architecture
IO State In Distributed API ArchitectureIO State In Distributed API Architecture
IO State In Distributed API ArchitectureOwen Rubel
 
Api Abstraction & Api Chaining
Api Abstraction & Api ChainingApi Abstraction & Api Chaining
Api Abstraction & Api ChainingOwen Rubel
 
Legacy modernization, cloud orchestration, api publishing
Legacy modernization, cloud orchestration, api publishingLegacy modernization, cloud orchestration, api publishing
Legacy modernization, cloud orchestration, api publishingkumar gaurav
 

En vedette (19)

REST и HATEOAS
REST и HATEOASREST и HATEOAS
REST и HATEOAS
 
Securing RESTful services with Spring HATEOAS & Hdiv
Securing RESTful services with Spring HATEOAS & HdivSecuring RESTful services with Spring HATEOAS & Hdiv
Securing RESTful services with Spring HATEOAS & Hdiv
 
Modernizing the Legacy - How Dish is Adapting its SOA Services for a Cloud Fi...
Modernizing the Legacy - How Dish is Adapting its SOA Services for a Cloud Fi...Modernizing the Legacy - How Dish is Adapting its SOA Services for a Cloud Fi...
Modernizing the Legacy - How Dish is Adapting its SOA Services for a Cloud Fi...
 
Hypermedia api (HATEOAS)
Hypermedia api (HATEOAS)Hypermedia api (HATEOAS)
Hypermedia api (HATEOAS)
 
HCLT Whitepaper: Legacy Modernization
HCLT Whitepaper: Legacy Modernization HCLT Whitepaper: Legacy Modernization
HCLT Whitepaper: Legacy Modernization
 
BPM for SOA+ESB+API and cloud
BPM for SOA+ESB+API and cloud BPM for SOA+ESB+API and cloud
BPM for SOA+ESB+API and cloud
 
How to become a Product Samurai - Chris Lukassen
How to become a Product Samurai - Chris LukassenHow to become a Product Samurai - Chris Lukassen
How to become a Product Samurai - Chris Lukassen
 
Apiworld
ApiworldApiworld
Apiworld
 
Legacy to industry leader: a modernization case study
Legacy to industry leader: a modernization case studyLegacy to industry leader: a modernization case study
Legacy to industry leader: a modernization case study
 
Twilio Signal 2016 API Architecture
Twilio Signal 2016 API ArchitectureTwilio Signal 2016 API Architecture
Twilio Signal 2016 API Architecture
 
Transform Your Business with API-led Connectivity
Transform Your Business with API-led ConnectivityTransform Your Business with API-led Connectivity
Transform Your Business with API-led Connectivity
 
LeaseWeb API Architecture @ APINL Meetup
LeaseWeb API Architecture @ APINL MeetupLeaseWeb API Architecture @ APINL Meetup
LeaseWeb API Architecture @ APINL Meetup
 
Why should C-Level care about APIs? It's the new economy, stupid.
Why should C-Level care about APIs? It's the new economy, stupid.Why should C-Level care about APIs? It's the new economy, stupid.
Why should C-Level care about APIs? It's the new economy, stupid.
 
SOA Pattern : Legacy Wrappers
SOA Pattern : Legacy Wrappers SOA Pattern : Legacy Wrappers
SOA Pattern : Legacy Wrappers
 
Bold Predictions for the 2016 API Economy
Bold Predictions for the 2016 API EconomyBold Predictions for the 2016 API Economy
Bold Predictions for the 2016 API Economy
 
Updating Legacy Systems: Making the Financial Case for a Modernization Project
Updating Legacy Systems: Making the Financial Case for a Modernization Project Updating Legacy Systems: Making the Financial Case for a Modernization Project
Updating Legacy Systems: Making the Financial Case for a Modernization Project
 
IO State In Distributed API Architecture
IO State In Distributed API ArchitectureIO State In Distributed API Architecture
IO State In Distributed API Architecture
 
Api Abstraction & Api Chaining
Api Abstraction & Api ChainingApi Abstraction & Api Chaining
Api Abstraction & Api Chaining
 
Legacy modernization, cloud orchestration, api publishing
Legacy modernization, cloud orchestration, api publishingLegacy modernization, cloud orchestration, api publishing
Legacy modernization, cloud orchestration, api publishing
 

Dernier

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 

Dernier (20)

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 

Twitter Streaming API Architecture

  • 1.
  • 2. Twitter Streaming API Architecture John Kalucki @jkalucki Infrastructure
  • 3.
  • 4. Heading • Immediate User Experience • More event types • Full fidelity • Easier integrations
  • 5.
  • 6. REST? • Downsides • Latency • Complexity • Expense • Prevents • At-Scale Integrations • Features • Fidelity
  • 7. Needy • Authenticate • Rate Limit • Query vast caches • Query deep data stores • Render. Render. Render. • All just to say: “No new tweets. Try later.”
  • 8.
  • 9. Policy • Prove Relationship via Auth Token • Terms of Use • No resyndication • Protect users, content, ecosystem
  • 10.
  • 11. Streaming Solution • Gather events • Route to all servers • Present to decision logic: Exactly once per server • Move ‘em out: Send just once* to clients
  • 12. Win, Huge • Low latency • Little duplicated computation • No wasted bandwidth • New event types without new endpoints
  • 13. Win, Huge • Low latency • Little duplicated computation • No wasted bandwidth • New event types without new endpoints
  • 14. Properties • At Least Once • Roughly Sorted (K-Sorted) by Time • Middleware - No rendering • More at 2:30pm talk - Thinking In Streams
  • 15. Properties • At Least Once • Roughly Sorted (K-Sorted) by Time • Middleware - No rendering • More at 2:30pm talk - Thinking In Streams
  • 17.
  • 18.
  • 19.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 27. <created_at> - client arrival Policy 200ms 100ms 0 ms 1 hour period
  • 28. <created_at> - client arrival Policy 200ms 100ms 0 ms 1 hour period
  • 30. Future • User Streams refinement and launch • More data types • Better query support • Large scale integration support

Notes de l'éditeur

  1. So, where are we going with the Streaming API? What are our constraints? We have four big goals for Streaming. First, We want users to have a low latency experience. Instant feels like the right speed for Twitter. Not 18 seconds later. More or less right now. Second, every write into Twitter is an event that someone, somewhere might be interest in. We want to expose more event types than just new Tweets.
  2. Third, we also want to provide full fidelity data for those that need it. Sometimes you just need everything. And you also a place to put it. And finally, we need to make Large Scale integrations with other services as easy as possible. You shouldn&amp;#x2019;t have to wrestle with parallel fetching, rate limiting, and all that. It should be easier for all developers to get data out of Twitter.
  3. The REST API is not the solution for a low latency experience, for large scale integrations, or for exposing more and more event types. The REST model may great for many things, but for real-time Twitter where you just want to know what&amp;#x2019;s changed we&amp;#x2019;ve already pushed Request Response too far. It&amp;#x2019;s painful.
  4. You can&amp;#x2019;t quickly poll for deltas on the social graph, friends timeline, user timeline, mentions, favorites, searches, lists, and trends for just one user, never mind all your followings. Or a million users. Or ten million. Impossible. As Twitter adds more features, this just gets worse. It&amp;#x2019;s just not practical to lift rate limits high enough to meet everyone&amp;#x2019;s goals. The real-time REST model is near a the point of collapse.
  5. Why is REST so expensive? A lot of effort goes into responding to each API request. There&amp;#x2019;s a lot to do, a lot of data to gather, and none of it is on that front end box handling the request. To make matters more difficult, the the cost and latency distributions are very wide -- from a cheap cache hit to a deep database crawl. Keeping latency low is a struggle.
  6. Any solution, powerful enough to solve all of these problems, is going to be a bit dangerous. It needs some controls especially if rate limits are removed. And, will still need to preserve all of our policies around abuse, privacy, terms of use, and so forth.
  7. We&amp;#x2019;ve really tried to think through all of the policy implications here. Everyone has to play by the same rules and it must be possible for everyone to have a chance at building a sustainable business. We&amp;#x2019;ve come to some win win decisions about the firehose and other elevated access levels. I think we can make nearly everyone very happy. Go to the Corp Dev Office hours at 2:30pm for more detail about our Commercial Data Licenses. Keep this in mind- solving these policy issues are requirements, just as much as the technology issues are.
  8. Our Solution for all this is the Streaming API. We&amp;#x2019;ve already proven that we can offer low latency streams of all Twitter events. We&amp;#x2019;ve been streaming these events to ourselves for quite some time. Twitter Analytics, for example, takes various private streams to feed experimental and production features. Pleasant. So, how does Streaming work?
  9. In a nutshell, we gather interesting events everywhere in the Twitter system and apply those events to each Streaming server. Inside the server, we examine the event just once, and route to all interested clients.
  10. This approach is a huge huge win over Request Response. It has turned out to be practical, stable and very efficient. Little effort is wasted. Yes, we look at each event on each of our streaming servers, but that&amp;#x2019;s really nothing compared to processing billions of requests only to say: sorry no new tweets yet. Since each event is delivered only once, there&amp;#x2019;s no bandwidth wasted. Latency is very low too. More on that later.
  11. There&amp;#x2019;s a flexibility bonus here too. We can add new event types to streams without having everyone recode to hit new endpoints. Just like adding new fields to JSON markup is future proof, we can also easily add new events to existing streams. When you are ready to use the new events, you can, otherwise, ignore them.
  12. What does a stream look like? Well, its a continuous stream of discrete JSON or XML messages. We deliver events at least once and in roughly sorted order. In general, during steady state, you&amp;#x2019;ll see each event exactly once with a practical K sorting. I&amp;#x2019;ll talk more about how these properties affect you at my other talk.
  13. These properties mean that you need to do at least a little post processing on your end. The data isn&amp;#x2019;t always display ready or even display worthy -- you need to post-process the Streaming API. Also, the streaming api servers don&amp;#x2019;t do much markup rendering -- that happens upstream in Ruby Daemons -- so whatever rendering quirks you are used to on the REST API, well, they&amp;#x2019;ll be here too. At least it&amp;#x2019;s always the same quirks.
  14. So how does the Streaming API fit within the Twitter system? It&amp;#x2019;s all a downstream model. Users do things, stuff happens, and we route a copy to Streaming. Let&amp;#x2019;s look at how we handle a common event: the creation of a new tweet.
  15. In the user visible loop, the FEs validate the input and update critical stores. They ack the user, then drop a message into a Kestrel message queue for offline processing. This way we can give user feedback, yet defer the heavy lifting to our event driven architecture. The tweets are fanned out to internal services: search, streaming, facebook, mobile, lists, and timelines. As an example, timeline processing daemons read the event, serially look up all the followers in Flock and re-enqueue large batches of work. Even before this flock lookup completes, another timeline daemon pool reads these batches then updates the memcache timeline vector of all the followers in a massively parallel fashion. The other server do their own thing, and the tweet is eventually published everywhere.
  16. Here we can see how events are fanned out from the Kestrel cluster into a single Hosebird cluster. Hosebird is the name of the Streaming server implementation. I really don&amp;#x2019;t like it. But the name stuck. Anyway, we use kestrel fanout queues to present each event to each fanout Hosebird process. Fanout queues duplicate each message for each known reader. Kestrel queues are bomb-proof and relatively inexpensive, but they aren&amp;#x2019;t free.
  17. So, within a streaming cluster, we get cheap, by cascading. Cascading is where a hosebird process reads from a peer via streaming HTTP, just like any other streaming client. No coordination is needed and we&amp;#x2019;re eating our own dogfood. There&amp;#x2019;s hardly any latency added by cascading, but the cost savings are considerable when there&amp;#x2019;s a large cluster of hosebird machines. Also, we get rack locality of bandwidth, as the hosebirds are generally together in a rack, while the kestrels are located on another isle.
  18. OK. We&amp;#x2019;ve routed the events to all of the hosebird servers. How do the servers work internally? Hosebird runs on the JVM. It&amp;#x2019;s written in Scala. And uses an embedded Jetty webserver to handle the front end issues. We feed each process 8 cores and about 12 gigs of memory. And they each can send a lot of data to many many of clients.
  19. Events flow through Scala actors that host the application logic. Filtered events are sent through a Java queue then read by the connection thread which handles the socket writing details. We use the Grabby Hands kestrel client to provide highly parallel and low latency blocking transactional reads from Kestrel. We use our own Streaming client in the cascading case. Both fetching clients are very efficient and hardly use any CPU.
  20. I used the Scala Actor model wherever practical, to prevent a lot of worrying about concurrency issues. It&amp;#x2019;s not a panacea but it has made much of this work trivial. Actors currently fall down if you have too many of them, so we use the Java concurrency model to host the connections. Otherwise its all Actors.
  21. You may notice the apostasy of burning a thread per connection. The year 1997 is calling to mock me, I&amp;#x2019;m sure. But so far it hasn&amp;#x2019;t mattered. The memory utilization isn&amp;#x2019;t a limiting factor, and it keeps things very simple.
  22. Feeds are logical groupings of events -- public statuses, direct messages, social graph changes, etc. Feeds keep a circular buffer of recent events to support the count parameter and some historical look back. I had to parallelize the JSON and XML parsing, which turned out to be the big CPU burn and probably our major tweets per second scaling risk.
  23. Feeds can be reconfigured to internally forward events to other feeds. Arbitrary composition in conf files a pretty powerful concept. So, to create user streams, I just had to forward events from all these other existing feeds into the User feed and write some custom delivery logic. Yes, there are streams of direct messages. And social graph changes. And other interesting things. We can&amp;#x2019;t expose them just yet due to privacy policy issues. But, we&amp;#x2019;ll get there. Plans have been laid.
  24. It doesn&amp;#x2019;t take long for a tweet to be created, pass through all of these components, and be presented to your stream. If all is running well with all of the upstream systems -- tweets and other events are usually delivered with an average latency of about 160ms.
  25. Here&amp;#x2019;s a ganglia monitoring graph, one of hundreds just for the Streaming API. Sometimes I find it funny that outside devs say &amp;#x201C;hey, did you know that you are throwing 503s on this endpoint&amp;#x201D;. Yes, we know. There&amp;#x2019;s a graph for it. If there isn&amp;#x2019;t a graph -- we immediately add one. And we roll the key ones up into a grid of 12 summary graphs that everyone watches. There&amp;#x2019;s also a bank of graph monitors in ops.
  26. Each line above represents the average latency from each of several hosebird clusters. This was taken during peak load on a typical weekday. You can see a blip about half way through. Given that all clusters moved in unison, there was probably an upstream garbage collection in kestrel, or something similar. We&amp;#x2019;ve put a lot of effort into lowering Twitter latency and keeping it low and predictable. (If visible, blue line is a cascaded cluster, where yellow and green are fanout only.)
  27. User streams offer a much more engaging way to interact with Twitter. You get to see a lot that happens to you -- who favorited your tweet, who followed you, and so forth -- in real time. You also get to see what your followings are doing. Who they favorited and followed. There&amp;#x2019;s a huge opportunity for discovery here with User Streams. If two friends favorite a tweet, and two others follow the tweeter, show me the tweet! We know that User Streams are transformative. Goldman and I were watching #chirp during Ryan&amp;#x2019;s talk. It was incredible to watch them scroll by. In the few days we&amp;#x2019;ve been using them at the office, everyone has been transfixed. Engineering productivity has plummeted! It&amp;#x2019;s the Farmville of Twitter.
  28. OK, what next for Streaming? First we&amp;#x2019;re going to get user streams out there. We have some more critical features to add and we have to add capacity to handle potentially millions of connections. We&amp;#x2019;ve announced the details for a user stream preview period. Read them carefully before coding or planning anything. There are also some interesting events that we don&amp;#x2019;t yet publish. We&amp;#x2019;ll see what we can get out there for you. Once user streams are in a good spot, we want to get back to some interesting large scale integration features.
  29. With user streams its all coming together. Real time Twitter. Lots of event types. More engagement. More discovery. New user experiences are now possible. Go out and build something great!