3. What is MoveOn?
● Grassroots campaigning
● Fighting for social justice, progressive policies,
progressive candidates
● A community of millions of progressives in all 50 states
4. What is MoveOn?
● Small, scrappy, fully-distributed team
● Nationally impactful programs powered by tech tools and
data
● A complex ecosystem of 30+ websites and tools that have
really interesting scaling problems to solve
5. Who am I?
● MoveOn’s CTO since 2015
● Software engineer and technical leader for 15+ years
● Alum of Carnegie Mellon, Amazon, Rosetta Stone, a
handful of startups, and consulting companies
● Excited about building tech that powers collective action
6. Agenda
● Virality, social media, and the new attention economy
● Story: a protest goes viral
● The tech behind mass mobilization infrastructure
● How to scale a complex system architecture in the new
attention economy, on a nonprofit budget
7. A walk down memory lane
Show of hands:
Who remembers Slashdot?
Who remembers the internet before
big social media?
9. The “Slashdot” effect
A massive surge of web traffic that occurs when a popular website
links to a smaller website.
10. The attention economy
● As the volume of information and news grows, attention
becomes a scarce resource
● All websites and in particular the big social media platforms
compete for this aggregate attention
● When content goes viral:
○ Content creators win the battle for attention
○ Social media platforms attempt to annex engagement
around viral content
11. When Content Goes Viral
● Viral impact is measured as
a sudden steep increase in
views or user interaction,
usually followed by an
exponential decay
● “Going viral” = a singularity
of collective attention
12. Virality on social graphs
Content shared at any rate
> 1.0 per view will saturate
a social graph in O(log(n))
time, where n = number of
users.
Social media platforms
amplify sharing behavior, and
define the social graph
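A minimal sketch of why saturation takes only O(log(n)) time, assuming an idealized branching model in which every view triggers a fixed number of new views in the next round and audiences never overlap. The function name and the model itself are illustrative assumptions, not a measurement of any real platform.

```python
def generations_to_saturate(n_users: int, share_rate: float, seeds: int = 1) -> int:
    """Count sharing rounds until cumulative reach covers n_users.

    Assumes each view produces `share_rate` new views in the next round
    and that no user is reached twice (an idealization).
    """
    if share_rate <= 1.0:
        raise ValueError("share_rate must exceed 1.0 for reach to keep growing")
    reached = float(seeds)  # cumulative views so far
    current = float(seeds)  # views in the most recent round
    rounds = 0
    while reached < n_users:
        current *= share_rate
        reached += current
        rounds += 1
    return rounds


# A graph 1000x larger needs only about 10 more rounds at a share rate of 2.
print(generations_to_saturate(1_000_000, 2.0))      # 19
print(generations_to_saturate(1_000_000_000, 2.0))  # 29
```

Real sharing overlaps heavily and platforms throttle or boost reach, so treat this as intuition for the logarithmic growth only.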
13. The attention economy evolves
● Previous generation:
○ Social news sites like Slashdot aggregated attention
○ Virality happened via direct user actions, like upvoting
● Today:
○ Dominance of social media platforms
○ Platforms make the rules around who sees what, when and
why
14. The attention economy evolves
The news cycle is a dumpster fire, and social media feedback
loops are very effective at quickly amplifying the most inflammatory
content to virality.
16. Nov 8, 2018 was an exciting day
● Nov 6: US election day. Everyone working on elections is
proud and exhausted. Highest turnout for a midterm since
1914!
● Nov 7 2:40pm: Trump crosses a Mueller investigation
“red line”: fires Jeff Sessions and replaces him with a loyalist
● Nov 7 5:10pm: ‘Trump Is Not Above the Law’ protest
coordination network launches
17. Trump Is Not Above the Law
● Nov 7 5:10pm: Protest hub website shows 700 protest
events listed nationwide, 400K people RSVPed
● Nov 7, 7pm: Protest call-to-action goes viral on Twitter,
we observe moderate surges of traffic
● Nov 7, 9pm: Rachel Maddow mentions protest website on
evening show, traffic surges to 3.5MM views, site falls
over (but quickly comes back up)
19. Trump Is Not Above The Law
● 11/8/2018 12pm ET: Protest hub website has
accumulated ~1000 events nationwide, ~500K people
RSVP. 300 new events and 100K more RSVPs in 24
hours!
● 11/8/2018 5pm local time: Nationwide protests!
22. Observed traffic patterns
● Before Nov 7: traffic to protest hub website was mostly earned:
we emailed, called, SMSed MoveOn members and posted
social media updates
● Day of Nov 7: moderate surges of traffic when the protest
call-to-action went viral on Twitter. Interestingly, the same
content, photos, and messaging were under-engaged on Facebook.
● Nov 7-8: organic virality did not lead to traffic surges that broke
our infrastructure, until the Maddow mention
23. Key Technical Takeaways
● Today, the observed behavior of virality is tightly
controlled by the social media platforms
● “Going viral” only means traffic surges if the platforms
decide it does.
● With a major exception: celebrities can still generate
organic viral behavior
● Hence: Rachel Maddow is the new Slashdot Effect
26. The tech behind protest networks
● Hub website: a database of protest events, protest prep
material content hub, event map and search tools
● Crowdsourced event creation: anyone can host a
protest
● Mobilization tools drive event creation and RSVPs: we
email, text, and buy targeted social media ads to find
people interested in nearby protest events
27. Scaling through viral moments
● A surge in traffic and attention is a gift: it means our work
and mission matter
● Our systems need to be architected to scale through viral
moments
● ... on a nonprofit budget!
28. Problems to Solve
● Can’t predict or control when content will go viral
● Can’t afford to maintain big company levels of tech
infrastructure all the time
● Our infrastructure = a complex 30+ entity ecosystem of
in-house and vendor tools. Scale testing a complex
architecture is very time-consuming.
29. Monitoring and Measurement
● Monitoring is key: monitor everything, through the
architectural stack, including vendor tools
● SLAs are key:
○ Aggressive SLAs for in-house tools
○ Observe vendor uptime and availability
○ Plan around cascading failures
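A rough sketch of why vendor uptime belongs in the SLA math: when a request needs every component in a chain to be up, availabilities multiply. The helper names are hypothetical, the 99.9% figures are illustrative rather than any real SLA, and the model assumes independent failures, which cascading outages violate.

```python
from math import prod

MINUTES_PER_30_DAYS = 30 * 24 * 60  # 43,200 minutes


def composite_availability(availabilities):
    """Availability of a path that needs every listed component up.

    Assumes failures are independent; correlated outages make this optimistic.
    """
    return prod(availabilities)


def downtime_minutes(availability, period_minutes=MINUTES_PER_30_DAYS):
    """Expected downtime over the period for a given availability."""
    return (1.0 - availability) * period_minutes


# One component at 99.9% vs. a chain of three components at 99.9% each.
single = 0.999
chain = composite_availability([0.999, 0.999, 0.999])
print(round(downtime_minutes(single), 1))  # 43.2 minutes per 30 days
print(round(downtime_minutes(chain), 1))   # 129.5 minutes per 30 days
```

The longer the dependency chain, the more aggressive each individual SLA has to be to hit the same end-to-end target.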
30. A note on vendors
● Your system doesn’t scale if your vendors don’t scale.
● Get SLAs and incident response plans into your contracts
● Build a strong relationship with vendors before the next
big scaling emergency.
● Do regular build-vs-buy and platform analyses, and
understand the cost of switching in case you need to
31. Scaling Incident Response Plans
● What to do before, during and after a scaling incident
● Who to call, what to check, what decisions to make
● Hot backup failover plans for in-house systems
● Static or stopgap backups for vendor systems.
32. Granular Autoscaling
● Fast reaction time is key
● Breakout virality will have a 100x scaling impact within
minutes, not hours
● The user action curve plays out on the order of minutes
● We can’t afford to lose 15 minutes waiting for autoscaling to kick in
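A back-of-the-envelope sketch of what autoscaling lag costs during a 100x surge. Every number here (baseline traffic, capacities, surge length) is made up for illustration, and `unserved_requests` is a hypothetical helper, not part of any autoscaling product.

```python
def unserved_requests(traffic_per_minute, base_capacity, scaled_capacity, lag_minutes):
    """Sum the requests above capacity, minute by minute.

    Capacity stays at `base_capacity` until `lag_minutes` have elapsed,
    then jumps to `scaled_capacity`.
    """
    dropped = 0
    for minute, demand in enumerate(traffic_per_minute):
        capacity = base_capacity if minute < lag_minutes else scaled_capacity
        dropped += max(0, demand - capacity)
    return dropped


# Illustrative surge: 2 quiet minutes, then 100x traffic for 28 minutes.
surge = [1_000] * 2 + [100_000] * 28

print(unserved_requests(surge, 2_000, 120_000, lag_minutes=15))  # 1274000
print(unserved_requests(surge, 2_000, 120_000, lag_minutes=2))   # 0
```

Under these assumptions, a 15-minute reaction time turns away well over a million requests, while a 2-minute reaction time turns away none.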
33. Granular Autoscaling
● Consider microservices for scaling bottlenecks:
spinning up additional containers is much faster than
booting up additional virtual machines
● It’s often cheaper: the per-invocation cost of handling a
traffic surge is 10% of the cost of dedicated hardware
during the scaling period
34. Granular Autoscaling
● Scaling response plan should include all distributed
systems scaling levers:
○ Quickly add servers (or containerized capacity)
○ Just-in-time upgrade hardware
○ Enable additional caching
○ Queue up bursts of writes to process later
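One way the last lever, queueing bursts of writes, could look in practice. This is a minimal in-process sketch using only the Python standard library; the `WriteBuffer` class, its parameters, and the RSVP records are hypothetical, and a production system would more likely use a durable message queue.

```python
import queue
import threading


class WriteBuffer:
    """Accept writes immediately, persist them later on a background thread."""

    def __init__(self, persist, max_pending=10_000):
        self._persist = persist  # callable that does the slow write
        self._pending = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, record) -> bool:
        """Enqueue a write; return False instead of blocking when full."""
        try:
            self._pending.put_nowait(record)
            return True
        except queue.Full:
            return False

    def _drain(self):
        while True:
            record = self._pending.get()
            try:
                self._persist(record)
            except Exception:
                # Sketch only: a real system would retry or dead-letter here.
                pass
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until every accepted write has been handed to persist."""
        self._pending.join()


# Usage: a list stands in for the database during this illustration.
saved = []
buffer = WriteBuffer(persist=saved.append)
for i in range(100):
    buffer.submit({"rsvp_id": i})
buffer.flush()
print(len(saved))  # 100
```

The bounded queue is a deliberate choice: when the buffer fills, `submit` reports failure so the caller can shed load or show a retry message rather than exhaust memory.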
35. Don’t forget the CAP theorem
● Consistency, Availability and Partition Tolerance: pick 2
(partitions will happen, so the real choice is usually
between consistency and availability)
● Analyze your architecture ahead of the scaling incident
and map out the choices to make in the event of loss of
data consistency, component availability, and network
partitioning
● Include this in your scaling incident response plan, and be
prepared to make hard choices
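As one example of a choice worth mapping out ahead of time, the sketch below favors availability over consistency on a read path: when the primary store is unreachable, it serves the last known value for a bounded time. `StaleTolerantReader`, its parameters, and the 5-minute staleness limit are all hypothetical; the opposite choice (fail the request rather than show stale data) is equally valid for other components.

```python
import time


class StaleTolerantReader:
    """Prefer fresh data, but serve the last known value if the store is down."""

    def __init__(self, fetch, max_stale_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch            # callable that reads the primary store
        self._max_stale = max_stale_seconds
        self._clock = clock
        self._cache = {}               # key -> (value, fetched_at)

    def get(self, key):
        """Return (value, "fresh") or (value, "stale"); re-raise if neither works."""
        try:
            value = self._fetch(key)
        except ConnectionError:
            if key in self._cache:
                cached, fetched_at = self._cache[key]
                if self._clock() - fetched_at <= self._max_stale:
                    return cached, "stale"
            raise
        self._cache[key] = (value, self._clock())
        return value, "fresh"
```

Returning the "fresh" or "stale" label lets the caller decide what to show the user, for example a banner noting that event counts may be a few minutes old.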
36. Conclusion
● Big social media companies have changed the attention
economy, and what “going viral” means
● Yet, influencers can still create organic viral behavior
● Traffic surges happen in O(minutes) instead of O(hours)
● Scale planning is harder
● Scale planning is also key: monitor everything, create
scaling emergency response plans, get granular with autoscaling