The World Is On Fire And So Is Your Website
1. The World Is On Fire And
So Is Your Website
Architecting systems for extremely bursty web
traffic driven by the news cycle
Ann Lewis, CTO @ MoveOn
@ann_lewis
3. What is MoveOn?
● Grassroots campaigning
● Fighting for social justice, progressive policies,
progressive candidates
● A community of millions of progressives in all 50 states
4. What is MoveOn?
● Small, scrappy, fully-distributed team
● Nationally impactful programs powered by tech tools and
data
● A complex ecosystem of 30+ websites and tools that need
to scale on a nonprofit budget
5. Who am I?
● MoveOn’s CTO since 2015
● Software engineer and technical leader for 15+ years
● Alum of Carnegie Mellon, Amazon, Rosetta Stone, handful
of startups, consulting companies
● Excited about building tech that powers collective action
6. Agenda
● The new attention economy
● Story: a protest goes viral
● The tech behind mass mobilization infrastructure
● How to scale a complex system architecture in the new
attention economy, on a nonprofit budget
7. A walk down memory lane
Show of hands:
Who remembers Slashdot?
Who remembers the internet before
big social media?
9. The “Slashdot” effect of the 90s
A massive surge of web traffic that occurs when a popular website
links to a smaller website.
10. The attention economy
● As the volume of information and
news grows, attention becomes a
scarce resource
● All content publishers compete for
this aggregate attention
● Social media platforms attempt to
control engagement around viral
content
[Diagram labels: Content, Attention]
11. The attention economy evolves
● Previous generation:
○ Social news sites like Slashdot aggregated attention
○ Virality happened via cumulative direct user actions, like
upvoting
● Today:
○ Dominance of social media platforms
○ Virality is controlled by the platforms, who make the rules
around who sees what, when and why
12. Feedback Loops
The news cycle is a dumpster fire, and social media feedback
loops are very effective at quickly amplifying the most inflammatory
content to virality.
13. Oligarchy?
● America’s economic oligarchy: over the last
generation, a small number of people have grown
richer while middle- and working-class wages
have stagnated
● On most social media platforms, 0.1% of users
have > 100K followers, and 2% have 10K-100K
followers
● Almost everyone else has 700 followers or fewer
● Social media is an oligarchy too!
14. Influencers
● Influencers: social media users with > 100K
followers
● Micro-influencers: social media users with
10K-100K followers
● Influencers control the nature of virality in
today’s attention economy
● Yes, your favorite Gen Z-er was right about
becoming an Instagram influencer
16. No One Is Above The Law
● Nov 6: US election day. Everyone working on elections is
proud and exhausted. Highest turnout for a midterm since
1914!
● Nov 7, 2:40pm: Trump crosses a Mueller investigation
“red line”: fires Jeff Sessions and replaces him with a loyalist
● Nov 7, 5:10pm: ‘Trump Is Not Above the Law’ protest
coordination network launches
17. Trump Is Not Above the Law
● Nov 7 5:10pm: Protest hub website lists 700 protest
events nationwide, 400K people RSVPed
● Nov 7, 7pm: Protest call-to-action gets tens of thousands
of retweets, we observe moderate surges of traffic
● Nov 7, 9pm: Influencer Rachel Maddow mentions protest
website on evening show, traffic surges to 3.5MM views,
site falls over (but quickly comes back up)
19. Trump Is Not Above The Law
● 11/8/2018 12pm ET: Protest hub website has
accumulated ~1000 events nationwide and ~500K RSVPs.
That is 300 new events and 100K more RSVPs in 24
hours!
● 11/8/2018 5pm local time: Nationwide protests!
21. Key Technical Takeaways
● Today, the observed behavior of virality is
tightly controlled by the social media
platforms
● “Going viral” only means traffic surges if
the platforms decide it does.
● With a major exception: influencers can
still generate organic viral behavior
23. The Tech Behind Protest Networks
● Hub website: a database of protest events, protest prep
material content hub, event map and search tools
● Crowdsourced event creation: anyone can host a
protest
● Mobilization tools drive event creation and RSVPs: we
email, text, and buy targeted social media ads to find
people interested in nearby protest events
24. Stepping Up to Big Moments
● No one knows when the next big moment will happen
● We need to be able to react and launch quickly
● Massive scale is critical to impact
● ... all on a nonprofit budget!
25. Problems to Solve
● Can’t predict or control when content will go viral
● Can’t afford to maintain big company levels of tech
infrastructure all the time
● Our infrastructure = a complex 30+ entity ecosystem of
in-house and vendor tools. Scale testing complex
architecture is very time-consuming.
26. Monitoring and Measurement
● Monitoring is key: monitor everything,
through the architectural stack, including
vendor tools
● SLAs are key:
○ Aggressive SLAs for in-house tools
○ Observe vendor uptime and availability
○ Plan around cascading failures
EWarren has a plan.
Do you?
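One way to make "observe vendor uptime and availability" concrete is to record health-check results per service and compare the observed availability with the SLA target. The sketch below is illustrative only: the service names, the ProbeResult type, and the targets are assumptions, not MoveOn's actual monitoring setup.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    service: str  # hypothetical service name, in-house or vendor
    ok: bool      # True if the health check succeeded


def availability(results):
    """Return {service: fraction of successful probes}."""
    totals, successes = {}, {}
    for r in results:
        totals[r.service] = totals.get(r.service, 0) + 1
        successes[r.service] = successes.get(r.service, 0) + (1 if r.ok else 0)
    return {s: successes[s] / totals[s] for s in totals}


def sla_breaches(results, targets):
    """Return the services whose observed availability is below target.

    `targets` maps service name -> required availability (assumed values).
    Services without a target are ignored.
    """
    observed = availability(results)
    return sorted(
        s for s, frac in observed.items() if s in targets and frac < targets[s]
    )
```

Feeding the same function probes for both in-house and vendor tools keeps the whole stack on one availability report, which is the point of monitoring "through the architectural stack".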
27. Vendors
● Your system doesn’t scale if your vendors don’t scale.
● Get SLAs and incident response plans into your contracts
● Build a strong relationship with vendors before the next
big scaling emergency.
● Do regular build-vs-buy and platform analyses, and
understand the cost of switching in case you need to
28. Scaling Incident Response Plans
● What to do before, during and after a scaling incident
● Who to call, what to check, what decisions to make
● Hot backup failover plans for in-house systems
● Static or stopgap backups for vendor systems.
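The failover ideas above can be sketched as an ordered chain: try the primary, then the hot backup, and finally serve a static stopgap. This is a minimal sketch under assumed names (serve, the handler callables, and the fallback string are all hypothetical), not a description of any particular production setup.

```python
def serve(request_path, backends, static_fallback):
    """Try each backend in order; fall back to static content if all fail.

    `backends` is a list of (name, handler) pairs, e.g. primary first and
    hot backup second. Each handler takes a path and returns a response,
    raising an exception when it is unavailable.
    """
    for name, handler in backends:
        try:
            return name, handler(request_path)
        except Exception:
            # A real system would log and alert here before moving on.
            continue
    return "static", static_fallback
```

Returning the name of the backend that answered makes it easy to see, during an incident, how much traffic is being served from the backup or the static page.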
29. Granular Autoscaling
● Fast reaction time is key
● Breakout virality will have a 100x scaling impact within
minutes, not hours
● The user action curve plays out on the order of minutes
● We can’t afford to lose 15 minutes waiting for autoscaling to kick in
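To see why reaction time matters, a toy model helps. The sketch below assumes demand grows geometrically from a baseline to a peak multiple over a few minutes, and that capacity stays at the baseline until the autoscaler reacts, after which it matches demand. Every parameter is an illustrative assumption, not measured data.

```python
def unserved_requests(base_rps, peak_multiplier, ramp_minutes,
                      scaling_delay_minutes, total_minutes):
    """Estimate requests dropped while capacity lags behind a surge.

    Assumed model: demand ramps geometrically from base_rps to
    base_rps * peak_multiplier over ramp_minutes, then stays flat.
    Capacity equals base_rps until scaling_delay_minutes have passed,
    then matches demand exactly.
    """
    dropped = 0.0
    for minute in range(total_minutes):
        progress = min(minute / ramp_minutes, 1.0)
        demand = base_rps * peak_multiplier ** progress
        capacity = base_rps if minute < scaling_delay_minutes else demand
        dropped += max(demand - capacity, 0.0) * 60  # requests per minute
    return dropped
```

Comparing a 1-minute delay with a 15-minute delay for the same 100x surge shows the shortfall growing with every minute of lag, which is the argument for granular, fast-reacting scaling.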
30. Granular Autoscaling
● Consider microservices for scaling bottlenecks:
spinning up additional containers is much faster than
booting up additional virtual machines
● It’s often cheaper: the per-invocation cost of handling a
traffic surge is 10% of the cost of dedicated hardware
during the scaling period
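A container-level scaling decision can be as simple as a proportional rule: divide observed load by what one container can handle, add headroom, and clamp to limits. The sketch below is an assumption-laden illustration (the function name, headroom factor, and limits are hypothetical), similar in spirit to the rules horizontal autoscalers apply.

```python
import math


def desired_replicas(current_rps, rps_per_replica,
                     headroom=1.5, min_replicas=2, max_replicas=200):
    """Return how many containers to run for the observed request rate.

    headroom > 1 over-provisions so the next burst lands on spare capacity.
    min_replicas and max_replicas are assumed cost and safety limits.
    """
    needed = math.ceil(current_rps * headroom / rps_per_replica)
    return max(min_replicas, min(needed, max_replicas))
```

Because containers start quickly, a rule like this can be re-evaluated every few seconds, so capacity tracks the surge closely instead of stepping up in large, slow virtual machine increments.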
31. Granular Autoscaling
● Scaling response plan should include all distributed
systems scaling levers:
○ Quickly add servers (or containerized capacity)
○ Just-in-time upgrade hardware
○ Enable additional caching
○ Queue up bursts of writes to process later
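Two of the levers above, additional caching and queued writes, can be sketched in a few lines. This is a minimal in-memory illustration under assumed names (TTLCache, WriteBuffer, persist_batch are hypothetical). A production system would more likely use a shared cache and a durable queue, since an in-memory buffer loses pending writes if the process dies.

```python
import time
from collections import deque


class TTLCache:
    """Cache values for ttl_seconds so repeated reads skip the database."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no backend call
        value = loader(key)
        self._store[key] = (now + self.ttl, value)
        return value


class WriteBuffer:
    """Accept writes immediately and persist them later in batches."""

    def __init__(self, persist_batch, batch_size=100):
        self.persist_batch = persist_batch  # callable taking a list of records
        self.batch_size = batch_size
        self._pending = deque()

    def submit(self, record):
        self._pending.append(record)

    def drain_once(self):
        """Persist up to batch_size pending records; return how many."""
        batch = []
        while self._pending and len(batch) < self.batch_size:
            batch.append(self._pending.popleft())
        if batch:
            self.persist_batch(batch)
        return len(batch)
```

Even a cache lifetime of a few seconds can absorb most reads for a page that everyone is hitting at once, and the write buffer lets the site acknowledge an RSVP right away while the database catches up at its own pace.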
32. Don’t Forget the CAP Theorem
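● Consistency, Availability and Partition Tolerance: pick 2.
In practice network partitions will happen, so the real choice
during one is between consistency and availability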
● Analyze your architecture ahead of the scaling incident
and map out the choices to make in the event of loss of
data consistency, component availability, and network
partitioning
● Include this in your scaling incident response plan, and be
prepared to make hard choices
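One of those hard choices can be written down ahead of time as an explicit switch: when the primary data store is unreachable, either serve a possibly stale value (favoring availability) or refuse the read (favoring consistency). The sketch below is hypothetical throughout: the function, the replica_cache dictionary, and the error type are illustrative names, not part of any real library.

```python
class ReadUnavailableError(Exception):
    """Raised when a consistent read is required but the primary is down."""


def read_rsvp_count(event_id, primary, replica_cache, prefer_availability):
    """Read a value, deciding explicitly what to do during a partition.

    `primary` is a callable that raises ConnectionError when unreachable.
    `replica_cache` is a dict of last-known values that may be stale.
    Returns (value, "fresh") or (value, "stale").
    """
    try:
        return primary(event_id), "fresh"
    except ConnectionError:
        if prefer_availability and event_id in replica_cache:
            return replica_cache[event_id], "stale"
        raise ReadUnavailableError(f"cannot read event {event_id} consistently")
```

For a public RSVP counter a stale number is probably acceptable, while for something like payment state it usually is not. Recording that decision per component in the incident response plan avoids debating it in the middle of a surge.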
33. Conclusion
● Big social media companies have changed the shape of
the attention economy
● Social media is an oligarchy, and influencers win
● Traffic surges happen in O(minutes) instead of O(hours)
● Scale planning is harder
● Scale planning is also key: monitor everything, create
scaling emergency response plans, get granular