17. Ok, so let's try this!
[Architecture diagram: the hitbox UI connects through Nginx to Node.js frontend servers, which talk to Node.js backend servers; a Redis cluster serves as data storage, and a PHP REST API connects the chat to the rest of hitbox. Both server tiers auto-scale. Open points marked on the slide: WebSocket load balancing, and a permission and security model (admin, mods, ...). Average roundtrip per message: < 300 ms.]
18. Frontend Server
• Small, cheap machines
• Handle the connections, no logic
• When one breaks, it breaks only for a few users
• Automatic failover to another chat frontend server
• Socket.io for handling WebSockets
• Carrier for sending messages between front & back
• Up- and downscaling possible as needed
19. Backend Server
• Small, cheap machines
• Handle all the logic
• Stateless, can be restarted/upgraded at any time
• Easily expandable with new features
• Up- and downscaling possible as needed
• Load balancing via round robin
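The round-robin balancing from the last bullet can be sketched in a few lines of Node.js (the names here are illustrative, not hitbox's actual code):

```javascript
// Minimal round-robin dispatcher: each call returns the next backend
// in turn, wrapping around at the end of the list.
function makeRoundRobin(backends) {
  let i = 0;
  return function next() {
    const backend = backends[i % backends.length];
    i += 1;
    return backend;
  };
}

// Usage: a frontend would pick a backend like this for each message batch.
const nextBackend = makeRoundRobin(['backend-1', 'backend-2', 'backend-3']);
```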
20. Redis
• Fast
• I mean, REALLY fast!
• You can cluster it
• Easy to back up
25. Ok, let's fix WebSockets
[Same architecture diagram as before (UI → Nginx → Node.js frontend servers → Node.js backend servers, Redis cluster as data storage, PHP REST API, auto-scaling on both tiers), now extended with a Node.js long-polling fallback server. The open points remain WebSocket load balancing and the permission and security model (admin, mods, ...).]
34. Load Balancing
• Frontend servers report their CPU load every 10 seconds
• The X least-loaded frontend servers are sent to the UI
• The UI selects a frontend server randomly from this list
• If the UI gets disconnected, it removes that server from the list
• The UI tries another frontend server
• If no servers are left, the UI gets X new frontend servers from the API
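The client-side part of these steps can be sketched like this (class and function names are made up for illustration; `getServersFromApi` stands in for the API call that returns the X least-loaded frontends):

```javascript
// Client-side picker for the load-balancing scheme described above.
class ChatServerPicker {
  constructor(getServersFromApi) {
    this.getServersFromApi = getServersFromApi;
    this.servers = [];
  }
  pick() {
    if (this.servers.length === 0) {
      // No candidates left: ask the API for X fresh frontend servers.
      this.servers = this.getServersFromApi();
    }
    // Random choice, so a mass reconnect (everyone pressing F5 at once)
    // spreads evenly instead of stampeding a single server.
    return this.servers[Math.floor(Math.random() * this.servers.length)];
  }
  markDead(server) {
    // On disconnect, drop the server and let pick() try another one.
    this.servers = this.servers.filter(s => s !== server);
  }
}
```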
That's me, 1980/81, with my first computer. Anyone know which computer it is? I studied arts, lived in New York and Berlin, founded startups and crashed startups.
What is hitbox? This is the front page.
This is a streamer: he plays games and streams them. Most streamers are also entertainers, making money with advertising and subscriptions.
6 million uniques/month, number 2 in the world.
Sounds easy, right? It has existed for 30 years.
So, how hard can it be?
Lots of things to do! And that's just the beginning!
Most important is realtime: you write something, and everyone else should see it as fast as possible.
For example, he dances (and lost 20 kg this way) and people cheer him on in the chat.
So back to the chat. IRC is a protocol that has been in use for 30 years; we wanted to make something new, something modern, something without netsplits, etc.
We started with this because our backend is already in PHP. Let's see if this works out!
Easy setup:
And MySQL as the database.
Well, these two sentences already tell you all of the problems...
Imagine a "long-running PHP process to serve multiple WebSocket connections".
It worked for up to 2,000 connections. Not very scalable!
So back to the drawing board. We wanted something modern, so let's use modern software!
We went with Node.js and Redis. Anyone here have experience with Node.js servers?
We use a two-way setup:
frontend servers, backend servers, and Redis as data storage. If we lose the Redis data, we just lose who is in which chatroom; press F5 and you are back in.
We use AWS
Single-core machines.
Same machines as for the frontends.
I can only recommend it; I never saw a Redis instance fail (except for getting slow).
So, looks like a perfect system. Let's code it!
We did and...
it worked!
So we could party!
Not so fast!
There we had our first problem, with something everyone should support.
It's a fucking standard!
But there are firewalls that block it, there are mobile devices that block it, or even worse, tell you that a WebSocket connection is working when it isn't. They just lie to you!
0.5-1% have this problem, but they were emailing us like hell...
0.5-1%
So we had to use fallback servers for long polling. Long polling means a lot of overhead from the HTTP protocol, so these servers can handle only 1/10 of what a normal frontend server can. But it works!
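The core of a long-polling fallback looks roughly like this (a self-contained sketch, not hitbox's code): the server parks each request until a message arrives or a timeout fires, and that one-request-per-delivery cycle is exactly where the HTTP overhead comes from.

```javascript
// One queue object per parked client. If messages are already buffered,
// respond immediately; otherwise hold the request open until push() or
// the timeout fires.
function longPoll(queue, timeoutMs, respond) {
  if (queue.messages.length > 0) {
    respond(queue.messages.splice(0)); // flush everything buffered
    return;
  }
  const timer = setTimeout(() => {
    queue.waiter = null;
    respond([]); // empty response; the client reconnects right away
  }, timeoutMs);
  queue.waiter = msgs => { clearTimeout(timer); respond(msgs); };
}

function push(queue, msg) {
  if (queue.waiter) {
    const waiter = queue.waiter;
    queue.waiter = null;
    waiter([msg]); // deliver to the parked request
  } else {
    queue.messages.push(msg); // buffer until the next poll
  }
}
```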
So we thought we could party again.
Well, the hitbox audience is young, so they try a lot... You wouldn't imagine how often we get DDoSed or how often people try to abuse the API...
And last year, someone managed it:
It was during the biggest event ever at that time, 60k people on one stream, and suddenly all of them saw this.
And we did this!
Well, they didn't manage to break our system or steal any user data. The only thing they did was insert some JavaScript into the "nameColor" field, and we didn't validate it. We validated everything else, but not this one, because it is only a number...
So:
Really, everything. Really, really everything!
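The fix is the boring one: whitelist-validate every field, even the "it's only a number" ones. A sketch of what a strict nameColor check could look like (illustrative, not the actual hitbox code):

```javascript
// A hex color like "9B2D0F" passes; anything else, including an injected
// "<script>...</script>" payload, is rejected before it ever reaches HTML.
function isValidNameColor(value) {
  return typeof value === 'string' && /^[0-9a-fA-F]{6}$/.test(value);
}
```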
Again, we thought we could party!
But... then others came and did this:
A WebSocket DDoS! Sending massive amounts of join commands to the chat.
So we had to think about how we could distribute this load better, or make it harder for them to reach all frontend servers. Remember, those servers are scaling up and down automatically.
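One standard defence against such a join flood (a general technique, not something the talk spells out) is a per-connection token bucket: each connection gets a small burst allowance and is then throttled to a steady rate.

```javascript
// Token bucket: each connection may send `capacity` joins in a burst,
// then is throttled to `refillPerSec` joins per second.
// The injectable clock (`now`) is just for testability.
function makeBucket(capacity, refillPerSec, now = Date.now) {
  let tokens = capacity;
  let last = now();
  return function allow() {
    const t = now();
    tokens = Math.min(capacity, tokens + ((t - last) / 1000) * refillPerSec);
    last = t;
    if (tokens >= 1) { tokens -= 1; return true; }
    return false; // drop or delay the join command
  };
}
```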
So this is how we do load balancing on the frontend servers; it works really well.
If they DDoS a few servers, those servers will not get new connections, and from the upscaling we get new servers that are not DDoSed.
Why the random factor in the UI? F5. More on this later.
So once again we party hard!
Until he came:
Rezigiusz, a Polish YouTuber and streamer with a lot of fans that love to type.
Think of it as the One Direction of Poland.
When he is streaming he has around 1-15k viewers, and they type 2,000 messages a second into the chat!
1,995 of those get blocked, but the backend servers still have to check them all...
So the event loop of Node.js exploded...
But using async.js, which is a great tool for queuing work, we could clean up the event loop, delaying some messages a few milliseconds while keeping the main tasks working fine.
So, for example, we made queues for the most important functions: login, logout, chatmsg, etc.
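The idea behind those async.js queues, stripped to its core (a plain-Node sketch, not the real code): push each incoming message into a queue per message type and drain a bounded number of tasks at a time, yielding back to the event loop between tasks.

```javascript
// Tiny work queue in the spirit of async.queue: at most `concurrency`
// tasks run at once; setImmediate yields control back to the event loop
// between tasks, so I/O and timers keep flowing even under a flood.
function makeQueue(worker, concurrency) {
  const tasks = [];
  let running = 0;
  function drain() {
    while (running < concurrency && tasks.length > 0) {
      running += 1;
      const task = tasks.shift();
      setImmediate(() => {
        worker(task, () => { running -= 1; drain(); });
      });
    }
  }
  return {
    push(task) { tasks.push(task); drain(); },
    length() { return tasks.length; },
  };
}
```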
So, we can party again!
But don't forget one of the biggest problems you can run into...
I know this sounds stupid, but I will give you two examples:
Imagine you have a stream with 100k viewers. Every time a new viewer comes to this stream, he or she gets the info about how to get the stream from our server.
Now imagine the streamer has a problem: let's say his computer crashes and the stream drops, meaning it goes black or gets stuck.
What do 100k people do?
This.
And let's hope that your API can handle this!
And they won't stop until they have a stream again!
We learned a lot about caching; otherwise you cannot handle this. Memcache and Redis are your friends here.
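Cache-aside is what saves you here. A minimal sketch (a Map stands in for memcache/Redis, and the loader stands in for the expensive DB/API call; names are illustrative):

```javascript
// On a mass F5, only the first request per key within the TTL hits the
// loader; everyone else gets the cached copy.
function makeCache(loader, ttlMs, now = Date.now) {
  const store = new Map();
  return function get(key) {
    const hit = store.get(key);
    if (hit && now() - hit.at < ttlMs) return hit.value; // cache hit
    const value = loader(key); // the expensive part
    store.set(key, { value, at: now() });
    return value;
  };
}
```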
The second example is stupid software design:
It is quite common that streamers announce when they will start to stream, and then people are already waiting on the page for them to go online.
Well, we have the chat connected anyway, so why not send a special message over the chat to trigger the start of the stream...
Sounds easy, and for our system it is.
But then again you will DDoS yourself. Imagine this with 100k people waiting...
So sometimes realtime is really bad, because it is realtime... and it can destroy you.
So we went back to the good old polling interval, because then you distribute the 100k connections over 30 seconds, giving you much more time to handle the load.
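A common way to get that spread (an illustrative sketch, assuming a ~30 s window): add random jitter to each client's polling delay, so 100k clients never line up on the same second.

```javascript
// Each client waits baseMs plus a random slice of jitterMs before the
// next poll, spreading the requests across the whole window.
function nextPollDelay(baseMs, jitterMs, rand = Math.random) {
  return baseMs + Math.floor(rand() * jitterMs);
}

// e.g. setTimeout(poll, nextPollDelay(25000, 10000)) polls every 25-35 s.
```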
So, we can party again!
The same guy as at the beginning; he has his own website with animated GIFs.
Well, at the end, something that is very important to me: monitor everything!
Our Swiss Army knife is statsd from Etsy, a great piece of software written in Node that collects metrics via UDP and works great.
We use it in combination with Graphite and monitor really everything.
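The statsd wire format is just plaintext over UDP, "name:value|type", which is why it is so cheap to emit from anywhere. A sketch of formatting the three common metric types (the metric names and hostname are made up):

```javascript
// type: 'c' = counter, 'ms' = timing, 'g' = gauge
function statsdLine(name, value, type) {
  return name + ':' + value + '|' + type;
}

// Sending is a one-liner with Node's dgram module, e.g.:
// require('dgram').createSocket('udp4')
//   .send(statsdLine('chat.messages', 1, 'c'), 8125, 'statsd.internal');
```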
See the down-spike on active chat connections? That is when Node is not able to keep the 10-second timing for reporting the stats. You get used to it.
Well, and at the end: is the chat system working? Does it scale?
Well, I don't have a screenshot of our latest record, which was close to 200k, but this one shows you a channel with 100k people.
All 154k connections were handled by 16 frontend servers and 8 backend servers, costing us around $20 for the evening.
And dont forget the network traffic!
Around 160-200 Mbit per machine, outgoing, text only! These cheap machines are limited to around 200 Mbit.
That's it. Thank you!