Anomaly Detection using ML in Elisa Viihde CDN

Jere Nieminen
Service Architect – Elisa
Jere is an experienced architect specializing in video streaming technologies. He currently works on making video streaming as smooth as possible for Elisa Viihde customers.

Published in: Technology

  1. Anomaly Detection using ML in Elisa Viihde CDN
     Jere Nieminen, 13.12.2018 (page dated 19.12.2018)
     Elisa and Elisa Viihde
     • Elisa
       • Telecommunications, ICT and digital service company operating mainly in Finland and Estonia
       • Over 2.8 million customers who have over 6.2 million subscriptions
     • Elisa Viihde
       • Finland’s most popular entertainment service
       • Several original series and exclusive distribution rights for certain movies and series
       • Linear TV channels, Network PVR, Catchup, TVOD/SVOD/EST
       • More than 300 000 household subscribers
  2. Elisa Viihde CDN
     • Features focused on
       • Streaming Video
       • Cache/Network Optimization
     • Team with 6 members focused on
       • SW Development and integrations
       • Daily operations
       • QoS and QoE
     • [Figure: High Level Architecture]
     Background - Elastic Stack 101
     • Elasticsearch
       • JSON data store with a RESTful API
     • Beats & Logstash
       • Ingest data into Elasticsearch
     • Kibana
       • Search and visualize data in Elasticsearch
     • Machine Learning (X-Pack)
       • Anomaly Detection
  3. Terminology
     • Anomaly
       • A deviation from normal behaviour
     • Machine Learning
       • Making predictions or decisions without being explicitly programmed to perform the task
     • Unsupervised Anomaly Detection
       • Searching for the instances that fit least well with the rest of an unlabeled data set, under the assumption that most of the data is normal
       • In our use case, we let the machine learn from the data and detect anomalies, but do not allow it to carry out any "smart" tasks related to them
     History of Detecting Streaming Issues
     • [Timeline figure, 2016 to 2018 by quarter, with six numbered milestones. Labels shown: Early days, Logging Trials, Elastic with Access Logs, Streaming Session, Notifications, Anomaly Detection Trials, Anomaly Detection in Action]
  4. Visual Dashboard - Incorrect caching configuration
     • [Figure annotations: Increasing daily error rate, Fix deployed, Reaction time]
     ML Detection Example - Broken Content
     • Fragmented MP4 asset
       • 1920x1080@7Mbps
       • 1280x720@4.5Mbps
       • 1024x576@2Mbps
       • 640x360@800kbps
       • 480x270@300kbps
     • [Figure labels: Timeline, Timecode drift, ML Job Config]
  5. ML Detection Example - Network Issue
     • [Figure label: ML Job Config]
     ML Detection Example - RR Performance
     • [Figure labels: ML Job Config, Production v1.0-52, Canary v1.0-53]
  6. Ask the Right Questions / Survivorship Bias
     • Based on the server-side metrics, can we answer the following questions?
       • Is the CDN performing well?
       • Are the clients getting the best quality of experience?
     • Image credit to Daniel G. Siegel https://www.dgsiegel.net/talks/the-bullet-hole-misconception
     ML Example - Anomalies from Client QoE data
  7. ML Example - How to get fooled
     • [Figure labels: Anomaly, New normal]
     Key Takeaways
     • Focus on the Data
       • Logs
         • Usually made for humans to read
         • Log also the successful events
         • Do all the tricks like split, parse etc. before storing
     • Logging vs. Monitoring
       • Needless battle
     • Manual thresholds are still not outdated
     • Creating ML jobs is easy, but…
       • Understanding the events is sometimes really hard
       • Process to investigate all the anomalies
       • Enhance the data set
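
The "How to get fooled" slide and the takeaway that manual thresholds are still not outdated can be illustrated with a small sketch. It is not the Elastic Stack ML algorithm used in the talk; it swaps in a simple rolling median / MAD detector as a stand-in for an unsupervised model that learns its baseline from recent data. The error-rate series, the function names, the window size and the threshold values are all hypothetical.

```python
from statistics import median


def rolling_mad_anomalies(values, window=12, k=5.0):
    """Return indices of points that deviate from the median of the previous
    `window` points by more than `k` scaled median absolute deviations.

    A simple stand-in for an unsupervised detector, not Elastic's ML model.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        center = median(history)
        mad = median(abs(v - center) for v in history)
        # 1.4826 scales the MAD to be comparable to a standard deviation for
        # normally distributed data; the floor avoids dividing by zero.
        scale = max(1.4826 * mad, 1e-9)
        if abs(values[i] - center) / scale > k:
            flagged.append(i)
    return flagged


def threshold_breaches(values, limit):
    """Return indices of points above a fixed, manually chosen limit."""
    return [i for i, v in enumerate(values) if v > limit]


# Hypothetical hourly error rate (%): a day around 1 %, then a sustained
# shift to around 5 % (for example after a bad configuration change).
baseline = [1.0, 1.1, 0.9, 1.0, 1.2, 0.8] * 4
shifted = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8] * 4
values = baseline + shifted

learned = rolling_mad_anomalies(values)
manual = threshold_breaches(values, limit=3.0)

print("learned-baseline detector flags:", learned)
print("manual threshold flags:", manual)
```

With this synthetic series the rolling detector flags only the first six points after the shift. Once its window is dominated by the higher values it treats them as the new normal and goes quiet, while the fixed 3 % limit keeps firing for as long as the error rate stays high. That is one reason to keep manual thresholds alongside learned baselines.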