Building a recommendation system for
IPTV on a fast streaming architecture
Iva Sorić
Tomislav Hlupić
About us
Focus on customer satisfaction
• LEADING DW/BI IMPLEMENTER IN SEE
• 200+ REALISED PROJECTS
• 90+ USERS IN 20 COUNTRIES
• 110+ EMPLOYEES
• 90+ CONSULTANTS/IMPLEMENTERS
• 70 TECHNICAL CONSULTANTS
• 20 BUSINESS CONSULTANTS
• 5 PROJECT MANAGERS
• OVER 600 MAN/YEARS OF EXPERIENCE IN LEADING
TECHNOLOGIES
Our business expertise fields
Innovative approach in the business decision support area
Strategic ICT consulting
• Analysis
• Design
• Development
• Implementation
• Support
• Education
• Introduction – Content delivery services & CX for content delivery services
• PI Content Analytics System – Business requirements & Technical solution
• Implementation of recommendation system as part of our Content Analytics System
• Conclusion
Agenda
Communication operators offer their consumers many
services for consuming video content, using
different fixed and mobile technologies through different devices,
either at home or on the move.
A big part of the offer is understanding the user's needs. In the
case of the operators, the data is available, but there are systems
built for this specific purpose.
Introduction
Analysis of consumers' behaviour in real time and over a
longer period of time gives operators the possibility to
• maximize revenue
• minimize costs
• Serve their customers better
• Reach the highest possible level of customer experience
For that reason, operators are using sophisticated
recommendation engines to propose to their consumers
content that may be interesting for them.
Introduction
• Enable consumers to watch broadcasts in real time only
• Include the following types of services:
• Digital Terrestrial TV
• Digital Satellite TV
• Digital Cable TV
• Usually the data can't be bound to a particular customer
Broadcasting services
• Use the Internet and the IP protocol for distribution of the content.
• Consumers can watch real-time broadcasts and historical broadcasts that are
stored on providers’ infrastructure.
• Includes the following types of services:
• Internet Protocol television (IPTV)
• Over the Top content services (OTT)
• Mobile TV (TV on the go)
Streaming services
IPTV vs. OTT
                     | IPTV                                 | Over-the-top technology
Content provider     | Local telecom                        | Studio, channel, or independent service
Transmission network | Local telecom - dedicated network    | Public Internet + local telecom
Receiver             | Local telecom provides (set-top box) | Purchased by consumer (TV, computer or mobile)
Display device       | Screen provided by consumer          | Screen provided by consumer
Analytical systems for content delivery services shall enable operators to:
• Understand consumers’ behavior, which content they consume, through which
channel at what time and on what device
• Analyze the performance of offered service packages and package options
• Use rating information to negotiate content with content providers
• Segment consumers based on their behavior
• Approach consumers with appropriate offers and recommendations
Business requirements
Architecture of the solution
Scalable architecture that can process up to hundreds of millions of records daily
[Architecture diagram. Labels: IPTV, Digital TV; Video on demand; OTT providers; Channels; Devices; Recommendations / Open API; Insights; ML / Predictive models]
• Solution supports both batch processing and real-time streaming
• Data can be loaded to Vertica by running a series of COPY statements, each of which
loads small amounts of data into Vertica database
• For real-time streaming, Kafka integration feature can be used to automatically load data
to Vertica database as it streams through Kafka channel
Technical solution – Flexible data ingestion
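The batch path above, where each COPY statement loads a small amount of data, can be sketched as follows. This is only an illustration of the micro-batching idea: the event tuples, the batch size and the helper names `micro_batches` and `to_csv_payload` are assumptions, and the actual COPY statement and database connection are deliberately left out.

```python
import csv
import io
from typing import Iterable, Iterator, List, Sequence


def micro_batches(rows: Iterable[Sequence], batch_size: int) -> Iterator[List[Sequence]]:
    """Yield consecutive lists of at most batch_size rows."""
    batch: List[Sequence] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


def to_csv_payload(batch: Iterable[Sequence]) -> str:
    """Serialize one batch as CSV text: the data one load statement would carry."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(batch)
    return buf.getvalue()


# Hypothetical viewing events: (user_id, item_id, seconds_watched)
events = [(1, "m10", 1200), (2, "m11", 300), (1, "m12", 45), (3, "m10", 5400), (2, "m13", 60)]

# One payload per small batch; each would be handed to its own COPY statement.
payloads = [to_csv_payload(b) for b in micro_batches(events, batch_size=2)]
```

Keeping the batches small bounds the amount of data each load carries, at the cost of running more statements.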
• Data enters Kafka as a message, typically in JSON or AVRO format
• A custom parser can be built for other data formats.
• Feeds of messages in a common category come together to form topics.
• Kafka divides the topics into partitions that can be fed in parallel to
Vertica target tables for further analysis.
Technical solution – Kafka integration feature
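A minimal sketch of the two ideas above: turning a JSON message into a typed row, and spreading keyed messages over partitions. The field names (`user_id`, `item_id`, `delivery_type`, `seconds_watched`) are hypothetical, and `partition_for` uses CRC32 purely as a stand-in hash; it is not Kafka's own partitioner, and no Kafka or Vertica client is involved.

```python
import json
import zlib
from typing import NamedTuple, Optional


class ViewEvent(NamedTuple):
    """One row of a hypothetical user-item interaction table."""
    user_id: int
    item_id: str
    delivery_type: str
    seconds_watched: int


def parse_message(raw: bytes) -> Optional[ViewEvent]:
    """Parse one JSON message; return None if it is malformed or incomplete."""
    try:
        doc = json.loads(raw)
        return ViewEvent(
            user_id=int(doc["user_id"]),
            item_id=str(doc["item_id"]),
            delivery_type=str(doc["delivery_type"]),
            seconds_watched=int(doc["seconds_watched"]),
        )
    except (ValueError, KeyError, TypeError):
        return None


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition (illustrative hash, not Kafka's)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


msg = b'{"user_id": 7, "item_id": "m10", "delivery_type": "VOD", "seconds_watched": 1200}'
event = parse_message(msg)
partition = partition_for(str(event.user_id), num_partitions=4)
```

Because the partition depends only on the key, all messages of one user land in the same partition, while different partitions can be consumed in parallel.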
Real-time usage reports and
dashboards
• Calculate different metrics, apply
filters
• Drill-down to fine granularity data
• Time series (trend) analysis
• Map view
Business Solution
Analytics
Detailed usage and behavior analytics
• Consumer
• Channel
• Content item
• Device
• Operating system
• Delivery type (live, catchup, VOD...)
• Action that was taken by user to view specific
content item
Predictive models for segmentation,
recommendations and cross & up-sell
Business Solution
Analytics
Recommender system for content
delivery
• Information overload problem
• Improve customer experience
• Increase revenue (cross-sell /
upsell)
• Approach consumers with
appropriate offers and
recommendations
Recommender systems for content delivery
Two main approaches:
• Content-based recommender systems – filter on profile information
• information from the customer profile (demographic data, answers from a suitable questionnaire), and
• information about the content and its attributes (e.g. for movies: genre, director, starring
actors, box office popularity)
• Collaborative filtering recommender systems – use interaction information
• explicit user ratings, or user interactions with content delivery platform
• make predictions (filtering) about the interests of a user by analysing preferences from many users
(collaborating)
• CF methods usually produce better results, but have one important disadvantage – the cold start problem
(they cannot make predictions for new users)
• Hybrid recommender systems
Approaches
Important issues for recommender system for content delivery are:
• Absence of rating results from users for delivered content items
• Same user account is used by one or more people in the same household, thus resulting in a
user behavior that’s the union of the behaviors of all household members
• Available items for recommendations are constantly changing
• Prices for content change over time
• Some additional rules shall be applied for particular users (e.g. filtering of adult-only content)
• Recommendations must be regenerated very often to keep them current for a
large number of users
Issues
Input data:
• Source data for the recommendations engine is stored
in the Vertica platform
• It is automatically refreshed and maintained
on a daily basis
• Two main tables:
  • user-item interaction table
  • items metadata
Recommender system as a part of PI Content
Analytics system
• For video-on-demand type of content on the IPTV platform we used
model-based collaborative filtering, where users and items are both
represented by a set of latent factors.
• Factors, or features, are inferred from the rating patterns and represent
characteristics of users and items that do not necessarily have to be
human-interpretable; they are implicitly present, computer-calculated dimensions
used to characterize users/items in the calculations.
• Matrix factorization techniques are
used to learn these factors
Chosen approach
• Spark ML, Python
• Vertica Connector for Apache Spark
• Spark uses ALS (alternating least squares) to minimize the squared error
on the set of known ratings in order to learn the latent factors.
• ALS works by alternating between fixing one side (users or items) and solving a least squares
problem on the other, and vice versa. These steps are performed
iteratively until convergence.
• Leverages parallelization
Tools and techniques
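The alternating scheme described above can be illustrated with a toy, single-factor (rank-1) ALS in plain Python. This is a teaching sketch under simplifying assumptions, not Spark ML's `ALS` implementation: each user and item gets one latent factor, the regularization weight `lam` and the sample ratings are made up, and nothing is distributed.

```python
from typing import Dict, List, Tuple

Ratings = Dict[Tuple[int, int], float]  # (user, item) -> known rating


def objective(r: Ratings, u: List[float], v: List[float], lam: float) -> float:
    """Regularized squared error on the set of known ratings."""
    err = sum((x - u[a] * v[b]) ** 2 for (a, b), x in r.items())
    return err + lam * (sum(x * x for x in u) + sum(x * x for x in v))


def solve_side(r: Ratings, fixed: List[float], n: int, lam: float,
               user_side: bool) -> List[float]:
    """Closed-form least squares for one side while the other side is fixed."""
    num = [0.0] * n
    den = [lam] * n  # lam > 0 also keeps the division safe for unseen rows
    for (a, b), x in r.items():
        mine, other = (a, b) if user_side else (b, a)
        num[mine] += x * fixed[other]
        den[mine] += fixed[other] ** 2
    return [p / q for p, q in zip(num, den)]


def als_rank1(r: Ratings, n_users: int, n_items: int,
              lam: float = 0.1, iters: int = 20):
    """Alternate user and item updates; record the objective after each sweep."""
    u = [1.0] * n_users
    v = [1.0] * n_items
    history = [objective(r, u, v, lam)]
    for _ in range(iters):
        u = solve_side(r, v, n_users, lam, user_side=True)   # items fixed
        v = solve_side(r, u, n_items, lam, user_side=False)  # users fixed
        history.append(objective(r, u, v, lam))
    return u, v, history


# Made-up known ratings for 3 users and 3 items.
ratings = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0,
           (1, 2): 2.0, (2, 1): 3.5, (2, 2): 2.5}
u, v, history = als_rank1(ratings, n_users=3, n_items=3)
predicted = u[1] * v[1]  # estimate for a (user, item) pair with no known rating
```

With the other side fixed, every user (or item) update is an independent small least squares problem, which is why the method parallelizes well.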
• IPTV platform does not provide explicit user ratings, so we used the usage data to estimate users'
preferences, i.e. to assign implicit ratings, based on the percentage of the show duration they
watched, by the following formula:
\[
r_u(i) =
\begin{cases}
5, & pct_u(i) \ge 1 \\
2 + 3 \cdot pct_u(i), & 0 < pct_u(i) < 1
\end{cases}
\]
r_u(i) = estimated rating of user u for item i (minimum rating = 1, maximum rating = 5)
pct_u(i) = fraction of show i that user u watched (0.8 means the user watched 80% of the show
duration)
• The more of a show the user watches, the greater the level of confidence in the preference estimate
• Implicit ratings are inferred from the users' activities
Implicit ratings
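The implicit rating formula can be written as a small function. The formula itself comes from the slide; the function names and the handling of a zero or negative watched fraction (returning `None`, i.e. no rating) are assumptions, since the slide does not define that case.

```python
from typing import Optional


def watched_fraction(seconds_watched: float, duration_seconds: float) -> float:
    """Fraction of the show that was watched (can exceed 1 on re-watching)."""
    return seconds_watched / duration_seconds


def implicit_rating(pct_watched: float) -> Optional[float]:
    """Implicit rating from the watched fraction, following the slide formula."""
    if pct_watched >= 1:
        return 5.0
    if pct_watched > 0:
        return 2.0 + 3.0 * pct_watched
    return None  # assumption: nothing watched means no rating is assigned


# A user who watched 48 minutes of a 60-minute show.
rating = implicit_rating(watched_fraction(48 * 60, 60 * 60))
```

For a watched fraction of 0.8 this gives 2 + 3 * 0.8 = 4.4, and the value approaches 5 continuously as the fraction approaches 1.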
• The recommender engine processing is scheduled and performed on a daily basis, during
the low activity periods
• The output of the processing is the top 10 recommendations for every user
• The results are stored in Vertica and fetched to display to the user as their top 10
recommendations while they are looking for something to watch (integration with IPTV
platform through API)
• To support real-time broadcasts, the recommendation table shall be extended
with a show start timestamp, so that the user can be offered the show that has not started
yet and will start soonest.
• Cold start problem – new users are recommended the most popular items
Storing results and generating recommendations
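The serving logic described above (top-N per user, a most-popular fallback for the cold start case, and picking the soonest not-yet-started show for live broadcasts) might look roughly like this. All data structures, names and sample values are hypothetical; in the described system the scores would come from the model output stored in Vertica.

```python
from datetime import datetime
from typing import Dict, List, Optional


def top_n(scores: Dict[str, float], n: int = 10) -> List[str]:
    """Highest-scoring items first; ties broken by item id for determinism."""
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


def recommend(user_id: int,
              predicted: Dict[int, Dict[str, float]],
              popularity: Dict[str, float],
              n: int = 10) -> List[str]:
    """Personal top-N, or the most popular items for users with no predictions."""
    user_scores = predicted.get(user_id)
    if not user_scores:  # cold start: unknown user or no scores yet
        return top_n(popularity, n)
    return top_n(user_scores, n)


def next_upcoming(recs: List[str],
                  start_times: Dict[str, datetime],
                  now: datetime) -> Optional[str]:
    """Among recommended shows, the one that has not started and starts soonest."""
    upcoming = [(start_times[i], i) for i in recs
                if i in start_times and start_times[i] > now]
    return min(upcoming)[1] if upcoming else None


# Hypothetical model output and view counts.
predicted = {1: {"a": 4.5, "b": 3.0, "c": 4.9}}
popularity = {"a": 100, "b": 250, "c": 10}
known_user = recommend(1, predicted, popularity, n=2)
new_user = recommend(99, predicted, popularity, n=2)

now = datetime(2020, 1, 1, 20, 0)
starts = {"a": datetime(2020, 1, 1, 19, 0),
          "b": datetime(2020, 1, 1, 21, 0),
          "c": datetime(2020, 1, 1, 20, 30)}
live_pick = next_upcoming(["a", "b", "c"], starts, now)
```

Additional per-user rules, such as filtering adult-only content, could be applied to the candidate scores before `top_n` is called.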
• Recommender systems are playing a very important role in improving
customer experience in many digital industries
• We designed and implemented a Content Analytics System with two main
features:
  • capability of fast processing of semi-structured and unstructured
  heterogeneous data originating from digital content delivery services
  • recommendation engine that continuously gathers data and generates
  recommendations
Conclusion
iva.soric@inteligencija.com
tomislav.hlupic@inteligencija.com

Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...Fun all Day Call Girls in Jaipur   9332606886  High Profile Call Girls You Ca...
Fun all Day Call Girls in Jaipur 9332606886 High Profile Call Girls You Ca...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
 
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Latur [ 7014168258 ] Call Me For Genuine Models We ...
 

Building a recommendation system for IPTV on a fast streaming architecture - Tomislav Hlupic, Iva Soric

  • 1. Building a recommendation system for IPTV on a fast streaming architecture Iva Sorić Tomislav Hlupić
  • 2. About us Focus on customer satisfaction • LEADING DW/BI IMPLEMENTER IN SEE • 200+ REALISED PROJECTS • 90+ USERS IN 20 COUNTRIES • 110+ EMPLOYEES • 90+ CONSULTANTS/IMPLEMENTERS • 70 TECHNICAL CONSULTANTS • 20 BUSINESS CONSULTANTS • 5 PROJECT MANAGERS • OVER 600 MAN/YEARS OF EXPERIENCE IN LEADING TECHNOLOGIES
  • 3. Our business expertise fields Innovative approach in the business decision support area Strategic ICT consulting • Analysis • Design • Development • Implementation • Support • Education
  • 4. • Introduction – Content delivery services & CX for content delivery services • PI Content Analytics System – Business requirements & Technical solution • Implementation of recommendation system as part of our Content Analytics System • Conclusion Agenda
  • 5. Communication operators offer their consumers many services that enable them to consume video content, using different fixed and mobile technologies through different devices, either at home or on the move. A big part of the offer is understanding the user’s needs, and in the case of the operators, the data is available, but there are systems built for this specific purpose. Introduction
  • 6. Analysis of consumers’ behaviour in real time and over a longer period of time gives operators the possibility to • maximize revenue • minimize costs • serve their customers better • reach the highest possible level of customer experience For that reason, operators are using sophisticated recommendation engines to propose to their consumers content that may be interesting for them. Introduction
  • 7. • Enable consumers to watch only broadcasts in real time • Include the following types of services: • Digital Terrestrial TV • Digital Satellite TV • Digital Cable TV • Usually the data can’t be linked to a particular customer Broadcasting services
  • 8. • Using Internet and IP protocol for distribution of the content. • Consumers can watch real-time broadcasts and historical broadcasts that are stored on providers’ infrastructure. • Includes the following types of services: • Internet Protocol television (IPTV) • Over the Top content services (OTT) • Mobile TV (TV on the go) Streaming services
  • 9. IPTV vs. OTT (over-the-top technology) • Content provider – IPTV: local telecom; OTT: studio, channel, or independent service • Transmission network – IPTV: local telecom, dedicated network; OTT: public Internet + local telecom • Receiver – IPTV: provided by local telecom (set-top box); OTT: purchased by consumer (TV, computer or mobile) • Display device – IPTV: screen provided by consumer; OTT: screen provided by consumer
  • 10. Analytical systems for content delivery services shall enable operators to: • Understand consumers’ behavior, which content they consume, through which channel at what time and on what device • Analyze the performance of offered service packages and package options • Use rating information to negotiate content with content providers • Segment consumers based on their behavior • Approach consumers with appropriate offers and recommendations Business requirements
  • 11. Architecture of the solution • Scalable architecture that can process up to hundreds of millions of records daily • Diagram labels: IPTV, Digital TV; Video on demand; OTT providers; Channels; Devices; Recommendations / Open API; Insights; ML / Predictive models
  • 12. • Solution supports both batch processing and real-time streaming • Data can be loaded into Vertica by running a series of COPY statements, each of which loads a small amount of data into the Vertica database • For real-time streaming, the Kafka integration feature can be used to automatically load data into the Vertica database as it streams through a Kafka channel Technical solution – Flexible data ingestion
  • 13. • Data enters Kafka as a message, typically in JSON or Avro format • A custom parser can be built for other data formats • Feeds of messages in a common category come together to form topics • Kafka divides topics into partitions that can be fed in parallel to Vertica target tables for further analysis Technical solution – Kafka integration feature
  • 14. Real-time usage reports and dashboards • Calculate different metrics, Apply filters • Drill-down to fine granularity data • Time series (trend) analysis • Map view Business Solution Analytics
  • 15. Detailed usage and behavior analytics • Consumer • Channel • Content item • Device • Operating system • Delivery type (live, catchup, VOD...) • Action that was taken by user to view specific content item Predictive models for segmentation, recommendations and cross & up-sell Business Solution Analytics
  • 16. Recommender system for content delivery
  • 17. • Information overload problem • Improve customer experience • Increase revenue (cross-sell / upsell) → Approach consumers with appropriate offers and recommendations Recommender systems for content delivery
  • 18. Two main approaches: • Content-based recommender systems – use profile information filtering • information from the customer profile (demographic data, answers from a suitable questionnaire), and • information about the content and its attributes (e.g. for movies: genre, director, starring actors, box office popularity) • Collaborative filtering (CF) recommender systems – use interaction information • explicit user ratings, or user interactions with the content delivery platform • make predictions (filtering) about the interests of a user by analysing preferences from many users (collaborating) • CF methods usually produce better results, but have one important disadvantage – the cold start problem (they cannot make predictions for new users) → Hybrid recommender systems Approaches
  • 19. Important issues for a recommender system for content delivery are: • Absence of user ratings for delivered content items • The same user account is used by one or more people in the same household, resulting in user behavior that is the union of the behaviors of all household members • Available items for recommendations are constantly changing • Prices for content change over time • Some additional rules shall be applied for particular users (e.g. filtering of adult-only content) • Recommendations must be created very often in order to have current recommendations for a large number of users Issues
  • 20. Input data: • Source data for the recommendations engine is stored in the Vertica platform • It is automatically refreshed and maintained on a daily basis • Two main tables: • user-item interaction table • item metadata Recommender system as a part of PI Content Analytics system
  • 21. • For video-on-demand content on the IPTV platform we used model-based collaborative filtering, where users and items are both represented by a set of latent factors. • Factors, or features, are inferred from the rating patterns and represent characteristics of users and items that do not necessarily have to be human-interpretable; they are implicitly present, computer-calculated dimensions used to characterize users/items in the calculations. • Matrix factorization techniques are used to learn these factors Chosen approach
  • 22. • Spark ML, Python • Vertica Connector for Apache Spark • Spark uses ALS (alternating least squares) to minimize the squared error on the set of known ratings in order to learn the latent factors. • ALS works by rotating between fixing one side and solving a least squares problem on the other, and vice versa. These steps are performed iteratively until convergence. • Leverages parallelization Tools and techniques
  • 23. • IPTV platform does not provide explicit user ratings, so we used the usage data to estimate users' preferences, i.e. to assign implicit ratings, based on the percentage of the show duration they watched, by the following formula: $r_u(i) = \begin{cases} 5, & \mathrm{pct}_u(i) \ge 1 \\ 2 + 3 \cdot \mathrm{pct}_u(i), & 0 < \mathrm{pct}_u(i) < 1 \end{cases}$ where $r_u(i)$ = estimated rating of user u to item i (minimum rating = 1, maximum rating = 5) and $\mathrm{pct}_u(i)$ = percentage of the show i the user u watched (0.8 means he watched 80% of the show duration) → The more he watches, the greater the level of confidence in his preference estimation → Implicit ratings are inferred from the users' activities Implicit ratings
  • 24. • The recommender engine processing is scheduled and performed on a daily basis, during low-activity periods • The output of the processing is the top 10 recommendations for every user • The results are stored in Vertica and fetched to display to the user as his top 10 recommendations while he is looking for something to watch (integration with the IPTV platform through an API) • For real-time broadcasts, the recommendation table shall be extended with the show start timestamp, to be able to propose to the user the show that has not started yet and will start soonest. • Cold start problem – most popular items recommendations Storing results and generating recommendations
  • 25. • Recommender systems are playing a very important role in improving customer experience in many digital industries • We designed and implemented a Content Analytics System with two main features: • capability of fast processing of semi-structured and unstructured heterogeneous data originating from digital content delivery services • recommendation engine that continuously gathers data and generates recommendations Conclusion
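The implicit-rating rule on slide 23 can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `implicit_rating`, the sample records, and the choice to return no rating when nothing was watched (a case the slide leaves undefined) are assumptions.

```python
from typing import Optional


def implicit_rating(pct_watched: float) -> Optional[float]:
    """Map the watched fraction of a show to an implicit rating.

    Piecewise rule from slide 23:
      rating = 5            if pct >= 1
      rating = 2 + 3 * pct  if 0 < pct < 1
    The slide does not define pct <= 0; returning None (no rating)
    for that case is an assumption of this sketch.
    """
    if pct_watched >= 1.0:
        return 5.0
    if pct_watched > 0.0:
        return 2.0 + 3.0 * pct_watched
    return None


# Hypothetical (user, item, watched fraction) records, for illustration only.
views = [("u1", "show_a", 0.8), ("u1", "show_b", 1.2), ("u2", "show_a", 0.0)]

# Build a sparse user-item rating table, skipping records with no rating.
ratings = {}
for user, item, pct in views:
    rating = implicit_rating(pct)
    if rating is not None:
        ratings[(user, item)] = rating
```

Because any started show gets a rating above two, the scale is compressed towards the top; that matches the reasoning in the notes that starting a show already signals interest.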

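Slide 22 describes ALS as rotating between fixing one side and solving a least squares problem on the other. The sketch below illustrates that alternation for the simplest case of a single latent factor per user and per item, in plain Python rather than Spark ML. The function name `als_rank1`, the sample ratings, and the L2 regularization weight `reg` are assumptions made for illustration; the presentation does not state the rank or regularization that was used.

```python
def als_rank1(ratings, n_iters=20, reg=0.1):
    """Rank-1 alternating least squares on a sparse rating dict.

    ratings: {(user, item): rating}
    Returns (user_factors, item_factors, objective_history).
    """
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    x = {u: 1.0 for u in users}  # one latent factor per user
    y = {i: 1.0 for i in items}  # one latent factor per item

    def objective():
        # Squared error on known ratings plus L2 penalty on the factors.
        err = sum((r - x[u] * y[i]) ** 2 for (u, i), r in ratings.items())
        pen = sum(v * v for v in x.values()) + sum(v * v for v in y.values())
        return err + reg * pen

    history = [objective()]
    for _ in range(n_iters):
        # Fix item factors; each user's factor has a closed-form solution.
        for u in users:
            obs = [(y[i], r) for (uu, i), r in ratings.items() if uu == u]
            x[u] = sum(yi * r for yi, r in obs) / (reg + sum(yi * yi for yi, _ in obs))
        # Fix user factors; solve for each item's factor the same way.
        for i in items:
            obs = [(x[u], r) for (u, ii), r in ratings.items() if ii == i]
            y[i] = sum(xu * r for xu, r in obs) / (reg + sum(xu * xu for xu, _ in obs))
        history.append(objective())
    return x, y, history


# Hypothetical implicit ratings, for illustration only.
sample = {
    ("u1", "a"): 5.0, ("u1", "b"): 4.4,
    ("u2", "a"): 4.0, ("u2", "c"): 2.6,
    ("u3", "b"): 5.0, ("u3", "c"): 3.2,
}
user_f, item_f, history = als_rank1(sample)

# Predicted rating for a pair that was never observed.
predicted_u1_c = user_f["u1"] * item_f["c"]
```

Each user update depends only on that user's ratings and the fixed item factors (and vice versa), which is the independence that lets Spark run the updates in parallel.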
Editor's notes

  1. Our playground is the whole world, and these are just some of the areas we work in: • data warehouse (DWH) implementation • big data analytics • data integration (data lake) • business intelligence (BI) • data mining • risk management • master data management
  2. Overwhelming for customers; aid in the discovery process; unpersonalized recommendations (most popular products). Recommender systems have been used for many years in e-commerce (Amazon, Netflix, YouTube). The Netflix Prize contest in 2006 had a big impact on recommender systems for content delivery.
  3. 2 types of information - 2 main approaches: CONTENT -> profiles for users and products to characterize them; user profile: age, gender, education, interests, etc. + product profile -> provide recommendations by matching the user’s interests with the description and attributes of items (content). CF -> predicting what users will like based on their similarity to other users; crowd wisdom -> the underlying assumption is that if user A has the same opinion as user B on an issue, A is more likely to have B's opinion on a different issue -> generally more accurate, captures data aspects that are difficult to profile using content filtering -> neighborhood methods and latent factor models
  4. Detailed historical data on customer activities is used. The user-item table describes customer activities, i.e. which shows the user watched, when, for how long, etc. → The data is aggregated at the level of subscriber id and item id, and loaded using defined rules for filtering out records not related to actual content consumption (<30 sec: browsing through the channels; a couple of hours: TV left on). Content item table - info about TV shows (title, type, length, genre, description, director, year, etc.)
  5. Factors not necessarily human-interpretable; they are implicitly present computer-calculated dimensions used as characterizations of users/items in the calculations.
  6. V2S & S2V - Spark DataFrames to Vertica tables - data from Vertica to Spark RDDs or DataFrames for use with Python, R, Scala and Java. ALS - one of the methods used in matrix factorization models - since there are two unknowns in the optimization process (item features vector and user features vector), ALS rotates between .. - user and item factors are computed independently of other user/item factors, and Spark is a parallel data processing engine -> leverage the ALS technique for better performance on large datasets
  7. The logic behind it is that if the user at least started to watch the show, he showed interest in it, and his rating will be at least two. → Implicit ratings are inferred from the users' activities; recommendations are based only on the users' past behavior.
  8. - 10 items that the algorithm predicted the user will most likely want to watch The list of most popular items is also kept in a table and regularly refreshed.
  9. one of the most important applications is for content delivery
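Slide 24 and note 8 describe serving stored top-10 lists, falling back to the most popular items for new users, and, for real-time broadcasts, proposing the show that has not started yet and will start soonest. A minimal sketch of that lookup logic follows, assuming in-memory dictionaries and lists in place of the Vertica tables; all function and variable names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def most_popular(interactions, n=10):
    """Rank items by how many (user, item) interactions they have."""
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common(n)]


def recommend(user, stored_recs, popular, n=10):
    """Return the user's stored list, or popular items on cold start."""
    recs = stored_recs.get(user)
    if not recs:
        # Cold start: no stored recommendations for this user yet.
        return popular[:n]
    return recs[:n]


def next_upcoming(recs_with_start, now):
    """Pick the recommended show that has not started and starts soonest.

    recs_with_start: list of (item, start_timestamp) pairs.
    Returns None when every recommended show has already started.
    """
    upcoming = [(start, item) for item, start in recs_with_start if start > now]
    return min(upcoming)[1] if upcoming else None


# Hypothetical data, for illustration only.
interactions = [("u1", "a"), ("u2", "a"), ("u3", "a"),
                ("u1", "b"), ("u2", "b"), ("u3", "c")]
popular = most_popular(interactions, n=2)
stored = {"u1": ["c", "d"]}

known_user_recs = recommend("u1", stored, popular)
new_user_recs = recommend("new_user", stored, popular)

now = datetime(2020, 1, 1, 20, 0)
schedule = [("news", datetime(2020, 1, 1, 19, 30)),
            ("film", datetime(2020, 1, 1, 21, 0)),
            ("quiz", datetime(2020, 1, 1, 20, 15))]
next_show = next_upcoming(schedule, now)
```

Keeping the popular-items list precomputed, as note 8 describes, means the cold-start path is a plain lookup rather than a query over the interaction history.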