PERSONALIZED RETWEET PREDICTION IN TWITTER


        Liangjie Hong*, Lehigh University
        Aziz Doumith, Lehigh University
        Brian D. Davison, Lehigh University
OVERVIEW
 Motivation
 Related Work

 Our Method

 Experimental Results




MOTIVATIONS
Social information platforms




MOTIVATIONS
Information overload




MOTIVATIONS
Information shortage




MOTIVATIONS






              Photo from: http://www.jenful.com/2011/06/google-1-and-the-filter-bubble/
MOTIVATIONS






              Photo from: http://graphlab.org/
MOTIVATIONS






              Photo from: http://kexino.com/
MOTIVATIONS






              Photo from: http://performancemarketingassociation.com/
TASKS
Given a target user and his/her friends, provide a ranked
list of tweets from these friends such that the tweets that
are potentially retweeted will be ranked higher.






                    Photo from: http://performancemarketingassociation.com/
RELATED WORK
Generic Popular Tweets Analysis/Prediction
 [Suh et al., SocialCom 2010]

 [Kim and Shim, ICDM 2011]

 [Uysal and Croft, CIKM 2011]

 [Hong et al., WWW 2011]

Personalized Tweets Prediction
 [Chen et al., SIGIR 2012]

 [Peng et al., ICDM Workshop 2011]



RELATED WORK
Generic Popular Tweets Analysis/Prediction
 [Suh et al., SocialCom 2010]

 [Kim and Shim, ICDM 2011]

 [Uysal and Croft, CIKM 2011]

 [Hong et al., WWW 2011]

Personalized Tweets Analysis/Prediction
 [Chen et al., SIGIR 2012]

 [Peng et al., ICDM Workshop 2011]


       Understanding users’ behaviors & content modeling
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich set of features

 Coupled modeling with content

 Learning a correct objective function

 Scalability




OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering




   Latent factor models



OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich set of features




   Latent factor models
       Factorization Machines [Rendle, ACM TIST 2012]
OUR METHOD
Factorization Machines
 Generic enough
       matrix factorization
       pairwise interaction tensor factorization
       SVD++
       neighborhood models
       …
 Technically mature
       [Rendle, ICDM 2010]
       [Rendle et al., SIGIR 2011]
       [Freudenthaler et al., NIPS Workshop 2011]
       [Rendle et al., WSDM 2012]
       [Rendle, ACM TIST 2012]
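
A minimal sketch of how a second-order factorization machine scores one feature vector may help here. It uses the standard FM form (global bias, linear weights, and pairwise interactions through latent vectors) and checks the usual linear-time rewrite of the pairwise term against the direct double loop. The feature layout and all numbers are toy assumptions for illustration, not values from the talk.

```python
from itertools import combinations


def fm_predict_naive(x, w0, w, V):
    """Second-order FM score computed directly over all feature pairs."""
    score = w0 + sum(w_i * x_i for w_i, x_i in zip(w, x))
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        score += dot * x[i] * x[j]
    return score


def fm_predict_fast(x, w0, w, V):
    """Same score, with the pairwise term rewritten to run in O(k * n)."""
    n, k = len(x), len(V[0])
    score = w0 + sum(w_i * x_i for w_i, x_i in zip(w, x))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        score += 0.5 * (s * s - s_sq)
    return score


# Hypothetical layout: one-hot user (2 slots), one-hot tweet (2 slots),
# plus one real-valued attribute.
x = [1.0, 0.0, 0.0, 1.0, 0.5]
w0 = 0.1
w = [0.2, -0.1, 0.0, 0.3, 0.05]
V = [[0.1, 0.2], [0.0, -0.3], [0.4, 0.1], [-0.2, 0.5], [0.3, 0.3]]

print(fm_predict_naive(x, w0, w, V), fm_predict_fast(x, w0, w, V))
```

With only the user and tweet indicators active, the pairwise term reduces to a user-tweet inner product, which is one way to see why matrix factorization appears in the list above.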
OUR METHOD
Extending Factorization Machines
 Non-negative decomposition of term-tweet matrix
       Compatible with standard topic models
 Co-Factorization Machines
       Multiple aspects of the dataset
         Shared feature paradigm
         Shared latent space paradigm
         Regularized latent space paradigm
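
The slides name the three coupling paradigms without giving formulas. As one plausible reading of the regularized latent space paradigm, assumed here rather than taken from the talk, each aspect keeps its own latent vectors for the entities they share, and a squared-distance penalty pulls the paired vectors together. The function names and the weight `lam` are hypothetical.

```python
def coupling_penalty(V_a, V_b, lam):
    """lam times the summed squared distance between paired latent vectors."""
    return lam * sum(
        (a - b) ** 2
        for va, vb in zip(V_a, V_b)
        for a, b in zip(va, vb)
    )


def coupling_gradient(V_a, V_b, lam):
    """Gradient of the penalty with respect to V_a (negate it for V_b)."""
    return [
        [2.0 * lam * (a - b) for a, b in zip(va, vb)]
        for va, vb in zip(V_a, V_b)
    ]


# Toy latent vectors for one shared entity in two aspects.
V_decision = [[1.0, 2.0]]
V_content = [[0.0, 0.0]]
print(coupling_penalty(V_decision, V_content, lam=0.5))
print(coupling_gradient(V_decision, V_content, lam=0.5))
```

Under this reading, a very large `lam` would approach the shared latent space paradigm, while `lam = 0` would leave the two aspects independent.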




OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich set of features

 Coupled modeling with content

 Learning a correct objective function

 Scalability




OUR METHOD
Design requirements
 Learning objective functions for different aspects
     User decisions
         Ranking-based loss
         Weighted Approximate-Rank Pairwise (WARP) loss
     Content modeling
         Log-Poisson loss
         Logistic loss
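
The two content-modeling losses named above can be written down in their textbook forms. The sketch below assumes the model emits a real-valued score per term-tweet pair: the logistic loss treats term presence as a label in {-1, +1}, and the log-Poisson loss treats the term count as Poisson-distributed with rate exp(score). How the talk's model wires these into training is not shown in the slides, so this is illustrative only.

```python
import math


def logistic_loss(score, label):
    """Negative log-likelihood for a label in {-1, +1}."""
    m = label * score
    # log(1 + exp(-m)), arranged so exp() never sees a large positive input.
    if m > 0:
        return math.log1p(math.exp(-m))
    return -m + math.log1p(math.exp(m))


def log_poisson_loss(score, count):
    """Negative log-likelihood of a count under Poisson(rate=exp(score))."""
    return math.exp(score) - count * score + math.lgamma(count + 1)


print(logistic_loss(0.0, 1))
print(log_poisson_loss(math.log(3), 3))
```

For a fixed count, the log-Poisson loss is smallest when the score equals the log of that count, which is what makes it a natural fit for term-frequency data.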




OUR METHOD
WARP loss
 Proposed by [Usunier et al., ICML 2009]

 Used in image retrieval and IR tasks
       [Weston et al., Machine Learning 2010]
       [Weston et al., ICML 2012]
       [Weston et al., UAI 2012]
       [Bordes et al., AISTATS 2012]
 Can mimic many ranking measures
       NDCG, MAP, Precision@k
 Applied to collaborative filtering
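
A rough sketch of a single WARP step for one positive tweet, following the commonly described sampling scheme: draw negatives until one violates the margin, estimate the positive's rank from the number of draws, and weight the hinge violation by a harmonic rank weight. For brevity this version samples negatives without replacement, which is a simplification of the usual with-replacement sampling. The function names and the margin default are assumptions, not details from the talk.

```python
import random


def rank_weight(k):
    """L(k) = sum of 1/j for j = 1..k; grows fastest for small ranks."""
    return sum(1.0 / j for j in range(1, k + 1))


def warp_step(pos_score, neg_scores, rng, margin=1.0):
    """Return (loss, violating_index, trials) for one positive example.

    Returns (0.0, None, trials) when no sampled negative violates the margin.
    """
    order = list(range(len(neg_scores)))
    rng.shuffle(order)
    for trials, idx in enumerate(order, start=1):
        violation = margin - pos_score + neg_scores[idx]
        if violation > 0:
            # Few draws needed means many violators, so the rank is high.
            est_rank = len(neg_scores) // trials
            return rank_weight(est_rank) * violation, idx, trials
    return 0.0, None, len(order)


rng = random.Random(0)
print(warp_step(0.2, [0.9, -1.5, 0.1, -2.0], rng))
```

In a training loop, the returned index would identify the negative whose parameters are updated together with the positive's in the stochastic gradient step.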
OUR METHOD
Design requirements
 Utilize users’ historical behaviors

 Collaborative filtering

 Incorporating a rich set of features

 Coupled modeling with content

 Learning a correct objective function

 Scalability (Stochastic Gradient Descent)




EXPERIMENTS
Twitter data
 0.7M target users with 11M tweets

 4.3M neighbor users with 27M tweets

 “Complete” sample for each target user

 Mean Average Precision (MAP) as measure

 Train/test on consecutive time periods
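
Since MAP is the evaluation measure, a short sketch of how it is typically computed may be useful: each target user's ranked tweet list yields an average precision, and MAP is the mean over users. Returning 0.0 for a user with no retweets is an assumption made here; evaluations often exclude such users instead.

```python
def average_precision(ranked_labels):
    """AP for one ranked list; labels are 1 (retweeted) or 0 (not)."""
    hits = 0
    precision_sum = 0.0
    for position, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists):
    """Mean of per-user average precision."""
    return sum(average_precision(r) for r in ranked_lists) / len(ranked_lists)


# Two hypothetical users: labels are listed in the order the model ranked them.
user_a = [1, 0, 1, 0]
user_b = [0, 1]
print(mean_average_precision([user_a, user_b]))
```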




EXPERIMENTS
Comparisons
 Matrix factorization (MF)

 Matrix factorization with attributes (MFA)

 CPTR [Chen et al., SIGIR 2012]

 Factorization machines with attributes (FMA)

 CoFM with shared features (CoFM-SF)

 CoFM with shared latent spaces (CoFM-SL)

 CoFM with latent space regularization (CoFM-REG)



EXPERIMENTS
Examples of topics are shown. The terms are the top-ranked terms in
each topic. The topic names were assigned by the authors.

Entertainment
album music lady artist video listen itunes apple produced movies #bieber bieber new songs

Finance
percent billion bank financial debt banks euro crisis rates greece bailout spain economy

Politics
party election budget tax president million obama money pay bill federal increase cuts
CONCLUSIONS
 Main contributions
     Propose Co-Factorization Machines (CoFM) to handle
      two (or multiple) aspects of the dataset.
     Apply FM to text data with constraints to mimic topic
      models.
     Introduce WARP loss into collaborative filtering/recsys
      models.
     Explore a wide range of features and demonstrate the
      effectiveness of feature sets with significant
      improvement over several non-trivial baselines.
THANK YOU.




   Liangjie Hong
   PhD candidate
   WUME Lab
   Lehigh University
   lih307@cse.lehigh.edu
