Machine Learning at PeerIndex

Ferenc Huszár
@fhuszar
Wednesday, 16 May 12
PeerIndex.com: understand your influence




PeerPerks.com: free stuff for influencers




Machine Learning @ PeerIndex

                   •   The usual stuff
                       •   topic modelling/classification of tweets/statuses/URLs
                       •   identity resolution across Twitter, Facebook, LinkedIn
                       •   spambot/fraud detection: identify people gaming the system
                       •   sentiment classification: happy/sad/neutral

                   •   The really exciting stuff
                       •   inferring networks of influence - more about this later
                       •   visualise different aspects of influence, in an engaging way
                       •   influence maximisation - submodular optimisation




Inferring networks of influence

[Diagram: social network; propagation probabilities p_{i,j}]

Information cascade logs

http://www.pcworld.com/article/239719
          1079306 2011-08-25T00:03:06+01:00
          4549198 2011-08-25T04:32:25+01:00
          2662975 2011-08-25T00:35:11+01:00
          2333224 2011-08-25T01:43:18+01:00
          3141371 2011-08-25T01:52:06+01:00
          3482720 2011-08-25T07:18:24+01:00
          1403682 2011-08-25T03:52:58+01:00
          4679657 2011-08-25T01:07:48+01:00
            32460 2011-08-25T01:11:43+01:00

http://techcrunch.com/2011/11/21/...
           259725 2011-10-24T03:32:19+01:00
            76539 2011-10-24T03:32:23+01:00
          1922351 2011-10-24T04:28:47+01:00
             9183 2011-10-24T03:30:57+01:00
          3335398 2011-10-24T03:34:01+01:00
          1616885 2011-10-24T03:48:16+01:00
            82198 2011-10-24T03:48:29+01:00
           906390 2011-10-24T23:13:51+01:00
          1051322 2011-10-24T03:40:02+01:00
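
A minimal sketch of loading one such log in Python, assuming each row is a user ID followed by the ISO 8601 time at which that user shared the URL (the slide does not label the columns). `parse_cascade` is a hypothetical helper name. It sorts the rows by time, because the rows as listed are not in chronological order.

```python
from datetime import datetime


def parse_cascade(lines):
    """Parse 'user_id timestamp' rows and return them sorted by share time."""
    events = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed rows
        user_id, stamp = parts
        events.append((int(user_id), datetime.fromisoformat(stamp)))
    return sorted(events, key=lambda event: event[1])


rows = [
    "1079306 2011-08-25T00:03:06+01:00",
    "4549198 2011-08-25T04:32:25+01:00",
    "2662975 2011-08-25T00:35:11+01:00",
    "  32460 2011-08-25T01:11:43+01:00",
]
cascade = [user for user, _ in parse_cascade(rows)]
print(cascade)  # [1079306, 2662975, 32460, 4549198]
```

The resulting time-ordered list of user IDs is the form in which a cascade u1, u2, ..., un is usually fed to a model.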




Heuristic approaches to estimate p_{i,j}


                •      purely based on local network structure
                           $p_{i,j} \approx \frac{1}{d_{\mathrm{in}}(j)}$

                •      trivalency “model” (my personal favourite :)
                           $p_{i,j} \in \{0.1, 0.01, 0.001\}$, chosen randomly


                •      data-driven heuristics
                           $p_{i,j} \approx \frac{\text{number of items shared by } j \text{ after } i \text{ shared it}}{\text{number of items shared by } i}$
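
The three heuristics can be sketched in a few lines of Python. This is an illustration under assumptions, not PeerIndex's code: an edge `(i, j)` is taken to mean "i can influence j", a cascade is a list of user IDs in sharing order with each user appearing at most once, and all function names are hypothetical. The trivalency values are a parameter. The default is the set {0.1, 0.01, 0.001} commonly used for this model.

```python
import random
from collections import defaultdict


def weighted_cascade(edges):
    """p[i, j] = 1 / d_in(j): uses only the local network structure."""
    in_degree = defaultdict(int)
    for _, j in edges:
        in_degree[j] += 1
    return {(i, j): 1.0 / in_degree[j] for i, j in edges}


def trivalency(edges, values=(0.1, 0.01, 0.001), seed=0):
    """Give every edge one of a few fixed values, picked at random."""
    rng = random.Random(seed)
    return {edge: rng.choice(values) for edge in edges}


def data_driven(edges, cascades):
    """p[i, j] = (#items j shared after i shared them) / (#items i shared)."""
    edge_set = set(edges)
    shared_by = defaultdict(int)
    followed = defaultdict(int)
    for cascade in cascades:
        for pos, i in enumerate(cascade):
            shared_by[i] += 1
            for j in cascade[pos + 1:]:  # users who shared after i
                if (i, j) in edge_set:
                    followed[(i, j)] += 1
    return {
        (i, j): followed[(i, j)] / shared_by[i] if shared_by[i] else 0.0
        for i, j in edges
    }


edges = [(1, 3), (2, 3), (1, 2)]
cascades = [[1, 2, 3], [1, 3], [2]]
print(weighted_cascade(edges))       # {(1, 3): 0.5, (2, 3): 0.5, (1, 2): 1.0}
print(data_driven(edges, cascades))  # {(1, 3): 1.0, (2, 3): 0.5, (1, 2): 0.5}
```

None of these estimates is fitted to a probabilistic model of the cascades, which is what the likelihood-based approach addresses.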



              How do you solve this with machine learning?

The likelihood

          $P(D \mid \theta)$

          D: an information cascade log (e.g. the log for http://www.pcworld.com/article/239719)
          $\theta$: the propagation probabilities $p_{i,j}$

            what’s the probability of the cascade $u_1, u_2, u_3, \ldots, u_n$?

             for subsequent users in the cascade (writing $u_0 = 0$ for the source node that appears in $p_{0,u_1}$):

                 $p_{0,u_1} \, \bigl(1 - (1 - p_{0,u_2})(1 - p_{u_1,u_2})\bigr) \cdots = \prod_{i=1}^{n} \Bigl( 1 - \prod_{j=0}^{i-1} (1 - p_{u_j,u_i}) \Bigr)$

             for users that are not in the cascade (nobody in the cascade got them to share):

                 $\prod_{v \notin \{u_1, \ldots, u_n\}} \; \prod_{u \in \{u_0, \ldots, u_n\}} (1 - p_{u,v})$
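
A minimal Python sketch of this likelihood for a single cascade, in log space. It assumes the independent-cascade reading of the slide: a user in the cascade shared because at least one earlier sharer (or a dummy source node 0, as in p_{0,u1}) succeeded, and a user outside the cascade was missed by all of them. The function name, the dictionary layout of `p`, and treating missing pairs as probability 0 are assumptions made for illustration.

```python
import math


def cascade_log_likelihood(cascade, all_users, p, source=0):
    """Log-probability of one cascade given propagation probabilities.

    cascade   -- user ids in the order they shared the item
    all_users -- every user id in the network (excluding the source)
    p         -- dict: (i, j) -> probability that i gets j to share
    source    -- id of the dummy source node (the 0 in p_{0,u1})
    """
    log_lik = 0.0
    earlier = [source]
    for u in cascade:
        # u shared: at least one earlier node succeeded.
        all_failed = math.prod(1.0 - p.get((w, u), 0.0) for w in earlier)
        if all_failed >= 1.0:
            return -math.inf  # the model gives this cascade probability 0
        log_lik += math.log(1.0 - all_failed)
        earlier.append(u)
    in_cascade = set(cascade)
    for v in all_users:
        if v in in_cascade:
            continue
        # v never shared: every node in the cascade failed to reach v.
        for w in earlier:
            q = 1.0 - p.get((w, v), 0.0)
            if q <= 0.0:
                return -math.inf
            log_lik += math.log(q)
    return log_lik


p = {(0, 1): 0.5, (0, 2): 0.1, (1, 2): 0.5,
     (0, 3): 0.1, (1, 3): 0.2, (2, 3): 0.3}
ll = cascade_log_likelihood([1, 2], [1, 2, 3], p)
print(round(math.exp(ll), 4))  # 0.1386 = 0.5 * 0.55 * (0.9 * 0.8 * 0.7)
```

Summing this quantity over all cascade logs gives the objective that maximum likelihood would optimise with respect to the p_{i,j}.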


Maximum likelihood at scale



                   •   data too sparse to learn one parameter per edge

                   •   large scale gradient-based optimisation is costly

                   •   Solution: combine ensemble of heuristics with ML

                   •   use heuristics to compute probabilities at scale

                   •   use ML to tune parameters on small-scale data
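
The slides do not say how the heuristics and the fitted parameters are combined. One hypothetical reading, sketched below in Python, mixes two precomputed heuristic estimates with a single weight `alpha` and chooses `alpha` by maximum likelihood on a small sample of cascades. The mixing rule, the grid search, and every name here are assumptions made for illustration. The likelihood is the independent-cascade form with a dummy source node 0.

```python
import math


def log_likelihood(cascades, users, p):
    """Sum of cascade log-probabilities; node 0 is a dummy source.

    users must not contain the source id 0.
    """
    total = 0.0
    for cascade in cascades:
        earlier = [0]
        for u in cascade:
            miss = math.prod(1.0 - p.get((w, u), 0.0) for w in earlier)
            total += math.log(max(1.0 - miss, 1e-12))  # clamp to avoid log(0)
            earlier.append(u)
        for v in set(users) - set(cascade):
            for w in earlier:
                total += math.log(max(1.0 - p.get((w, v), 0.0), 1e-12))
    return total


def mix(h1, h2, alpha):
    """Edge-by-edge convex combination of two heuristic estimates."""
    return {
        e: alpha * h1.get(e, 0.0) + (1.0 - alpha) * h2.get(e, 0.0)
        for e in set(h1) | set(h2)
    }


def tune_alpha(h1, h2, cascades, users, grid=None):
    """Pick the mixing weight with the highest likelihood on a small sample."""
    grid = grid or [k / 10 for k in range(11)]
    return max(grid, key=lambda a: log_likelihood(cascades, users, mix(h1, h2, a)))


# Two toy heuristics that disagree only about edge (1, 2).
h1 = {(0, 1): 0.5, (0, 2): 0.01, (1, 2): 0.9}
h2 = {(0, 1): 0.5, (0, 2): 0.01, (1, 2): 0.1}
sample = [[1, 2], [1, 2], [1, 2]]  # user 2 always follows user 1
print(tune_alpha(h1, h2, sample, users=[1, 2]))  # 1.0
```

The point of such a setup is that only one parameter (or a handful) is learned, so a small sample suffices, while the heuristics themselves stay cheap enough to evaluate on every edge.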




Influence maximisation


                   • Select a set of users to maximise outreach
                   • Influence of people combines non-linearly
                   • In many models it combines sub-modularly
             $A \subseteq B \implies f(A \cup \{x\}) - f(A) \ge f(B \cup \{x\}) - f(B)$

                       • these functions are fun to optimise
                       • pops up many times in machine learning
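
One reason such functions are pleasant to optimise is that, for a monotone submodular function under a cardinality constraint, the simple greedy algorithm is guaranteed to reach at least a (1 - 1/e) fraction of the optimum. The Python sketch below uses set coverage as a stand-in objective. The `reach` data and all names are hypothetical, and a real influence objective would be an expected spread under the propagation probabilities rather than a fixed set union.

```python
def greedy_maximise(candidates, f, k):
    """Pick up to k elements, each time adding the largest marginal gain."""
    chosen = set()
    for _ in range(k):
        remaining = [x for x in candidates if x not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda x: f(chosen | {x}) - f(chosen))
        chosen.add(best)
    return chosen


# Hypothetical audiences: the set of users each candidate reaches.
reach = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {6, 7},
    "d": {1, 2},
}


def coverage(selected):
    """Number of distinct users reached; a standard submodular function."""
    return len(set().union(*(reach[x] for x in selected)))


seeds = greedy_maximise(list(reach), coverage, k=2)
print(sorted(seeds), coverage(seeds))  # ['a', 'c'] 6

# Diminishing returns: adding "b" helps the smaller set more.
gain_small = coverage({"b"}) - coverage(set())      # 3
gain_large = coverage({"a", "b"}) - coverage({"a"})  # 1
```

Note that greedy picks "c" rather than "b" in the second round: "b" has the larger audience, but most of it is already covered by "a".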

Wrap up

                   •   two lines of ‘data’ products: PeerIndex, PeerPerks

                   •   lots of ‘standard’ machine learning tasks

                   •   some uniquely exciting problems
                       •   inferring propagation probabilities
                       •   computing the expected number of users one reaches
                       •   putting all aspects together into a single number, and visualising it
                       •   influence maximisation




Thanks


            We’re hiring ML scientists, interns and engineers...
                                @fhuszar
                           fh@peerindex.com




Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 

Dernier (20)

MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Machine Learning at PeerIndex

  • 1. Machine Learning at PeerIndex @fhuszar Ferenc Huszár Wednesday, 16 May 12
  • 2. PeerIndex.com: understand your influence
  • 3. PeerPerks.com: free stuff for influencers
  • 4. PeerPerks: free stuff for influencers
  • 5-14. Machine Learning @ PeerIndex
      • The usual stuff
          • topic modelling/classification of tweets/statuses/URLs
          • identity resolution across Twitter, Facebook, LinkedIn
          • spambot/fraud detection: identify people gaming the system
          • sentiment classification: happy/sad/neutral
      • The really exciting stuff
          • inferring networks of influence - more about this later
          • visualise different aspects of influence, in an engaging way
          • influence maximisation - submodular optimisation
  • 15-18. Inferring networks of influence
      • Social network
      • Propagation probabilities $p_{i,j}$
      • Information cascade logs (user id, timestamp), one log per URL:
          http://www.pcworld.com/article/239719
            1079306   2011-08-25T00:03:06+01:00
            4549198   2011-08-25T04:32:25+01:00
            2662975   2011-08-25T00:35:11+01:00
            2333224   2011-08-25T01:43:18+01:00
            3141371   2011-08-25T01:52:06+01:00
            3482720   2011-08-25T07:18:24+01:00
            1403682   2011-08-25T03:52:58+01:00
            4679657   2011-08-25T01:07:48+01:00
            32460     2011-08-25T01:11:43+01:00
          http://techcrunch.com/2011/11/21/...
            259725    2011-10-24T03:32:19+01:00
            76539     2011-10-24T03:32:23+01:00
            1922351   2011-10-24T04:28:47+01:00
            9183      2011-10-24T03:30:57+01:00
            3335398   2011-10-24T03:34:01+01:00
            1616885   2011-10-24T03:48:16+01:00
            82198     2011-10-24T03:48:29+01:00
            906390    2011-10-24T23:13:51+01:00
            1051322   2011-10-24T03:40:02+01:00
  • 19-23. Heuristic approaches to estimate $p_{i,j}$
      • purely based on local network structure: $p_{i,j} = \frac{1}{d_{\mathrm{in}}(j)}$
      • trivalency “model” (my personal favourite :)): $p_{i,j} \in \{0.1, 0.01, 0.001\}$, chosen randomly
      • data-driven heuristics: $p_{i,j} = \frac{\text{number of items shared by } j \text{ after } i \text{ shared it}}{\text{number of items shared by } i}$
      • How do you solve this with machine learning?
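The three heuristics listed on these slides can be sketched in a few lines of Python. This is an illustrative sketch rather than PeerIndex's implementation: the function names, the edge-list and cascade-log layouts, and the default trivalency values (0.1, 0.01, 0.001, the set usually quoted for that model) are all assumptions.

```python
import random
from collections import defaultdict


def weighted_cascade(edges):
    """Local-structure heuristic: p[i, j] = 1 / d_in(j) for each edge i -> j."""
    in_degree = defaultdict(int)
    for _, j in edges:
        in_degree[j] += 1
    return {(i, j): 1.0 / in_degree[j] for i, j in edges}


def trivalency(edges, values=(0.1, 0.01, 0.001), seed=0):
    """Trivalency heuristic: draw each p[i, j] uniformly at random from `values`."""
    rng = random.Random(seed)
    return {edge: rng.choice(values) for edge in edges}


def data_driven(edges, cascades):
    """Data-driven heuristic.

    cascades maps an item (e.g. a URL) to a list of (user, timestamp) pairs.
    p[i, j] = (#items shared by j after i shared them) / (#items shared by i).
    """
    shared_by = defaultdict(int)   # number of items shared by i
    followed = defaultdict(int)    # number of items j shared after i
    for events in cascades.values():
        # Keep only the first time each user shared this item.
        first_time = {}
        for user, t in events:
            if user not in first_time or t < first_time[user]:
                first_time[user] = t
        for user in first_time:
            shared_by[user] += 1
        for i, j in edges:
            if i in first_time and j in first_time and first_time[i] < first_time[j]:
                followed[(i, j)] += 1
    return {
        (i, j): followed[(i, j)] / shared_by[i] if shared_by[i] else 0.0
        for i, j in edges
    }
```

Looping over every edge for every cascade keeps the sketch short; at the scale the slides describe, one would index edges by source user instead.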
  • 25-35. The likelihood $P(D \mid \theta)$
      • data $D$: the cascade log for http://www.pcworld.com/article/239719 (the same nine user id / timestamp rows as on slide 18)
      • parameters $\theta$: the propagation probabilities $p_{i,j}$
      • what’s the probability of the cascade $u_1, u_2, u_3, \dots, u_n$?
      • for subsequent users in cascade:
        $$p_{0,u_1} \bigl(1 - (1 - p_{0,u_2})(1 - p_{u_1,u_2})\bigr) \cdots = \prod_{i=1}^{n} \Bigl(1 - \prod_{j=0}^{i-1} (1 - p_{u_j,u_i})\Bigr), \qquad u_0 = 0$$
      • for users that are not in cascade:
        $$\prod_{u \in \{u_1, \dots, u_n\}} \; \prod_{v \in \text{users} \setminus \{u_1, \dots, u_n\}} (1 - p_{u,v})$$
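The cascade likelihood on these slides can be evaluated directly. Below is a minimal sketch in Python, assuming that index 0 in the slide formula stands for a virtual source node, that missing edges have probability 0, and that the cascade is given as user ids in the order they shared; `SOURCE` and `cascade_log_likelihood` are hypothetical names. It works in log space so that long cascades do not underflow.

```python
import math

SOURCE = 0  # assumed id of the virtual source node (index 0 in the slide formula)


def cascade_log_likelihood(cascade, users, p):
    """Log-likelihood of one cascade under propagation probabilities p.

    cascade: user ids in the order they shared the item (u_1, ..., u_n)
    users:   iterable of all user ids
    p:       dict mapping (i, j) to p_{i,j}; missing edges count as 0
    """
    def prob(i, j):
        return p.get((i, j), 0.0)

    log_lik = 0.0

    # Users in the cascade: at least one earlier node (or the source) reached them.
    earlier = [SOURCE]
    for u in cascade:
        nobody_reached_u = 1.0
        for w in earlier:
            nobody_reached_u *= 1.0 - prob(w, u)
        reached = 1.0 - nobody_reached_u
        if reached <= 0.0:
            return -math.inf  # the cascade is impossible under p
        log_lik += math.log(reached)
        earlier.append(u)

    # Users not in the cascade: every cascade member failed to reach them.
    in_cascade = set(cascade)
    for u in cascade:
        for v in users:
            if v not in in_cascade:
                not_reached = 1.0 - prob(u, v)
                if not_reached <= 0.0:
                    return -math.inf
                log_lik += math.log(not_reached)

    return log_lik
```

For a two-user cascade `[1, 2]` over users `{1, 2, 3}`, the function multiplies the probability that the source reached user 1, the probability that the source or user 1 reached user 2, and the probability that neither of them reached user 3.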
  • 36-41. Maximum likelihood at scale
      • data too sparse to learn one parameter per edge
      • large-scale gradient-based optimisation is costly
      • Solution: combine an ensemble of heuristics with ML
          • use heuristics to compute probabilities at scale
          • use ML to tune parameters on small-scale data
  • 43-48. Influence maximisation
      • Select a set of users to maximise outreach
      • Influence of people combines non-linearly
      • In many models it combines sub-modularly:
        $$A \subseteq B \implies f(A \cup \{x\}) - f(A) \ge f(B \cup \{x\}) - f(B)$$
      • these functions are fun to optimise
      • pops up many times in machine learning
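One reason submodular functions are pleasant to optimise is that a simple greedy strategy does well: for monotone submodular objectives it is known to come within a factor of 1 - 1/e of the optimum. The slides do not say which optimiser PeerIndex uses, so the sketch below is only an illustration of the greedy idea; `greedy_maximise`, the toy `reach` table and the `coverage` objective are all made up for the example.

```python
def greedy_maximise(candidates, f, k):
    """Pick up to k elements, each time adding the one with the largest marginal gain.

    f takes a list of selected elements and returns a number.
    """
    selected = []
    current = f(selected)
    for _ in range(k):
        best, best_gain = None, 0.0
        for x in candidates:
            if x in selected:
                continue
            gain = f(selected + [x]) - current
            if gain > best_gain:
                best, best_gain = x, gain
        if best is None:  # no remaining element improves the objective
            break
        selected.append(best)
        current += best_gain
    return selected


# Toy objective: each user reaches a fixed audience; f counts distinct people reached.
# Set coverage like this is a standard example of a monotone submodular function.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 2}}


def coverage(seed_set):
    covered = set()
    for s in seed_set:
        covered |= reach[s]
    return len(covered)


seeds = greedy_maximise(list(reach), coverage, k=2)
```

In the toy example the diminishing-returns property is visible directly: adding "b" to the empty set gains 2 people, but adding it to ["a"] gains only 1.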
  • 50-56. Wrap up
      • two lines of ‘data’ products: PeerIndex, PeerPerks
      • lots of ‘standard’ machine learning tasks
      • some uniquely exciting problems
          • inferring propagation probabilities
          • compute expected number of users one reaches out to
          • putting all aspects together into a single number, and visualise
          • influence maximisation
  • 57. Thanks. We’re hiring ML scientists, interns and engineers... @fhuszar fh@peerindex.com
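The wrap-up mentions computing the expected number of users one reaches. The deck does not say how this is done; one common approach, sketched here under the assumption of an independent-cascade style model (each newly active user gets one chance to activate each follower j with probability p[i, j]), is plain Monte Carlo simulation. The names `simulate_cascade` and `expected_reach` and the `neighbours` adjacency dict are hypothetical.

```python
import random


def simulate_cascade(seeds, neighbours, p, rng):
    """Run one random cascade from `seeds` and return how many users end up active."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for i in frontier:
            for j in neighbours.get(i, ()):
                # Each newly active user i gets a single attempt on each neighbour j.
                if j not in active and rng.random() < p[(i, j)]:
                    active.add(j)
                    next_frontier.append(j)
        frontier = next_frontier
    return len(active)


def expected_reach(seeds, neighbours, p, runs=1000, seed=0):
    """Monte Carlo estimate of the expected cascade size started from `seeds`."""
    rng = random.Random(seed)
    total = sum(simulate_cascade(seeds, neighbours, p, rng) for _ in range(runs))
    return total / runs
```

An estimate like this could also serve as the objective `f` in an influence maximisation routine, at the cost of some sampling noise.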
