26. Click Through on Search Pages
[Figure: click-through rate by result position, annotated with the bottom of the "fold," the bottom of the page, and the bottom of a second window]
Adapted from "A Dynamic Bayesian Network Click Model for Web Search Ranking," by Olivier Chapelle and Ya Zhang, WWW '09.
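Below is a minimal sketch (my own, in Python) of the cited Dynamic Bayesian Network click model, just to make the cascade over positions concrete; the attractiveness, satisfaction, and perseverance values are illustrative assumptions, not numbers from the paper.

# A minimal simulation of the DBN click model of Chapelle & Zhang (WWW'09).
# The user scans results top-down; a result is clicked if it attracts, the
# session ends if the click satisfies, and otherwise the user keeps
# examining with probability gamma (perseverance).
import random

def simulate_session(attractiveness, satisfaction, gamma=0.9):
    """attractiveness[i]: P(click | result i examined)
    satisfaction[i]:   P(satisfied | result i clicked)
    gamma:             P(keep examining | not yet satisfied)"""
    clicks = []
    for a, s in zip(attractiveness, satisfaction):
        clicked = random.random() < a          # attracted => click
        clicks.append(clicked)
        if clicked and random.random() < s:    # satisfied => stop scanning
            break
        if random.random() > gamma:            # abandoned => stop scanning
            break
    return clicks

# Positions past the fold, the page bottom, and the second window are
# rarely even examined, which is why their raw click-through is so low.
print(simulate_session([0.6, 0.4, 0.3, 0.2, 0.1], [0.5] * 5))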
30. Does Relevance Matter?
• Bottom of the page
– Normally low click-through
– Show alternate results (WRONG!)
• Precision/recall doesn't (always) matter!! (for multimedia)
34. Unrelated images at the bottom of the page should be here.
[Figure: click-through rate by result position, annotated with the bottom of the "fold," the bottom of the page, and the bottom of a second window]
Adapted from "A Dynamic Bayesian Network Click Model for Web Search Ranking," by Olivier Chapelle and Ya Zhang, WWW '09.
35. Unrelated images at the bottom of the page are here!!!
[Figure: click-through rate by result position, annotated with the bottom of the "fold," the bottom of the page, and the bottom of a second window]
Adapted from "A Dynamic Bayesian Network Click Model for Web Search Ranking," by Olivier Chapelle and Ya Zhang, WWW '09.
63. Movie rating data
• Training data
– 100 million ratings
– 480,000 users
– 17,770 movies
– 6 years of data: 2000–2005
• Test data
– Last few ratings of each user (2.8 million)
• Dates of ratings are given

Training data           Test data
score  movie  user      score  movie  user
1      21     1         ?      62     1
5      213    1         ?      96     1
4      345    2         ?      7      2
4      123    2         ?      3      2
3      768    2         ?      47     3
5      76     3         ?      15     3
4      45     4         ?      41     4
1      568    5         ?      28     4
2      342    5         ?      93     5
2      234    5         ?      74     5
5      76     6         ?      69     6
4      56     6         ?      83     6
64. Components of a rating predictor
rating = user bias + movie bias + user–movie interaction

Baseline predictor
• Separates users and movies
• Often overlooked
• Benefits from insights into users' behavior
• Among the main practical contributions of the competition

User–movie interaction
• Characterizes the matching between users and movies
• Attracts most research in the field
• Benefits from algorithmic and mathematical innovations

Courtesy of Yehuda Koren
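To make the three components concrete, here is a hedged sketch of the standard predictor form from the Netflix Prize literature, r̂(u, i) = μ + b_u + b_i + p_u · q_i; the factor dimension and all numbers below are toy values, not the actual competition system.

# Sketch of the decomposed rating predictor: global mean, user bias,
# movie bias, and a latent-factor user-movie interaction term.
import numpy as np

def predict(mu, b_user, b_movie, P, Q, u, i):
    """r_hat(u, i) = mu + b_u + b_i + p_u . q_i

    mu:      global mean rating (shared baseline)
    b_user:  per-user offsets (generous vs. harsh raters)
    b_movie: per-movie offsets (broadly loved vs. panned titles)
    P, Q:    latent factor matrices (users x k, movies x k)"""
    return mu + b_user[u] + b_movie[i] + P[u] @ Q[i]

# Toy parameters sized like the Netflix data (480k users, 17,770 movies):
rng = np.random.default_rng(0)
P = rng.normal(0, 0.1, (480_000, 8))
Q = rng.normal(0, 0.1, (17_770, 8))
b_u, b_i = np.zeros(480_000), np.zeros(17_770)
print(predict(3.6, b_u, b_i, P, Q, u=42, i=7))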
81. What to Collect to measure
• Type of event
(Zync player command or a normal chat message)
• Anonymous hash
(uniquely identifies the sender and the receiver, without
exposing personal account data)
• URL to the shared video
• Timestamp for the event
• The player time (with respect to the specific video) at the
point the event occurred
• The number of characters and the number of words typed
(for chat messages)
• Emoticons used in the chat message
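As a sketch only, the collected fields map naturally onto a per-event record; the class and field names below are hypothetical, not Zync's actual schema.

# Hypothetical record for one logged Zync event, one field per bullet above.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ZyncEvent:
    event_type: str                  # player command or normal chat message
    sender_hash: str                 # anonymous hash, no personal account data
    receiver_hash: str
    video_url: str                   # URL to the shared video
    timestamp: float                 # wall-clock time of the event
    player_time: float               # position within the video when it fired
    n_chars: Optional[int] = None    # chat messages only
    n_words: Optional[int] = None    # chat messages only
    emoticons: tuple = ()            # emoticons used in the chat message

event = ZyncEvent("play", "a1b2", "c3d4",
                  "http://example.com/v/123", 1_300_000_000.0, 42.5)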
86. Reciprocity
• In 43.6% of the sessions, the invitee played at
least one video back to the session's initiator.
• 77.7% sharing reciprocation
• Pairs of people often exchanged more than
one set of videos in a session.
• In the Nonprofit, Technology, and Shows
categories, the invitees shared more videos.
87. How do we know what people are watching?
How can we give them better things to watch?
CLASSIFICATION
89. Five-star ratings have been the golden egg for recommendation systems
so far; implicit human cooperative sharing activity works better.
CLASSIFICATION BASED ON
IMPLICIT CONNECTED SOCIAL
90. 20 random videos were sent to 43 people.
60.3% identified the category correctly.
52.3% identified the comedies correctly.
PEOPLE REALLY STINK AT THIS
91. Used and Unused Data
Used
YouTube: Duration (video), Views (video), Rating*
Zync: Duration (session)*, # of Play/Pause*, # of Scrubs*, # of Chats*
Not used
YouTube: Tags, Comments, Favorites
Zync: Emoticons, User ID data, # of Sessions, # of Loads
92. Phone in your favorite ML technique.
FIRST-ORDER DATA WASN'T PRETTY
93. Naïve Bayes Classification
Type                       Accuracy
Random Chance              23.0%
YouTube Features           14.6%
YouTube Top 5 Categories   32.4%
Zync Features              53.9%
Humans                     60.9%
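For concreteness, a minimal sketch of the setup behind this table, assuming numeric per-video feature vectors and five category labels; scikit-learn's GaussianNB stands in for whatever implementation was actually used, and the data here is random.

# Naive Bayes over per-video features, scored with cross-validation.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((5000, 4))        # e.g. duration, views, rating, ... (assumed)
y = rng.integers(0, 5, 5000)     # five category labels

clf = GaussianNB()
print(cross_val_score(clf, X, y).mean())   # ~= chance on random data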
94. What about these three videos? Which one do you like?
Nominal Factorization
96. Classification with Factoring
Type                             Accuracy
Random Chance                    23.0%
YouTube Features                 14.6%
YouTube Top 5 Categories         32.4%
YT Top 5 Factoring Duration      51.8%
Humans                           60.9%
YT Top 5 Factoring Views         66.9%
YT Top 5 Factoring Ratings       75.5%
YT Top 5 Factoring All Features  75.9%
psst, yes, we know that more training will do the same thing eventually; I just don't like waiting.
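One plausible reading of "nominal factorization" here is quantile-binning each continuous feature (duration, views, rating) into nominal categories before running Naïve Bayes; the sketch below assumes that reading, and the bin count is arbitrary.

# Quantize a continuous feature into nominal bins by quantile, so the
# classifier sees categories instead of raw magnitudes.
import numpy as np

def factorize(column, n_bins=5):
    """Map a continuous column to integer bin labels by quantile."""
    edges = np.quantile(column, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(column, edges)

views = np.array([120, 4_000, 15, 9_800_000, 310, 72_000])
print(factorize(views))   # nominal view-count bins, e.g. [1 3 0 4 2 4]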
97. Classification w/ Zync Features
Type                             Accuracy
Random Chance                    23.0%
YouTube Features                 14.6%
YouTube Top 5 Categories         32.4%
YT Top 5 Factoring Duration      51.8%
Humans                           60.9%
YT Top 5 Factoring Views         66.9%
YT Top 5 Factoring Ratings       75.5%
YT Top 5 Factoring All Features  75.9%
Zync Factored All Features       87.8%
psst, we are looking at using Gradient Boosted Decision Trees in our future work.
98. Finding the viral.
Can we predict if a video has over 10M views?
Moreover, can we do so with, say, 10 people across 5 sessions?
100. Viral Classification w/ Zync Features
Does the video have over 10M views?   Accuracy
Guessing Yes                           6.3%
Guessing No                           93.7%
Guessing Randomly                     88.3%
Naïve Bayes (25% training set)        89.2%
Naïve Bayes (50% training set)        95.5%
Naïve Bayes (80% training set)        96.6%
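A sketch of how the training-fraction sweep in this table could be run; the data below is synthetic, imbalanced at roughly the 6.3% positive rate shown, and the use of scikit-learn is an assumption.

# Sweep the training fraction for a Naive Bayes "over 10M views?" classifier.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((5000, 4))                      # per-session Zync features (assumed)
y = (rng.random(5000) < 0.063).astype(int)     # ~6.3% "viral" positives

for frac in (0.25, 0.50, 0.80):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=frac, stratify=y, random_state=0)
    acc = GaussianNB().fit(X_tr, y_tr).score(X_te, y_te)
    print(f"train={frac:.0%}  accuracy={acc:.1%}")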
120. Me: You’re in China, go to the night market for !!
You: Street food? Are you kidding? I’ll get sick!
Me: I dare you not to! (It’s delicious!)
121. Man vs. Food: http://www.travelchannel.com/TV_Shows/Man_V_Food
122. Why try to understand engagement?
Better advertising.
Better understanding of the relationship between users and the sharing/consumption of media content.
Better organization and classification of media for efficient navigation and content retrieval.
Better recommendations!
123. Find me: @ayman • aymans@acm.org
Fin & Thanks!
Thanks to D. DuBois, M. Slaney, E. Churchill, L. Kennedy, J. Yew, S. Pentland, A. Brooks, J. Dunning, B. Pardo, M. Cooper.
Knowing Funny: Genre Perception and Categorization in Social Video Sharing. Jude Yew, David A. Shamma, Elizabeth F. Churchill. CHI 2011, ACM.
Peaks and Persistence: Modeling the Shape of Microblog Conversations. David A. Shamma, Lyndon Kennedy, Elizabeth F. Churchill. CSCW 2011, ACM.
In the Limelight Over Time: Temporalities of Network Centrality. David A. Shamma, Lyndon Kennedy, Elizabeth F. Churchill. CSCW 2011, ACM.
Tweet the Debates: Understanding Community Annotation of Uncollected Sources. David A. Shamma, Lyndon Kennedy, Elizabeth F. Churchill. ACM Multimedia, ACM, 2009.
Understanding the Creative Conversation: Modeling to Engagement. David A. Shamma, Dan Perkel, Kurt Luther. Creativity and Cognition, ACM, 2009.
Spinning Online: A Case Study of Internet Broadcasting by DJs. David A. Shamma, Elizabeth Churchill, Nikhil Bobb, Matt Fukuda. Communities & Technologies, ACM, 2009.
Zync with Me: Synchronized Sharing of Video through Instant Messaging. David A. Shamma, Yiming Liu, Pablo Cesar, David Geerts, Konstantinos Chorianopoulos. In Social Interactive Television: Immersive Shared Experiences and Perspectives, Information Science Reference, IGI Global, 2009.
Enhancing Online Personal Connections through the Synchronized Sharing of Online Video. D. A. Shamma, M. Bastéa-Forte, N. Joubert, Y. Liu. Human Factors in Computing Systems (CHI), ACM, 2008.
Supporting Creative Acts Beyond Dissemination. David A. Shamma, Ryan Shaw. Creativity and Cognition, ACM, 2007.
Watch What I Watch: Using Community Activity to Understand Content. David A. Shamma, Ryan Shaw, Peter Shafton, Yiming Liu. ACM Multimedia Workshop on Multimedia Information Retrieval (MIR), ACM, 2007.
Zync: The Design of Synchronized Video Sharing. Yiming Liu, David A. Shamma, Peter Shafton, Jeannie Yang. Designing for User eXperiences, ACM, 2007.
Editor's Notes
Here are my notes.
There are many of us, but this is the work of three.
Are we that bad at this?
These verbs have us trapped in 1998… oh yeah, and the anti-Flash silliness doesn't help.
Recommendation buys us the ability to discover (search) without text.
Adapted from "A Dynamic Bayesian Network Click Model for Web Search Ranking," by Olivier Chapelle and Ya Zhang, WWW '09.
Sidebar of related people.
Adapted from "A Dynamic Bayesian Network Click Model for Web Search Ranking," by Olivier Chapelle and Ya Zhang, WWW '09.
Bagpipes from: http://www.weddingbagpipes.com/
Beethoven Orchestral Ode to Joy from Various (Walt Disney Records) / Classical Silly Songs
Along with the Mozart (Symphony No. 40)
In a study I performed a few years ago, we compared two different approaches for judging music similarity [Slaney and White]. In the classic approach we use music features, often used to judge genre. The assumption is that if these features are good for making genre judgements, then they will also tell us something about similarity. This feature is known as a genregram [Tzanetakis]. The content is rich: it tells us everything we need to know about the music. In fact, listeners can tell whether they like a radio station within seconds of changing the dial.
The alternative is an item-to-item judgement based on user ratings. The idea considers each song as a point in a multidimensional space defined by a user's rating of the song. On a 5-point scale, this is just 2.2 bits of information! If a jazz lover, a rock lover, and a hip-hop lover all give two songs the same rating, then the two songs are probably quite similar.
In our study, we used the ratings by XXX listeners of 1000 different songs. After adjusting for missing data, we formed a vector of all user ratings for each song. Song similarity was defined as the correlation between the user-rating vectors for the two songs.
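As a sketch, the similarity just described reduces to a Pearson correlation between two songs' user-rating vectors; mean-filling the missing ratings below is one simple choice, not necessarily the adjustment used in the study.

# Song similarity as the correlation between user-rating vectors.
import numpy as np

def song_similarity(ratings_a, ratings_b):
    """Pearson correlation between two songs' per-user ratings.

    ratings_*: arrays of ratings, NaN where a user didn't rate the song."""
    a = np.asarray(ratings_a, float)
    b = np.asarray(ratings_b, float)
    a = np.where(np.isnan(a), np.nanmean(a), a)   # fill gaps with the mean
    b = np.where(np.isnan(b), np.nanmean(b), b)
    return np.corrcoef(a, b)[0, 1]

nan = float("nan")
print(song_similarity([5, 4, nan, 2, 5], [4, 4, 1, nan, 5]))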
We initially expected that a bias of 50% would be best. This means that strong likes and dislikes would be equally important. But users don't rate everything. Left: summary of 717M user ratings. Right: 35k users rating 10 songs at random.
We tested the two song-similarity approaches by starting with a seed song and forming playlists. In a blind test, users overwhelmingly said that the songs on the playlist based on rating data were more similar to each other than those based on the genre space, or a random selection of songs. How can this be? Just 2.2 bits beat out a state-of-the-art system based on content.
Problem: How do we figure out the semantics of media signals? We can do simple problems like ASR and OCR. This is the holy grail of image analysis. We want to solve the problem when we have some information about the signal (like a caption).
Problem: How do we describe the time course of a podcast, a musical signal, or a movie? What parts are similar to each other? How do we pick out the most salient portions? How do we segment?
Netflix recently hosted a one-million-dollar competition to find a better recommendation system for their movies. It is no exaggeration to say that it captured the entire machine-learning community's interest. Thousands of hours of research, in all different directions, were directed at this problem.
While the identity of the users was unknown, the movie titles were not. Researchers quickly identified each movie and analyzed their content. It only makes sense that Alice, who loves romance movies, will like very different content from Bob, who wants action films. We should be able to use this information to build a better recommendation system.
But alas, content didn't help! The winning systems included every possible signal [Koren, Y]. One that surprised me was the amount of time between the movie's release and the user's rating. Evidently there is a strong correlation, with older movies getting a higher rating. All available signals were combined using boosting. In boosting, various (weak) classifiers are combined to make a prediction (the movie's rating by a new user) when they reduce the error on an unseen test data set. Dozens of different features were included.
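To make the boosting step concrete, here is a toy sketch: shallow regression trees fit one after another to the current ensemble's residuals on a rating-prediction task. This illustrates the idea only; it is nothing like the winning blend, and all data below is synthetic.

# Toy boosting: each weak tree is trained on the residual of the ensemble.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.random((2000, 6))                                  # per-(user, movie) signals (assumed)
y = 3.6 + X[:, 0] - X[:, 1] + rng.normal(0, 0.3, 2000)     # ratings-like target

prediction = np.full_like(y, y.mean())                     # start at the global mean
lr = 0.1                                                   # shrinkage per learner
for _ in range(50):                                        # 50 weak learners
    stump = DecisionTreeRegressor(max_depth=2).fit(X, y - prediction)
    prediction += lr * stump.predict(X)                    # each one fixes residuals

print(np.sqrt(np.mean((y - prediction) ** 2)))             # training RMSE shrinks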
Not a single feature was derived from the movie's content! These were well-motivated researchers, with access to the best algorithms in the multimedia literature. But we couldn't help them. Arguably, the movie's genre was reflected in the rating data. But in the end the FFT lost to *****'s.
Transactional. There is MORE to tagging and comments in social media than how we currently think of it as the single browser/site/startup.
These tags and comments are relegated to anchored explicit annotation. This is the problem. Temporally, there is a gap: we cannot leverage these components as we have with photos. Some tags and notes are added as deep annotation, but that's rare.
Notre Dame!
Augsburg Cathedral
Australia
All tagged Christmas
Likewise, the context of an image tells us a LOT about what might be in the image. We like to treat multimedia classification as a simple problem: here is an image, does it show a telephone box? But in the real world every piece of content has a history. At the very least we know it was shot by a real person (or a real person owned the camera). The image was uploaded to a web site, and each web site has a flavor. Photos on the ESPN web site are very different from those at TMZ. Photos uploaded to Flickr (tm) are often more artistic than the people shots typical on Facebook. Even more finely, a person who takes lots of pictures of cats will probably have friends who also like and take pictures of cats.
http://www.flickr.com/photos/wvs/3833148925/
This is a three-part talk where I'll discuss IM, chatrooms, and Twitter.
Gift giving at its finest.
So we started looking at classification based on two datasets, YouTube and Zync. Each is about 5000 videos (or sessions).
I come from a strong AI family… so I don't wanna get too into it…
So we started to think about what the data was saying to us…
Triangulate between the classifier results, the survey results, and the interviews:
– Determine whether the Naïve Bayes classifier or humans are better at determining whether a video belongs to the "comedy" genre.
– Determine if the "ground truth" genre categories provided by the original uploader are reliable.
A dare is my favorite type of social recommendation.