3. Theory
1. Fair body of research on automated sports highlight extraction
2. Twitter data can offer interesting insights into real-world phenomena
Monday, November 12, 12
5. 3 Tasks
1. Detecting events
In which minutes did events occur?
2. Classifying events
Is the event a goal, card or substitution?
3. Assigning events to teams
Is the event for the home team or away team?
6. 5 types of events
- Goal
- Own Goal
- Red Card
- Yellow Card
- Substitution
7. Methodology
1. Gathering the data
2. Exploring and cleaning the data
3. Classifying interesting data points
8. Gathering data
- Collect all tweets with game hashtags
#ajafey #nacgro #psvutr
- Collect official data for each match
Goals, cards, substitutions
9. Our data
6 months
61 games
661 events
10,643 tweets
10. Three Experiments
1. Detecting events
2. Classifying events
3. Assigning events to teams
13. 1. Experimental Setup
- Goal: detect peaks in the tweets-per-minute signal to extract events
- Setup: test three peak detection methods:
1. LocMaxNoBaseLineCorr
2. IntThresNoBaseLineCorr
3. IntThresWithBaseLineCorr
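The slides do not spell out how the three methods work, so the following is only a minimal sketch of what the names suggest: picking local maxima, applying an intensity threshold, and applying the same threshold after baseline correction. The function names, the rolling-median baseline, the window size, and the sample counts are all assumptions made for illustration, not the method used in the study.

```python
from statistics import median

def local_maxima(counts):
    """Minutes whose tweet count is strictly higher than both neighbours."""
    return [i for i in range(1, len(counts) - 1)
            if counts[i - 1] < counts[i] > counts[i + 1]]

def rolling_baseline(counts, window=5):
    """Rolling median as a crude baseline (assumed correction method)."""
    half = window // 2
    return [median(counts[max(0, i - half): i + half + 1])
            for i in range(len(counts))]

def threshold_peaks(counts, threshold, correct_baseline=False):
    """Minutes whose (optionally baseline-corrected) count exceeds threshold."""
    if correct_baseline:
        base = rolling_baseline(counts)
        counts = [c - b for c, b in zip(counts, base)]
    return [i for i, c in enumerate(counts) if c > threshold]

# Made-up tweets-per-minute signal, for illustration only.
tweets_per_minute = [3, 4, 3, 20, 6, 4, 5, 4, 12, 5, 4]

print(local_maxima(tweets_per_minute))
print(threshold_peaks(tweets_per_minute, threshold=10))
print(threshold_peaks(tweets_per_minute, threshold=10, correct_baseline=True))
```

On this toy signal the local-maximum rule also flags small bumps, while the threshold rules keep only the larger spikes; which trade-off suits real match data is an empirical question.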
15. 1. Findings
- Goals and red cards are detected better than yellow cards and substitutions
- None of the three peak selection methods works well
- Highlights can be extracted, but not precisely enough
16. Three Experiments
1. Detecting events
2. Classifying events
3. Assigning events to teams
23. 2. Discussion
- Top50GainRatio best feature selection
- libSVM best classifier
- EventMinutes results:
Class         F-measure
OVERALL       0.822
Goal          0.841
Own goal      0.000
Red card      0.848
Yellow card   0.785
Substitution  0.839
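For readers unfamiliar with the metric, here is a minimal sketch of how a per-class F-measure can be computed, assuming the balanced F1 variant (the slides do not say which variant was used). The function name and the counts in the usage lines are hypothetical.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, not taken from the study.
print(round(f_measure(true_pos=8, false_pos=2, false_neg=2), 3))
print(f_measure(true_pos=0, false_pos=3, false_neg=5))
```

A class for which no instance is predicted correctly scores 0 under this definition, which is one way a value such as 0.000 can arise for a rare class.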
24. Three Experiments
1. Detecting events
2. Classifying events
3. Assigning events to teams
25. 3. Experimental Setup
- Goal: assign events to teams
- Based on the ratio between tweets from fans of the home team and fans of the away team
- But first: extract fans
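The ratio idea above can be sketched as follows. The slides only say that the assignment is based on the ratio of tweets from home and away fans; the decision rule (compare the home share around the event with a reference share), the function name, and the `reference_home_share` parameter are assumptions for illustration.

```python
def assign_team(home_fan_tweets, away_fan_tweets, reference_home_share=0.5):
    """Guess which team an event belongs to from fan tweet counts.

    Assumed rule: if home fans produce a larger share of the tweets around
    the event than the reference share, assign the event to the home team.
    """
    total = home_fan_tweets + away_fan_tweets
    if total == 0:
        return "unknown"  # no fan tweets in the event window
    home_share = home_fan_tweets / total
    return "home" if home_share > reference_home_share else "away"

# Hypothetical counts of fan tweets in the minutes around an event.
print(assign_team(30, 10))
print(assign_team(5, 20))
```

The reference share could be set per match from the overall fan tweet volume, so that a team with many more fans on Twitter does not win every assignment by default.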
26. 3. Extracting fans
- Hypothesis:
People who tweet for the same team each week are probably fans of that team
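One way to turn this hypothesis into code is sketched below. It assumes that a game hashtag is two three-letter team codes joined together (as `#ajafey` suggests) and that a fan is a user whose tweeted games across at least `min_weeks` weeks share exactly one team. The function name, the `min_weeks` parameter, and the sample tweets are hypothetical; the slides do not describe the actual procedure.

```python
from collections import defaultdict

def extract_fans(tweets, min_weeks=2):
    """Map each likely fan to a team code.

    tweets: iterable of (user, week, hashtag) tuples, where the hashtag is
    assumed to be two three-letter team codes, e.g. "#ajafey".
    """
    teams_by_user_week = defaultdict(dict)  # user -> {week: set of team codes}
    for user, week, hashtag in tweets:
        tag = hashtag.lstrip("#").lower()
        teams = {tag[:3], tag[3:6]}
        teams_by_user_week[user].setdefault(week, set()).update(teams)

    fans = {}
    for user, weeks in teams_by_user_week.items():
        if len(weeks) < min_weeks:
            continue  # too few weeks to say anything
        common = set.intersection(*weeks.values())
        if len(common) == 1:  # exactly one team appears every week
            fans[user] = next(iter(common))
    return fans

# Made-up sample tweets, for illustration only.
sample = [
    ("ann", 1, "#ajafey"), ("ann", 2, "#psvaja"),
    ("bob", 1, "#ajafey"),
    ("cas", 1, "#nacgro"), ("cas", 2, "#ajafey"),
]
print(extract_fans(sample))
```

Here only `ann` qualifies: `bob` tweeted in a single week, and the games `cas` tweeted about have no team in common.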
27. 3. Extracting fans
- Extracted 38,527 fans from 146,326 users (26%)
- This method of extracting fans works well:
Right team  Not clear  Wrong team
88%         10%        2%
29. 3. Results
- Assigning events to teams performs at or above the baseline:
Class         Baseline  Performance
OVERALL       52%       58%
Goal          58%       69%
Red card      50%       62%
Yellow card   63%       63%
Substitution  52%       57%
30. Conclusion
1. Detecting events
=> difficult
2. Classifying events
=> good results
3. Assigning events to teams
=> promising results
31. Future Work
- Use sentiment in tweets
(for detecting events and assigning events to teams)
- Player detection
- Other sports