Multimedia Lab @ Ghent University - iMinds - Organizational Overview & Outline Research Activities
1. ELIS – Multimedia Lab
Research Seminar
KAIST, 1 August 2014
Wesley De Neve
@wmdeneve
Ghent University – iMinds & KAIST
2. Outline
• Organizational overview (15 minutes)
- Ghent University
- iMinds
- Multimedia Lab
• Outline research activities (45 minutes)
- social media analysis
- visual content understanding
- deep machine learning
4. Ghent University (1/3)
• A Dutch-speaking public university
- located in Ghent, Belgium
- established in 1817
(Map: Ghent and Brussels, Belgium)
5. Ghent University (2/3)
• Has 38,000 students and 8,000 staff members
- including about 4,000 foreign students and 800 foreign staff members
• Consists of eleven faculties, composed of more than 130 departments
- campus buildings distributed all over the city
(Photos: Congress Center ‘Het Pand’, Faculty of Engineering and Architecture, Aula Academica)
6. Ghent University (3/3)
• Ghent University Global Campus in Songdo
- offers academic programs in molecular biotechnology, environmental technology, and food technology
- operates together with the State University of New York (SUNY), George Mason University, and the University of Utah
(Photos: Songdo Global University Campus; visit to Samsung Biologics)
7. Outline
• Organizational overview
- Ghent University
- iMinds
- Multimedia Lab
• Outline research activities
- social media analysis
- visual content understanding
- deep machine learning
8. iMinds
Research institute founded in 2004 by the Flemish
government, with the aim of creating lasting
economic and social value through ICT innovation
9. iMinds: A Virtual Research Institute
Leverages the strengths of 5 universities,
20 research groups, and more than 850 researchers
10. iMinds’ Research Departments
• Application sectors: ICT, Media, Health, Energy, Smart Cities, Manufacturing
• Research departments
- Internet Technologies
- Multimedia Technologies
- Security
- Medical Information Technologies
- Digital Society
11. From Idea to Business: The iMinds Innovation Toolbox
Time-to-market: 5+ years … 1 year
• Strategic research
- knowledge-driven
- explorative
- basics for applied research
• Applied research
- business-driven
- interdisciplinary
- cooperative
- demand-driven
- ICON projects
• Pre-competitive testing
- proof of concept
- large-scale user trials & living labs
- evaluate technical feasibility
- simulations
• Incubation & entrepreneurship
- training & coaching
- financing
- facilities
- networking
- internationalization
12. iMinds ICON: Example Projects
• iRead+ – The intelligent reading companion
- January 2012 to December 2013
- finished project that built a text analysis
pipeline for enriching digital news articles
in Dutch and French with links to Wikipedia,
dictionary definitions, and images
• GiPA – Generic platform for augmented reality
- January 2014 to December 2015
- aims at building an interoperable platform
for augmented reality applications, ranging
from games to simulations, addressing diverse
requirements, from capturing to rendering
13. Outline
• Organizational overview
- Ghent University
- iMinds
- Multimedia Lab
• Outline research activities
- social media analysis
- visual content understanding
- deep machine learning
14. People (Speech Lab excluded)
• Staff
- Rik Van de Walle – senior full professor, head of MMLab
- Peter Lambert – associate professor
- Piet Verhoeve – guest lecturer (ICON program manager at iMinds)
- Erik Mannens, Jan De Cock & Wesley De Neve – research management
- Ellen Lammens & Laura Smekens – administrative management
• 35 researchers
- 50% PhD students
• Miscellaneous
- about 15 master’s thesis students per year
- a few summer internships each year
15. Research Activities (1/2)
• Cluster 1: Video Coding (Jan De Cock)
- compression and transport of video
- transcoding and scalable coding
- high-dynamic range video
• Cluster 2: Game Tech & Graphics (Peter Lambert)
- augmented and virtual reality
- texture and mesh compression
- path planning
16. Research Activities (2/2)
• Cluster 3: Semantic Web (SWTF; Erik Mannens)
- multimedia and interactivity on the Web
- knowledge representation and reasoning
- (big) data analytics and visualization
- digital publishing
• Cluster 4: Social & Visual Intelligence (SaVI; Wesley De Neve)
- social media analysis
- visual content analysis
- machine learning
17. Teaching Activities
• Bachelor/Master Computer Science and Bachelor/Master Electronics
(Faculty of Engineering and Architecture)
- Multimedia Techniques
- Design of Multimedia Applications
- Advanced Multimedia Applications
• Bachelor Informatics
(Faculty of Sciences)
- Multimedia
- Internet Technology
• Bachelor Biotechnology
(Songdo Global Campus)
- Structured Programming
• New graduate course on Big Data Analytics (pending approval)
18. Standardization Activities
• W3C (World Wide Web Consortium)
- new Web techniques (e.g., HTML5 and Media Annotations)
• MPEG (Moving Picture Experts Group)
- new compression techniques (e.g., H.264/AVC and 3-D Video Coding)
- new storage and transport techniques (e.g., MP4 file format and MPEG-DASH)
• VQEG (Video Quality Experts Group)
- measurement of video quality (e.g., subjective quality evaluations)
19. Outline
• Organizational overview
- Ghent University
- iMinds
- Multimedia Lab
• Outline research activities
- social media analysis
- visual content understanding
- deep machine learning
20. Twitter
• An online social network service that enables users to send and read short text messages of up to 140 characters, called "tweets" or "microposts"
(Annotated example tweet)
- Tweet or micropost
- Hashtag (starts with #)
- Mention (starts with @)
- Favorite (like or bookmark)
- Retweet (sharing)
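The hashtag and mention conventions above can be illustrated with a small sketch. The regular expressions are a simplification chosen for illustration (Twitter's actual entity rules are more involved), and the example tweet is made up.

```python
import re

# Simplified, hypothetical patterns: "#" or "@" followed by word characters,
# not preceded by a word character (so e-mail addresses are skipped).
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")
MAX_TWEET_LENGTH = 140  # character limit stated on the slide


def parse_tweet(text):
    """Return the hashtags and mentions found in a tweet."""
    if len(text) > MAX_TWEET_LENGTH:
        raise ValueError("tweet exceeds 140 characters")
    return {
        "hashtags": HASHTAG_RE.findall(text),
        "mentions": MENTION_RE.findall(text),
    }


# Made-up example tweet.
example = parse_tweet("Attending a seminar by @wmdeneve at #KAIST on #deeplearning")
print(example)
```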
21. Famous Tweets
Note the presence of both textual and (embedded) visual information!
22. Twitter Statistics
• Usage in general
- 271 million monthly active users
- 500 million tweets are sent per day
- 78% of active users are on mobile
- expected revenue for 2014 is $1.33 billion (mobile advertising + data licensing)
• Usage during the World Cup 2014
- fans sent 672 million related tweets in total
- during the semi-final between Brazil and Germany, fans sent more than 35.6 million tweets
- during the final, the number of tweets sent by fans peaked at 618,725 Tweets Per Minute (TPM)
23. Twitter Research Goal and Challenges
• Research goal
- to make sense of the vast amounts of textual and visual information
communicated on Twitter by means of machine learning
• Challenges
- microposts are noisy in nature
- microposts are short-form in nature
- microposts are multi-lingual in nature
- microposts come in highly varying quantities
- microposts are real-time in nature
- microposts are multi-modal in nature (textual & visual, among others)
24. Deep Learning (1/4)
• What?
- simply speaking: use of multi-layered neural networks that are able to learn complicated mappings between inputs and outputs
- input x → learned intermediate features → output y = h_θ(x)
- deep learning = (hierarchical) representation learning
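As a minimal sketch of the mapping y = h_θ(x), the code below pushes an input through a small multi-layered network and keeps the intermediate features of each hidden layer. The layer sizes, the tanh activation, the linear output layer, and the random initialization are assumptions made for illustration; they are not taken from the slides.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create a weight matrix (n_out rows of n_in weights) and a zero bias vector."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Compute y = h_theta(x) and return it with the hidden-layer features."""
    features = []
    activation = x
    for index, (weights, biases) in enumerate(layers):
        pre = [sum(w * a for w, a in zip(row, activation)) + b
               for row, b in zip(weights, biases)]
        is_last = index == len(layers) - 1
        # Hidden layers use tanh; the output layer is left linear (an assumption).
        activation = pre if is_last else [math.tanh(v) for v in pre]
        if not is_last:
            features.append(activation)
    return activation, features


rng = random.Random(0)
sizes = [4, 8, 6, 3]  # toy layer sizes chosen for illustration
theta = [init_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
y, hidden = forward([0.5, -1.0, 0.25, 2.0], theta)
print(len(y), [len(h) for h in hidden])
```

Each entry of `hidden` is one level of the learned representation; training (not shown) would adjust `theta` so that these features become useful for the task.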
25. Deep Learning (2/4)
• Example learned features
- supervised handwritten digit recognition
- unsupervised visual object recognition (Google Brain)
26. Deep Learning (3/4)
• Why the resurgence of neural networks?
- availability of large data sets (cf. social media & Internet of Things)
- availability of cheap computing power (cf. GPU & cloud)
- availability of algorithmic improvements (cf. dropout & max pooling)
• Current achievements
- top performance in handwritten digit recognition
- top performance in automatic speech recognition
- top performance in large-scale visual concept detection
• Attracts substantial private R&D investments
- Google (Geoffrey Hinton & Ray Kurzweil), Facebook (Yann LeCun),
Baidu (Andrew Ng & Kai Yu), Microsoft, Twitter, Netflix, and so on
27. Deep Learning (4/4)
• Plenty of open research challenges
- how to tailor deep neural networks to novel applications?
- how to scale up deep neural networks?
- how to scale down neural networks at no cost in effectiveness?
- how to take advantage of massively parallel hardware?
- how to develop effective hybrid architectures?
- how to take into account long-term temporal dependencies?
- how to implement multi-modal approaches?
- how to establish solid theoretical foundations?
- how to bridge the gap between deep learning and strong A.I.?
28. Ongoing Research Topics with a Twitter Focus
• Hashtag recommendation
• Named entity recognition and disambiguation
• Sports analytics
• Social television
• Vine video classification
29. Social and Visual Intelligence (SaVI)
- Abhineshwar Tomar (abhineshwar.tomar@ugent.be)
- Fréderic Godin (frederic.godin@ugent.be)
- Baptist Vandersmissen (baptist.vandersmissen@ugent.be)
- Wesley De Neve (wesley.deneve@ugent.be)
- Azarakhsh Jalalvand (azarakhsh.jalalvand@ugent.be)
+ 3 master’s thesis students
30. Research Topics with a Twitter Focus
• Hashtag recommendation
• Named entity recognition and disambiguation
• Social television
• Sports analytics
• Vine video classification
31. Hashtags on Twitter
• Hashtag usage
- topic-based indexing & search (e.g., #socialnetwork and #Reddit)
- conversational/event clustering (e.g., #www2014)
• Observation: only about 10% of tweets contain a hashtag
• Research challenge: develop techniques for Twitter hashtag recommendation
32. Twitter Hashtag Recommendation Using Deep Learning (1/2)
• Training: learning the relation between tweets and hashtags
- input: tweet → word2vec → 300-D tweet vector
- target: hashtag → word2vec → 300-D hashtag vector
- model: deep feed-forward neural network
- layers: 300-D input layer → 1000-D hidden layer → 500-D hidden layer → 400-D hidden layer → 300-D output layer
- example pair: tweet "Elizabeth Warren Taking on Hillary as New Democratic Powerhouse" with hashtag #politics
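A minimal sketch of how one training pair could be prepared. The slide does not state how the 300-D tweet vector is built from word vectors, so averaging is an assumption here, as is the squared-error training signal; the 3-D toy embeddings are made up and stand in for real word2vec vectors.

```python
# Layer sizes as listed on the slide (input, three hidden layers, output).
SLIDE_LAYER_SIZES = [300, 1000, 500, 400, 300]


def average_vectors(words, embeddings):
    """Average the vectors of the known words (an assumed way to build a tweet vector)."""
    known = [embeddings[w] for w in words if w in embeddings]
    if not known:
        raise ValueError("no known words in tweet")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def squared_error(predicted, target):
    """Assumed training signal: squared distance between predicted and true hashtag vector."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


# Made-up 3-D embeddings standing in for the 300-D word2vec vectors.
toy_embeddings = {
    "warren": [0.9, 0.1, 0.0],
    "hillary": [0.7, 0.3, 0.0],
    "democratic": [0.8, 0.0, 0.2],
    "politics": [0.8, 0.2, 0.1],
}

tweet_words = "elizabeth warren taking on hillary as new democratic powerhouse".split()
tweet_vector = average_vectors(tweet_words, toy_embeddings)  # network input
hashtag_vector = toy_embeddings["politics"]                  # network target

# Loss that would result if the network passed its input through unchanged.
loss = squared_error(tweet_vector, hashtag_vector)
print([round(v, 3) for v in tweet_vector], round(loss, 4))
```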
33. Twitter Hashtag Recommendation Using Deep Learning (2/2)
• Testing: recommending hashtags to tweets
- tweet → word2vec → 300-D tweet vector → deep feed-forward neural network → 300-D hashtag vector → vec2word → hashtags
- layers: 300-D input layer → 1000-D hidden layer → 500-D hidden layer → 400-D hidden layer → 300-D output layer
- example: tweet "House Democrats suggest Obama impeachment is imminent to raise cash" yields the hashtags #politics and #crisis
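The vec2word step maps the predicted 300-D hashtag vector back to actual hashtags. A minimal sketch, assuming this is done by ranking candidate hashtags by cosine similarity (the slides do not name the similarity measure); the vocabulary and the predicted vector are made-up 3-D toy values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def vec2word(vector, vocabulary, top_k=2):
    """Return the top_k vocabulary entries closest to the given vector."""
    ranked = sorted(
        vocabulary,
        key=lambda word: cosine_similarity(vector, vocabulary[word]),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up hashtag vectors and a made-up network output.
hashtag_vocabulary = {
    "politics": [0.9, 0.1, 0.0],
    "crisis": [0.7, 0.6, 0.1],
    "sports": [0.0, 0.2, 0.9],
}
predicted = [0.8, 0.3, 0.05]
recommended = vec2word(predicted, hashtag_vocabulary)
print(["#" + tag for tag in recommended])
```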
34. word2vec
• Developed by Google Research
• Computes vector representations for words
- through the use of neural network technology
- pre-trained model: trained on part of the Google News dataset (about 100 billion words), containing vectors for 3 million words and phrases
- the vectors capture the semantic meaning of a word
• Example word vector properties
- vector('Paris') - vector('France') + vector('Italy') ≈ vector('Rome')
- vector('king') - vector('man') + vector('woman') ≈ vector('queen')
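The vector arithmetic above can be tried out on hand-made toy vectors. The 5-D vectors below are constructed for illustration so that the analogies hold exactly; real word2vec vectors are learned from text and only satisfy such relations approximately.

```python
import math


def cosine(a, b):
    """Cosine similarity (0.0 if either vector is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def analogy(a, b, c, vectors):
    """Return the word closest to vector(a) - vector(b) + vector(c), excluding a, b, c."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(target, vectors[word]))


# Hand-made dimensions: [royalty, gender, capital, France, Italy].
toy_vectors = {
    "king":   [1.0,  1.0, 0.0, 0.0, 0.0],
    "queen":  [1.0, -1.0, 0.0, 0.0, 0.0],
    "man":    [0.0,  1.0, 0.0, 0.0, 0.0],
    "woman":  [0.0, -1.0, 0.0, 0.0, 0.0],
    "Paris":  [0.0,  0.0, 1.0, 1.0, 0.0],
    "France": [0.0,  0.0, 0.0, 1.0, 0.0],
    "Italy":  [0.0,  0.0, 0.0, 0.0, 1.0],
    "Rome":   [0.0,  0.0, 1.0, 0.0, 1.0],
}

print(analogy("Paris", "France", "Italy", toy_vectors))
print(analogy("king", "man", "woman", toy_vectors))
```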
35. Experimental Results
No. | Tweet | Recommended hashtags
1 | Someone dm/text me bc I’m so bored | madd, Oh noes, rainnwilson, sooooooo, fricken
2 | The good life is one inspired by love and guided by knowledge. | Ahh yes, FIVE THINGS About, YANKEES TALK, Kinder gentler, Ya gotta love
3 | Method of Losing Weight http://t.co/rs64CEuo5W | Shape Shifting, Treat Acne, Detect Cancer, Warps, Calorie Burn
4 | I hate today cause its room cleaning day for me!!! | FAN ’S ATTIC, Puh leez, Mopping robot, % #F######## 3v.jsn, Interest EURO JAP
5 | SPELLS AND SPELL-CASTING:ENCYCLOPEDIA OF 5000 SPELLS ( JUDIKA ILLES ):BLACKSMITH’S WATER HEALING SPELL: A... http://t.co/k0TfrqJFQW | DEBUTS NEW, NOW AVAILABLE FOR, TO PUBLISH, DESIGNED TO, IS READY TO
36. Research Topics with a Twitter Focus
• Hashtag recommendation
• Named entity recognition and disambiguation
• Sports analytics
• Social television
• Vine video classification
37. Named Entity Recognition and Disambiguation
• Named entity
- person
- location
- organization
- miscellaneous (film/movie, entertainment award event, political event, programming language, sporting event, and TV show)
• Recognition
- identification of a named entity in a given text
• Disambiguation
- determining which entity a mention refers to, e.g., fruit ‘apple’ versus company ‘Apple’
38. Research Challenge
• Tools for named entity recognition and disambiguation have thus far
been developed for long-form news articles using formal language
• Need for development of tools for named entity recognition and
disambiguation for short-form microposts using informal language
39. Natural Language Processing (NLP) for Twitter from Scratch
• Pipeline: Tweet Tokenization → Part-of-Speech Tagging (PoS) → Chunking → Named Entity Recognition and Disambiguation
• PoS tagging example: "Tom likes Sprite." → pronoun, verb, noun
• Applications
- Information Retrieval
- Text-to-Speech
- Artificial Intelligence (cf. Siri, Cortana, Google Now)
- General Text Parsing
40. Our Approach: Twitter PoS using Deep Learning
• Use of a feed-forward neural network for learning the mapping between a collection of word vector representations and a PoS tag
- feature learning and not feature engineering
• Use of word vector representations derived from Twitter
- not from Google News
(Diagram: Word 1, Word 2, Word 3 → Lookup → three word vectors → Neural network → PoS tag of word 2)
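The lookup-and-classify idea can be sketched as follows: the vectors of a word and its two neighbours are concatenated and fed to a classifier that outputs the tag of the middle word. Everything concrete here is an assumption for illustration: the 2-D toy embeddings, the two-tag tag set, the zero-vector padding, and the hand-picked weights of a single linear layer with softmax, which stands in for the trained feed-forward network.

```python
import math


def concat_window(tokens, position, embeddings, unknown):
    """Concatenate the vectors of the previous, current, and next word."""
    window = []
    for offset in (-1, 0, 1):
        index = position + offset
        if 0 <= index < len(tokens):
            window.extend(embeddings.get(tokens[index], unknown))
        else:
            window.extend(unknown)  # pad at sentence boundaries
    return window


def softmax(scores):
    """Turn raw scores into probabilities that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_tag(features, weights, tags):
    """Score each tag with a linear layer and return the most probable one."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    probs = softmax(scores)
    best = max(range(len(tags)), key=lambda i: probs[i])
    return tags[best], probs


# Made-up 2-D embeddings and hand-picked (not learned) weights.
toy_embeddings = {"tom": [1.0, 0.0], "likes": [0.0, 1.0], "sprite": [1.0, 0.0]}
UNKNOWN = [0.0, 0.0]
TAGS = ["noun", "verb"]
WEIGHTS = [
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],  # "noun" looks at dimension 0 of the middle word
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # "verb" looks at dimension 1 of the middle word
]

tokens = ["tom", "likes", "sprite"]
predicted_tags = [
    predict_tag(concat_window(tokens, i, toy_embeddings, UNKNOWN), WEIGHTS, TAGS)[0]
    for i in range(len(tokens))
]
print(list(zip(tokens, predicted_tags)))
```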
44. Research Topics with a Twitter Focus
• Hashtag recommendation
• Named entity recognition
• Sports analytics
• Social television
• Video classification
45. Rationale
• What?
- prediction of the outcome of football matches in the English Premier League (EPL), using both traditional statistics and Twitter microposts
• Why?
- betting on football is a billion-dollar industry
- Twitter is highly popular for real-time coverage of sports events
• How?
- fusion of the output of four simple methods, using different features and machine learning techniques
46. Approach
• Method 1: Statistical features
- ranking in the league, the number of points gathered in the league, the number of points gathered during the last five games, the number of goals scored, and the number of goals conceded
• Method 2: Twitter volume changes
• Method 3: Twitter sentiment analysis
• Method 4: Twitter user predictions
• Social features (Methods 2 to 4) derived from more than 50 million tweets
• Machine learning
- Naive Bayes, Logistic Regression, and SVM
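The outputs of several methods can be combined in different ways. The sketch below shows three generic combination schemes: majority voting over predicted labels, late fusion by averaging per-method probabilities, and early fusion by concatenating feature vectors before a single classifier is trained. These are textbook definitions used as an assumption; the exact fusion procedure of the study is not given on the slides, and all numbers are made up.

```python
from collections import Counter

OUTCOMES = ("home win", "draw", "away win")


def majority_vote(predictions):
    """Return the most frequent predicted label (ties: first label seen wins)."""
    return Counter(predictions).most_common(1)[0][0]


def late_fusion(probabilities):
    """Average the per-method probability distributions and pick the best outcome."""
    count = len(probabilities)
    fused = [sum(p[i] for p in probabilities) / count for i in range(len(OUTCOMES))]
    return OUTCOMES[fused.index(max(fused))], fused


def early_fusion(statistical_features, twitter_features):
    """Concatenate feature vectors so that one classifier sees both sources."""
    return list(statistical_features) + list(twitter_features)


# Made-up outputs of two methods for one match.
stats_probs = [0.5, 0.3, 0.2]    # from statistical features
twitter_probs = [0.3, 0.2, 0.5]  # from Twitter user predictions

outcome, fused = late_fusion([stats_probs, twitter_probs])
voted = majority_vote(["home win", "draw", "home win"])
combined = early_fusion([3, 54, 10], [0.62, 0.38])
print(outcome, voted, len(combined))
```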
47. Experimental Results (1/2)
Method | Accuracy
Baseline methods
Naive predictions | 51%
Expert predictions | 60%
Bookmaker predictions | 67%
Individual methods
Statistical features | 64%
Twitter volume changes | 50%
Twitter sentiment analysis | 52%
Twitter user predictions | 63%
Combination of statistical features and Twitter user predictions
Majority voting | 64%
Early fusion | 68%
Late fusion | 66%
49. Research Topics with a Twitter Focus
• Hashtag recommendation
• Named entity recognition
• Sports analytics
• Social television
• Video classification
50. Rationale (1/2)
• Social television (second screen)
- interaction between televised content and online social networks
- Breaking Bad finale: peak of 22,373 TPM
- Super Bowl 2014: peak of 382,000 TPM
- World Cup 2014 final: peak of 618,725 TPM
51. Rationale (2/2)
• Challenges
- how to measure engagement and reach on online social networks? (cf. the Nielsen television ratings)
- how to profile your audience? (e.g., age, gender, and location)
• Addressing these challenges is important for the allocation of advertisement budgets and targeted advertisement strategies
52. Measurement of Engagement and Reach in Flanders
• Three major difficulties
- privacy concerns
- low usage of Twitter (at that time)
- identification of Flemish users of Twitter
53. Twitter User Profiling: Gender Detection (1/3)
• What?
- classification of Flemish Twitter users into male and female classes
• Why?
- current user profiles do not contain gender information
- gender information is important for targeted advertising
• How?
- through (mostly n-gram) features extracted from the profile of the
user, the tweets of the user, and the social network of the user
- through machine learning based on Naive Bayes and SVM
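To make the n-gram features concrete, the sketch below counts character n-grams (useful for short fields such as usernames) and word n-grams (useful for descriptions and tweets). Which n-gram types and sizes were actually used is not stated on the slides, so the choices here are assumptions.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams, with ^ and $ marking the start and end of the text."""
    padded = "^" + text.lower() + "$"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def word_ngrams(text, n=2):
    """Count word n-grams after lowercasing and whitespace tokenization."""
    words = text.lower().split()
    return Counter(" ".join(words[i:i + n]) for i in range(len(words) - n + 1))


username_features = char_ngrams("wmdeneve")
description_features = word_ngrams("social media analysis")
print(sorted(username_features))
print(sorted(description_features))
```

The resulting counts can serve as the input features of a Naive Bayes or SVM classifier.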
54. Twitter User Profiling: Gender Detection (2/3)
• Ensemble of six classifiers, combined through averaging of probabilities, with male or female as output
- Username Classifier (e.g., @wmdeneve)
- Name Classifier (e.g., Wesley De Neve)
- Description Classifier (e.g., Senior Researcher at Ghent University - iMinds & KAIST. Interested in social media analysis, visual content understanding and machine learning.)
- Tweet Content Classifier (e.g., Attending "The Future of Metadata" at CONTEC. #TISP)
- Tweet Style Classifier (URL usage, emoticon usage, and punctuation)
- Friend Description Classifier (e.g., Sports fan, basketball player, outdoor lover and a Ph.D. researcher #SocialTV and Natural Language Processing (#NLP) @iMinds - @UGent)
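Averaging of probabilities can be sketched as follows. Each classifier is assumed to output a probability that the user is female; skipping classifiers without an output (for instance, when a profile has no description) and the 0.5 decision threshold are assumptions made for illustration, and the probabilities are made up.

```python
def ensemble_gender(female_probabilities):
    """Average P(female) over the classifiers that produced an output."""
    available = [p for p in female_probabilities.values() if p is not None]
    if not available:
        raise ValueError("no classifier produced an output")
    p_female = sum(available) / len(available)
    label = "female" if p_female >= 0.5 else "male"
    return label, p_female


# Made-up classifier outputs for one user; None marks a missing input.
outputs = {
    "username": 0.40,
    "name": 0.10,
    "description": None,
    "tweet content": 0.30,
    "tweet style": 0.55,
    "friend description": 0.35,
}
label, p_female = ensemble_gender(outputs)
print(label, round(p_female, 2))
```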
55. Twitter User Profiling: Gender Detection (3/3)
Classifier | Accuracy
Username | 78.86%
Name | 87.54%
Description | 65.74%
Tweet content | 75.36%
Tweet style | 66.34%
Friend description | 75.34%

Test set | TweetGenie | Ensemble
Test set 2 | 82.15% | 91.89%
Test set 3 | 86.44% | 93.32%
56. Research Topics with a Twitter Focus
• Hashtag recommendation
• Named entity recognition
• Social television
• Sports analytics
• Vine video classification
57. What is Vine? (1/4)
• Platform for social & mobile video
- established in June 2012
• Allows creating & distributing videos of up to 6 seconds
- maximum video length resembles Twitter’s character limitation
• Acquired by Twitter in October 2012
- currently has more than 40 million users
• Has the potential to become a new social news platform
- cf. Ninja News in Belgium
61. Automatic Understanding of Social Video Content (1/2)
• Recognition of general concepts in video fragments
- categorize short and noisy video fragments
• Localize and recognize named entities in video fragments
• Localize and recognize products in video fragments
(Diagram: combined inputs → Neural network → Output)
62. Automatic Understanding of Social Video Content (2/2)
• Representation learning for social video
- learn general noise-robust features
• Exploitation of temporal information in video to improve classification
- investigate recurrent neural networks and reservoir computing networks
63. Future Research Vision SaVI & SWTF
• Cognitive computing? Strong A.I.? Technological singularity ;-)?
(Diagram: application domains and technology stacks)
- Data (online social networks & Internet of Things)
- Deep learning: natural language understanding and visual content understanding
- Semantic Web
- Machine-understandable information
- Visualization
- Human & machine action
65. References
[1] F. Godin, B. Vandersmissen, A. Jalalvand, W. De Neve, and R. Van de Walle, “Alleviating manual feature engineering for Part-of-Speech tagging of Twitter microposts using distributed word representations,” Proceedings of the NIPS Workshop on Modern Machine Learning Methods and Natural Language Processing, Dec. 2014.
[2] A. Tomar, F. Godin, B. Vandersmissen, W. De Neve, and R. Van de Walle, “Towards Twitter hashtag recommendation using distributed word representations and a deep feed forward neural network,” Proceedings of the IEEE International Workshop on Cyber-Physical Systems and Social Computing (CSSC-2014), Sep. 2014.
[3] F. Godin, J. Zuallaert, B. Vandersmissen, W. De Neve, and R. Van de Walle, “Beating the bookmakers: leveraging statistics and Twitter microposts for predicting soccer results,” Proceedings of the 2014 KDD Workshop on Large-Scale Sports Analytics, Aug. 2014.
[4] B. Vandersmissen, F. Godin, A. Tomar, W. De Neve, and R. Van de Walle, “The rise of mobile and social short-form video: an in-depth measurement study of Vine,” Proceedings of SoMuS 2014: Workshop on Social Multimedia and Storytelling (co-located with ICMR 2014), Apr. 2014.