Revealing the Hidden Patterns of News Photos: Analysis of Millions of News Photos through GDELT and Deep Learning-based Vision APIs
Haewoon Kwak Jisun An
Qatar Computing Research Institute
Hamad Bin Khalifa University
Deep learning enables us to study
news photos at large scale
Goal of This Work
● To offer a general understanding of
○ What is shown in the photos?
○ How are people portrayed?
■ From the perspective of emotion
■ From the perspective of gender
● Case study: Portrayal of politicians
● We can crawl photos from news
websites and analyze them
● But setting up a deep learning
framework and training it take time and effort
GDELT Visual GKG (VGKG)
● Collects news articles from around the world
● Extracts photos from the articles
● Calls the Google Cloud Vision API to analyze them
● VGKG has been available since 1 Jan 2016 http:
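As a rough illustration of the "call the Vision API" step, the sketch below builds a request body for the Cloud Vision REST method `images:annotate`. The choice of features (`LABEL_DETECTION`, `FACE_DETECTION`), the helper name `build_annotate_request`, and the example URL are assumptions made for illustration; the slides do not say which features GDELT requests. No network call is made here, and sending the request would additionally need an API key or OAuth token.

```python
import json

# Public REST endpoint of the Cloud Vision API.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_urls, max_labels=10):
    """Build one images:annotate request body covering several photo URLs.

    The feature list is an assumption for illustration, not GDELT's
    documented configuration.
    """
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": url}},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_labels},
                    {"type": "FACE_DETECTION"},
                ],
            }
            for url in image_urls
        ]
    }


if __name__ == "__main__":
    # Hypothetical photo URL, used only to show the shape of the body.
    body = build_annotate_request(["https://example.com/photo.jpg"])
    print(json.dumps(body, indent=2))
```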
(Potential) Limitations of GDELT
● The list of news sources is not explicitly
announced (and is still growing), so coverage
bias might exist
● Our work comparing GDELT with
another news dataset will be presented
in the poster session
Two Tales of the World: Comparison of Widely Used World News Datasets - GDELT and EventRegistry
Haewoon Kwak and Jisun An
ICWSM'16: The 10th International Conference on Web and Social Media (poster), 2016
Our Dataset - Full
● GKG and VGKG in January 2016
● Popularity measured by Alexa.com
Our Dataset - 7 Popular News Media
● Top 30 & > 1K records
● Keep labels whose confidence score ≥ 0.8
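The confidence filter above can be sketched as follows. The record shape (`description` and `score` keys) mirrors the label annotations returned by the Cloud Vision API; the sample labels and the function name `keep_confident_labels` are hypothetical.

```python
MIN_CONFIDENCE = 0.8  # threshold stated on the slide


def keep_confident_labels(labels, threshold=MIN_CONFIDENCE):
    """Return only the labels whose confidence score is at least `threshold`."""
    return [label for label in labels if label["score"] >= threshold]


if __name__ == "__main__":
    # Hypothetical labels for one photo, for illustration only.
    sample = [
        {"description": "person", "score": 0.93},
        {"description": "crowd", "score": 0.80},
        {"description": "vehicle", "score": 0.42},
    ]
    for label in keep_confident_labels(sample):
        print(label["description"], label["score"])
```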
What Are Shown in the Photos?
Common Objects in News Photos
News Topics and Relevant Photos
● News photos should relate to the topics
of news articles
→ Common objects might be different
● CNN has ‘section’ info. in its URL
● Why does this matter?
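One way to read the section out of a CNN URL is sketched below. It assumes article paths of the form `/YYYY/MM/DD/<section>/...`, which is an assumption about CNN's URL layout rather than something the slides specify; the example URL and the function name `cnn_section` are hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumed layout: /YYYY/MM/DD/<section>/<slug>/...
SECTION_PATTERN = re.compile(r"^/\d{4}/\d{2}/\d{2}/([^/]+)/")


def cnn_section(url):
    """Return the section segment of a CNN article URL, or None if absent."""
    path = urlparse(url).path
    match = SECTION_PATTERN.match(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical article URL, for illustration only.
    print(cnn_section("http://www.cnn.com/2016/01/20/politics/some-story/index.html"))
```

URLs that do not follow the assumed date-prefixed layout return `None`, so they can be counted separately instead of being silently mislabeled.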
Western Media and the Third World
● Golan reports that Western mass media
strengthen the portrayal of the Third
World by reporting war, poverty, famine,
violence, and conflicts, which leads
to a negative perception (Golan 2008).
How Are People Portrayed?
From the Perspective of Emotion
Classification of Emotions
Google API Can Detect 4 Emotions
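The Cloud Vision API reports each of the four emotions on a face as a likelihood level (`joyLikelihood`, `sorrowLikelihood`, `angerLikelihood`, `surpriseLikelihood`) rather than a numeric score. The sketch below shows one possible way to turn those levels into emotion labels. Treating only `LIKELY` and `VERY_LIKELY` as "expressed", and calling everything else neutral, is an assumption made here; the slides do not state the cutoff the authors used.

```python
# Field names as returned in a Cloud Vision face annotation.
EMOTION_FIELDS = {
    "joy": "joyLikelihood",
    "sorrow": "sorrowLikelihood",
    "anger": "angerLikelihood",
    "surprise": "surpriseLikelihood",
}

# Assumed cutoff: only these two levels count as an expressed emotion.
EXPRESSED_LEVELS = {"LIKELY", "VERY_LIKELY"}


def detected_emotions(face):
    """List the emotions whose likelihood level meets the assumed cutoff."""
    return [
        name
        for name, field in EMOTION_FIELDS.items()
        if face.get(field) in EXPRESSED_LEVELS
    ]


def label_face(face):
    """Return the detected emotions, or ['neutral'] when none is detected."""
    emotions = detected_emotions(face)
    return emotions if emotions else ["neutral"]


if __name__ == "__main__":
    # Hypothetical face annotation, for illustration only.
    face = {
        "joyLikelihood": "VERY_LIKELY",
        "sorrowLikelihood": "VERY_UNLIKELY",
        "angerLikelihood": "VERY_UNLIKELY",
        "surpriseLikelihood": "UNLIKELY",
    }
    print(label_face(face))
```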
Neutral (75%) or Joy (24%)
● Among 11,127 faces (in 7 popular media),
2,740 faces (24.6%) show one of the four emotions
● Most of them (2,665 faces) express joy
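The percentages on this slide follow directly from the face counts. The short check below recomputes them, assuming that every face showing none of the four emotions is counted as neutral (the slides imply this but do not state it).

```python
TOTAL_FACES = 11127        # faces in the 7 popular media
FACES_WITH_EMOTION = 2740  # faces showing one of the four emotions
FACES_WITH_JOY = 2665      # faces expressing joy


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


any_emotion_pct = percent(FACES_WITH_EMOTION, TOTAL_FACES)
joy_pct = percent(FACES_WITH_JOY, TOTAL_FACES)
# Assumption: faces without a detected emotion are treated as neutral.
neutral_pct = percent(TOTAL_FACES - FACES_WITH_EMOTION, TOTAL_FACES)

print(f"any emotion: {any_emotion_pct:.1f}%")
print(f"joy:         {joy_pct:.1f}%")
print(f"neutral:     {neutral_pct:.1f}%")
```

Rounded to whole percentages, joy and neutral come out at 24% and 75%, matching the slide title.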
Nonverbal & Verbal Communication
● Happy faces accelerate the cognitive
processing of positive words and slow
down that of negative words (Stenberg,
Wiking, and Dahl 1998)
We Use Microsoft Face API
● Measures smiling intensity (0.0 to 1.0)
● Example scores from the slide photos: 0.998 and 0.0 (baby)
Smile Comes with Positive Text
● Positive correlation between smile
intensity and tone (sentiment) of the text
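A correlation like the one above could be measured with a Pearson coefficient, sketched below in plain Python. The slides do not say which correlation measure the authors used, so Pearson is an assumption here, and the smile and tone values in the usage example are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical (smile intensity, article tone) pairs, not real data.
    smiles = [0.0, 0.1, 0.4, 0.8, 1.0]
    tones = [-3.2, -1.0, 0.5, 1.8, 2.4]
    print(f"r = {pearson_r(smiles, tones):.3f}")
```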
How Are People Portrayed?
From the Perspective of Gender
Previous Studies on News Media
1. Men outnumber women
2. Men and women are associated with
3. More women than men were depicted
as happy and calm.
→ We’ll verify this at large scale
Again, We Use Microsoft Face API
● Measures Gender and Age
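Given per-face attributes, the gender comparisons can be summarized with a small aggregation like the one below. The record shape (`faceAttributes` with `gender`, `age`, and `smile`) follows the attribute names the Microsoft Face API returned for detected faces; the sample values and the function name `summarize_by_gender` are hypothetical, and this is only a sketch of one possible aggregation, not the authors' pipeline.

```python
from collections import defaultdict


def summarize_by_gender(faces):
    """Count faces and average age and smile intensity per gender."""
    groups = defaultdict(list)
    for face in faces:
        attrs = face["faceAttributes"]
        groups[attrs["gender"]].append(attrs)

    summary = {}
    for gender, items in groups.items():
        summary[gender] = {
            "count": len(items),
            "mean_age": sum(a["age"] for a in items) / len(items),
            "mean_smile": sum(a["smile"] for a in items) / len(items),
        }
    return summary


if __name__ == "__main__":
    # Hypothetical detections, for illustration only.
    faces = [
        {"faceAttributes": {"gender": "female", "age": 30.0, "smile": 1.0}},
        {"faceAttributes": {"gender": "male", "age": 50.0, "smile": 0.25}},
        {"faceAttributes": {"gender": "male", "age": 40.0, "smile": 0.75}},
    ]
    for gender, stats in summarize_by_gender(faces).items():
        print(gender, stats)
```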
● What is shown in the news photos
○ People commonly appear (≥ 40.5% @top500)
● How they are portrayed
○ People are neutral (75%) or smiling (24%)
○ Gender representation is unequal
○ Gender role stereotyping is found
○ Women smile more and look younger than men
● Clinton smiles more than Sanders in some media
→ We demonstrate the great potential of deep learning
for computational journalism
Deeper Analysis on Text and Photos
● Headline and photos?
● Topic and photos?
● Keywords and photos?
● Showing the preference of media
outlets toward candidates over time
○ Modeling the complex dimensions of
preference - “Smile” is only one
Full paper is available via