[0211]hwiyeon

SirenLess:
reveal the intention behind news
EuroVis 2020
Xumeng Chen, Leo Yu-Ho Lo, Huamin Qu
Presenter: Hwiyeon Kim

Overview of the paper
• Motivation of the study
 Popularization of journalism (writers with ulterior intentions)
• authority & credibility of news articles?
 Can be difficult for untrained readers to identify those latent intentions
 Some machine learning approaches provide limited help for humans in
decision making
• Goal of the study
 Present a visual analytical system for misleading news detection by
linguistic features
• Analyze 18 news articles from different sources
• Summarize helpful patterns for misleading news detection
• User study -> confirm the usefulness & effectiveness of the system

Main Topics of this work
• Design & implementation – how & why we made the system
 Design Goals
 System and Data
 Visual Design
• Findings & Evaluation – what we can find new with the system
 Patterns of unreliable news
 With a professor & 9 students

Background Knowledge
• A rise of fake news
 Introduced the nature & impact of fake news (Lazer et al. ‘18)
 Problem: fake news intended to provoke emotions (Bakir et al. ‘18) and extremist behavior (Aisch et al. ‘16)
 Promising directions: rhetorical structure and discourse analysis (Rubin et al. ‘15)
• Suggested solutions by previous works
 Transmission of fake news
• trying to capture the pattern of spread of misinformation & factors contributing to it (Vosoughi et al. ‘18)
 Content analysis
• Flag the malicious intent of the writer
• Automated deception detection with high (80~90%) accuracy (Feng et al. ‘12)
• Deceptive stories are separable from truthful stories in the rhetorical structure feature space
 Equip readers with the ability to evaluate credibility of fake news
• Fact-checking websites (FactCheck.org, PolitiFact, Ad Fontes Media)
• Education (Bergstrom et al. ‘17)

Design goals
• Inspired by Conroy et al.'s survey on automatic deception detection
• Goals
 G1: Provide a quick overview of the language usage of news articles
 G2: Present news metadata to help users grasp its semantic structure
 G3: Let users gain direct access and reference to the article text
• Tasks
 T1: Reveal the sentiment and discourse mode distribution of the article [G1]
 T2: Identify the estimated subjectivity and readability level of the article [G1]
 T3: Identify and compare character and keyword occurrences in the article [G2]
 T4: Provide the original text [G3]

System and Data
• Automatic data processing pipeline
 Plain text –(extract)-> High-level semantics
• Discourse mode analysis
 Combine general text analysis (narration, description, exposition and argument) + Tom Wolfe’s theory
 Five Categories
• Narration: the most important part of storytelling, author’s interpretation
• Argument: analysis and ideas of the author
• Quote: directly repeats a person's words
• Description: detailed depiction, aimed at rebuilding the original scene
• Background: fact-checked background information aimed at helping readers understand the current story
• Training Data
 312 news articles from Fox News, ABC News, The New York Times, The Economist..
 Articles were roughly pre-filtered with the Pew Research Center
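The discourse mode analysis above labels each sentence with one of five modes, and the distribution of those labels is what the system later visualizes. As a minimal sketch, assuming the labels are already available as one string per sentence (the function name `mode_distribution` and the example labels are hypothetical, not taken from the paper), the per-mode share could be computed like this:

```python
from collections import Counter

# The five discourse modes named on this slide.
MODES = ("narration", "argument", "quote", "description", "background")


def mode_distribution(labels):
    """Return the share of sentences per discourse mode (shares sum to 1)."""
    counts = Counter(labels)
    unknown = set(counts) - set(MODES)
    if unknown:
        raise ValueError(f"unknown discourse modes: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        # An empty article has no sentences, so every share is zero.
        return {mode: 0.0 for mode in MODES}
    return {mode: counts[mode] / total for mode in MODES}


# Hypothetical per-sentence labels for one short article (8 sentences).
example_labels = [
    "narration", "narration", "quote", "argument",
    "narration", "background", "argument", "narration",
]
distribution = mode_distribution(example_labels)
```

Here half of the sentences are narration and a quarter are argument. How the labels themselves are predicted is the job of the trained classifier and is not covered by this sketch.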

Visual Design
Article Explorer Module
Article Stat Module
Reader View Module

Visual Design
• Visualizes the distribution of discourse modes
• Sentiment Distribution
• Problem: the sentiment score of news articles goes to zero after being averaged
• Solution: distribution of sentiment of the whole article from negative (-1) to positive (1)
• Extreme sentiment sentences (>0.5) could easily draw readers’ attention
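The averaging problem on this slide is easy to see with numbers: strongly positive and strongly negative sentences cancel out, while a distribution keeps them visible. The sketch below is an illustration under stated assumptions, not the system's implementation. It assumes one sentiment score per sentence in [-1, 1], reads the slide's ">0.5" as an absolute value, and uses hypothetical function names and scores.

```python
from statistics import mean


def sentiment_histogram(scores, n_bins=10):
    """Count sentence scores in equal-width bins over [-1, 1]."""
    bins = [0] * n_bins
    for score in scores:
        if not -1.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
        # Map [-1, 1] onto bin indices; a score of exactly 1 goes in the last bin.
        index = min(int((score + 1.0) / 2.0 * n_bins), n_bins - 1)
        bins[index] += 1
    return bins


def extreme_sentences(scores, threshold=0.5):
    """Indices of sentences whose absolute score exceeds the threshold (assumed reading of '>0.5')."""
    return [i for i, score in enumerate(scores) if abs(score) > threshold]


# Hypothetical scores: four emotional sentences that cancel out, two neutral ones.
scores = [0.8, -0.8, 0.6, -0.6, 0.0, 0.0]
average = mean(scores)                              # 0.0, hides the emotion
histogram = sentiment_histogram(scores, n_bins=4)   # [2, 0, 2, 2]
extremes = extreme_sentences(scores)                # [0, 1, 2, 3]
```

The average is zero even though four of the six sentences are strongly polarized, whereas the histogram shows mass at both ends of the range.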

Visual Design
• Discourse Mode Distribution
• Choose color as the visual channel
• Common discourse mode narration; a low-saturation color
• Argument; reddish
• Background information; purple
• Description; green
• Quotes; sky-blue
• Reveal Metadata of News Articles
• Inspired by the ‘5W&1H’ theory
• ‘why’ & ‘how’: high-level, hard for the computer to extract
• ‘where’ & ‘when’: give limited help to the understanding of news and tend to overuse markers
• Decided to visualize ‘who’ (characters) and ‘what’ (keywords grouped by topics)

Visual Design
Article Stats Module
• Sentiment and discourse mode stats
• Overall status and sentence-level information
• Article stats
• Flesch-Kincaid readability grade: difficulty level of an article, estimated from its sentence length and syllables per word
• Misleading news tends to be easy to read -> spreads faster throughout the general public (Conroy et al. ‘15)
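For reference, the standard Flesch-Kincaid grade level is 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below applies that formula with a deliberately rough syllable heuristic (counting vowel groups), so its values are approximate and may differ from those produced by the system described here. The function names and the two sample texts are hypothetical.

```python
import re


def count_syllables(word):
    """Rough heuristic: one syllable per group of consecutive vowels, at least 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Approximate Flesch-Kincaid grade level of an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Hypothetical sample texts: short simple sentences vs. one dense sentence.
simple = "The cat sat. The dog ran."
dense = ("Institutional accountability necessitates comprehensive "
         "investigative documentation.")
simple_grade = flesch_kincaid_grade(simple)
dense_grade = flesch_kincaid_grade(dense)
```

Short sentences made of one-syllable words score very low (the grade can even be negative), while long words and long sentences push the grade up, which is the sense in which easy-to-read text gets a lower grade.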

Patterns of unreliable news
• Analyzed 8 misleading articles using the system
 Some linguistic features (sentiment distribution, article subjectivity, article readability) have strong indicative patterns on their own or together
Unreliable news tends to be emotional
• Dominated by one polarity (a, b, c) or fluctuating between two polarities (d)

Subjectivity of a reliable article is less than 0.2
• Subjectivity is another strong indicator to show the reliability of news articles
• reliable articles could have a subjectivity score < 0.2
Unreliable news shows an easy-to-read pattern
• With a Flesch-Kincaid readability grade greater than 0.3 in the study (cognitive level of secondary & college levels)
• Reliable news: cognitive range of college & graduate levels

Unreliable articles include a considerable portion of arguments
• Subjectivity is another strong indicator to show the reliability of news articles
• Reliable news is narrative, without subjective arguments
Sly writers arrange background/descriptions with their personal opinions

Sophisticated writers selectively use others' mouths to convey their words
• Easy to overlook when reading plain text, but clearly visible through visualization

Evaluation
• Journalism scholar review
 One journalism professor evaluated the system through a questionnaire
 "Meaningful and potentially useful tool for news/information analytics"
 "The features of writing style, sentiment, and keywords, etc. can be relevant indicators of journalistic performance, depending on the usage context."
 Recommended usage field
• For academic research in media content analysis
• Teaching and learning in journalism and public relations courses
• Newsroom practice, especially for editors

Evaluation
• User study
 Among college students, to verify whether it can help them identify the misleading intention of the news articles
 9 participants (4 undergraduate and 6 postgraduate students)
 Two steps
1. Read first and review with the visualization system
• For one selected article
• 2 out of 9 -> 8 out of 9 spotted the bias
2. View the visualization system first, then read with the help of the visualization system
• For two selected articles (same event, different standpoints from different news organizations)
• 7 (highly alerted), 5 (dubious), and 6 (looks objective) out of 18 -> unanimously agreed the articles are biased
 Questionnaire
• Help me to spot the intention of misleading in the articles
• 4 participants (strongly agree), 3 participants (agree), 2 participants (neutral)
• Help me to spot the bias towards different entities or events in the articles
• 5 participants (strongly agree), 4 participants (agree)

Discussion
• Limitation
 Accuracy of feature extraction
• Some extracted features are rejected by human readers
 Interaction and readability
• Function filters to radar chart – sentiment/discourse and characters/events
• Future Work
 Extend the system and supplement missing dimensions
• Enable comparison by aggregating articles of the same topic (data collection pipeline)
• Integrate related external information such as comments and Wikipedia to enable fact checking (cross-validation through crowdworkers)
 Generalization to other domains
• Help students analyze patterns of TOEFL writing samples..

Takeaways
1. An interactive visualization design that could act as entry point or hint
for further visualization research
2. A field study on current computer-based news analytical techniques
3. A case study reporting patterns found by their methodology

Criticism
1. Well-structured and logical writing (with few questions why)
2. Design rationale is dense
3. Interesting and noticeable findings on unreliable articles
4. Visualization dashboards use reliable indicators
 The authors introduce useful indicators
• Very useful when building article selection criteria for article reading experiments

Criticism
1. Need more information about training data
 Which kind of topic? Politics? Economy? Or something else?
2. In the section that describes why they chose that color
 Need a reference for each color meaning (controversial)
 "purple represents wisdom... Green and sky blue are safe and reliable.."
3. Not fully convinced with visualizing only who and what
 Missing description of the scope of the article selected in the paper
• I can guess the mainly discussed topic in this paper is politics but not sure..
 Where and when are more important information for infectious diseases such as coronavirus
4. Study procedure description is quite unclear
 Missing information of used articles
 For the second task: why do the authors count all samples for the two articles together (9 from one and 9 from the other)?
 The numbers of participants and articles are quite small; more diverse articles need to be studied
• or at least mention the scope as ‘politics’ only

My IDEA for Future Work
• Education
 Learning structure is essential for writing
 Examples are indispensable
• show reliable and unreliable news articles to students or junior journalists
 From examples, try to figure out features and create reliable/unreliable news articles
 Self-verification of the reliability of their writing
• Filtering function
 Filtering system (or an app) for readers
 Notice subjectivity level and other stats
 Do stats influence readers' acceptance of news content?
• Article’s subjectivity may change depending on the reader’s political propensity
