2. Overview of the paper
• Motivation of the study
Popularization of journalism (writers with ulterior intentions)
• authority & credibility of news articles?
Can be difficult for untrained readers to identify those latent intentions
Some machine learning approaches provide limited help for humans in decision making
• Goal of the study
Present a visual analytical system for misleading news detection by linguistic features
• Analyze 18 news articles from different sources
• Summarize helpful patterns for misleading news detection
• User study: confirm usefulness & effectiveness of the system
3. Main Topics of this work
• Design & implementation – how & why we made the system
Design Goals
System and Data
Visual Design
• Findings & Evaluation – what we can find new with the system
Patterns of unreliable news
With a professor & 9 students
4. Background Knowledge
• A rise of fake news
Introduced the nature & impact of fake news (Lazer et al. ‘18)
Problem: fake news is intended to provoke emotions (Bakir et al. ‘18) and extremist behavior (Aisch et al. ‘16)
Promising directions: rhetorical structure and discourse analysis (Rubin et al. ‘15)
• Suggested solutions by previous works
Transmission of fake news
• trying to capture the pattern of spread of misinformation & factors contributing to it (Vosoughi et al. ‘18)
Content analysis
• Flag the malicious intent of the writer
• Automated deception detection with high (80~90%) accuracy (Feng et al. ‘12)
• Deceptive stories are separable from truthful stories in the rhetorical structure feature space
Equip readers with the ability to evaluate credibility of fake news
• Fact-checking websites (FactCheck.org, PolitiFact, Ad Fontes Media)
• Education (Bergstrom et al. ‘17)
5. Design goals
• Inspired by Conroy et al.'s survey on automatic deception detection
• Goals
G1: Provide a quick overview of the language usage of news articles
G2: Present news metadata to help users grasp its semantic structure
G3: Let users gain direct access and reference to the article text
• Tasks
T1: Reveal the sentiment and discourse mode distribution of the article [G1]
T2: Identify the estimated subjectivity and readability level of the article [G1]
T3: Identify and compare character and keyword occurrences in the article [G2]
T4: Provide the original text [G3]
6. System and Data
• Automatic data processing pipeline
Plain text –(extract)-> High-level semantics
• Discourse mode analysis
Combine general text analysis (narration, description, exposition and argument) + Tom Wolfe’s theory
Five Categories
• Narration: the most important part of storytelling, author’s interpretation
• Argument: analysis and ideas of the author
• Quote: directly repeat the passage of a person
• Description: detailed depiction, aimed at rebuilding the original scene
• Background: fact-checked background information aimed at helping readers understand the current story
• Training Data
312 news articles from Fox News, ABC News, New York Times, the Economist..
Make a rough pre-filtering of articles with the Pew Research Center
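The five categories above can be pictured as a per-sentence label whose shares make up the discourse mode distribution. The following is only a minimal sketch under assumptions: it supposes the classifier emits one label per sentence, and the names `DiscourseMode`, `mode_distribution`, and the example labels are hypothetical, not taken from the paper.

```python
from collections import Counter
from enum import Enum


class DiscourseMode(Enum):
    # The five categories listed on the slide.
    NARRATION = "narration"
    ARGUMENT = "argument"
    QUOTE = "quote"
    DESCRIPTION = "description"
    BACKGROUND = "background"


def mode_distribution(labels):
    """Return the share of sentences per discourse mode (all five modes present)."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {mode: 0.0 for mode in DiscourseMode}
    return {mode: counts[mode] / total for mode in DiscourseMode}


# Hypothetical labels for a five-sentence article (not data from the paper).
example_labels = [
    DiscourseMode.NARRATION,
    DiscourseMode.NARRATION,
    DiscourseMode.QUOTE,
    DiscourseMode.ARGUMENT,
    DiscourseMode.BACKGROUND,
]
example_distribution = mode_distribution(example_labels)
```

Keeping every mode in the result, even with a share of zero, makes it easy to compare articles side by side.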
8. Visual Design
Article Explorer Module
• Visualizes the distribution of discourse modes
• Sentiment Distribution
• Problem: sentiment score of news articles goes to zero after being averaged
• Solution: distribution of sentiment of the whole article from negative (-1) to positive (1)
• Extreme sentiment sentences (>0.5) could easily draw readers’ attention
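A small sketch may help show why a distribution is more informative than an average here. It assumes per-sentence sentiment scores in [-1, 1] are already available, and it reads the ">0.5" threshold as a magnitude (so strongly negative sentences count too); the function names and example scores are hypothetical.

```python
def sentiment_histogram(scores, bins=10):
    """Count sentence sentiment scores in equal-width bins over [-1, 1]."""
    counts = [0] * bins
    width = 2.0 / bins
    for score in scores:
        clipped = max(-1.0, min(1.0, score))
        # A score of exactly 1.0 falls into the last bin.
        index = min(int((clipped + 1.0) / width), bins - 1)
        counts[index] += 1
    return counts


def extreme_sentences(scores, threshold=0.5):
    """Return indices of sentences whose sentiment magnitude exceeds the threshold."""
    return [i for i, s in enumerate(scores) if abs(s) > threshold]


# Hypothetical scores: strongly polarized sentences that cancel out on average.
example_scores = [-0.8, 0.8, -0.6, 0.6, 0.0]
example_mean = sum(example_scores) / len(example_scores)
example_hist = sentiment_histogram(example_scores, bins=4)
example_extreme = extreme_sentences(example_scores)
```

In this example the mean is zero although four of the five sentences are strongly polarized, which the histogram and the list of extreme sentences make visible.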
9. Visual Design
• Discourse Mode Distribution
• Choose color as the visual channel
• Common discourse mode narration; a low-saturation color
• Argument; reddish
• Background information; purple
• Description; green
• Quotes; sky-blue
• Reveal Metadata of News Articles
• Inspired by the ‘5W&1H’ theory
• ‘why’ & ‘how’: high-level, hard for a computer to extract
• ‘where’ & ‘when’: give limited help to the understanding of news and tend to overuse markers
• Decided to visualize ‘who’ (characters) and ‘what’ (keywords grouped by topics)
Article Explorer Module
10. Visual Design
Article Stats Module
• Sentiment and discourse mode stats
• Overall status and sentence-level information
• Article stats
• Flesch-Kincaid readability grade: difficulty level of an article by its vocabulary use
• Misleading news tends to be easy to read so that it spreads faster throughout the general public (Conroy et al. ‘15)
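For reference, the standard Flesch-Kincaid grade level is computed from average sentence length and average syllables per word: 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below applies that formula; the syllable counter is a rough vowel-group heuristic chosen for illustration, and nothing here is claimed to match how the presented system computes its score.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count groups of consecutive vowels (heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    # Naive sentence split on terminal punctuation; abbreviations are not handled.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


easy_grade = flesch_kincaid_grade("The cat sat. The dog ran.")
hard_grade = flesch_kincaid_grade(
    "Sophisticated journalists deliberately manipulate quotations."
)
```

Short sentences with one-syllable words score lowest, so `easy_grade` comes out well below `hard_grade`.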
11. Patterns of unreliable news
• Analyzed 8 misleading articles using the system
Some linguistic features (sentiment dist., article subjectivity, article readability) have strong indicative patterns on their own / together
Unreliable news tends to be emotional
• Dominated by one polarity (a, b, c) or fluctuating between two polarities (d)
12. Patterns of unreliable news
Subjectivity of a reliable article is less than 0.2
• Subjectivity is another strong indicator to show the reliability of news articles
• reliable articles could have a subjectivity score < 0.2
Unreliable news shows an easy-to-read pattern
• With a Flesch-Kincaid readability grade greater than 0.3 in the study (cognitive level of secondary & college levels)
• Reliable news: cognitive range of college & graduate levels
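As a rough illustration of how the reported patterns could be turned into reading hints, the sketch below checks two of them: subjectivity at or above the 0.2 limit mentioned on the slide, and sentiment dominated by one polarity. The function names and the `polarity_limit` of 0.8 are hypothetical choices made for this example; they are not thresholds from the paper, and the flags are hints rather than a classifier.

```python
def dominant_polarity_share(scores):
    """Share of non-neutral sentences that carry the more frequent polarity."""
    positive = sum(1 for s in scores if s > 0)
    negative = sum(1 for s in scores if s < 0)
    total = positive + negative
    if total == 0:
        return 0.0
    return max(positive, negative) / total


def screening_flags(subjectivity, sentence_sentiments,
                    subjectivity_limit=0.2, polarity_limit=0.8):
    """Return boolean hints; polarity_limit is an assumed value, not from the paper."""
    share = dominant_polarity_share(sentence_sentiments)
    return {
        "high_subjectivity": subjectivity >= subjectivity_limit,
        "one_sided_sentiment": share >= polarity_limit,
    }


# Hypothetical inputs, not measurements from the study.
suspicious = screening_flags(0.45, [-0.7, -0.5, -0.2, -0.1, 0.1])
balanced = screening_flags(0.10, [0.2, -0.2])
```

Returning named flags instead of a single verdict keeps the decision with the reader, which matches the system's role as a visual aid.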
13. Patterns of unreliable news
Unreliable articles include a considerable portion of arguments
• Subjectivity is another strong indicator to show the reliability of news articles
• Reliable news is narrative without subjective arguments
Sly writers arrange background/descriptions with their personal opinions
14. Patterns of unreliable news
Sophisticated writers selectively use others' mouths to convey their own words
• Easy to overlook when reading plain text but can be seen clearly through visualization
15. Evaluation
• Journalism scholar review
One journalism professor evaluated the system through a questionnaire
"Meaningful and potentially useful tool for news/information analytics"
"The features of writing style, sentiment, and keywords, etc. can be relevant indicators of journalistic performance, depending on the usage context."
Recommended usage field
• For academic research in media content analysis
• Teaching and learning in journalism and public relations courses
• Newsroom practice, especially for editors
16. Evaluation
• User study
Conducted among college students to verify whether the system can help them identify the misleading intention of news articles
9 participants (4 undergraduate and 6 postgraduate students)
Two steps
1. Read first and review with the visualization system
• For one selected article
• 2 out of 9 → 8 out of 9 spotted the bias
2. View visualization system first then read under the help of the visualization system
• For two selected articles (same event, different standpoints from different news organizations)
• 7 (highly alerted), 5 (dubious), and 6 (looks objective) out of 18 unanimously agreed the articles are biased
Questionnaire
• Help me to spot the intention of misleading in the articles
• 4 participants (strongly agree), 3 participants (agree), 2 participants(neutral)
• Help me to spot the bias towards different entities or events in the articles
• 5 participants (strongly agree), 4 participants (agree)
17. Discussion
• Limitation
Accuracy of feature extraction
• Some extracted features were rejected by human readers
Interaction and readability
• Function filters to radar chart – sentiment/discourse and characters/events
• Future Work
Extend the system and supplement missing dimensions
• Enable comparison by aggregating articles of the same topic (data collection pipeline)
• Integrate related external information such as comments and Wikipedia to enable fact checking (cross-validation through crowdworkers)
Generalization to other domains
• Help students analyze patterns of TOEFL writing samples..
18. Takeaways
1. An interactive visualization design that could act as an entry point or hint for further visualization research
2. A field study on current computer-based news analytical techniques
3. A case study reporting patterns found by their methodology
19. Criticism
1. Well-structured and logical writing (with few questions why)
2. Design rationale is dense
3. Interesting and noticeable findings on unreliable articles
4. Visualization dashboards use reliable indicators
The authors introduce useful indicators
• Very useful when building article selection criteria for article reading experiments
20. Criticism
1. Need more information about training data
Which kind of topic? Politics? Economy? Or else?
2. In the section that describes why they chose that color
Need a reference for each color's meaning (controversial)
"purple represents wisdom... Green and sky blue are safe and reliable.."
3. Not fully convinced by visualizing only who and what
Missing description of the scope of the article selected in the paper
• I can guess the mainly discussed topic in this paper is politics but not sure..
Where and when are more important information for infectious diseases such as coronavirus
4. Study procedure description is quite unclear
Missing information of used articles
For the second task: why do the authors count all samples for two articles (9 from one and 9 from the other)?
The numbers of participants and articles are quite small; the system needs to be studied on more diverse articles
• or at least mention the scope as ‘politics’ only
21. My IDEA for Future Work
• Education
Learning structure is essential for writing
Examples are indispensable
• show reliable and unreliable news articles to students or junior journalists
From examples, try to figure out features and create reliable/unreliable news articles
Self-verification of the reliability of their writing
• Filtering function
Filtering system (or an app) for readers
Notice subjectivity level and other stats
Do stats influence readers' acceptance of news content?
• Article’s subjectivity may change depending on the reader’s political propensity