Computational Social Science
as the Ultimate Web Intelligence
Kno.e.sis Projects at the Intersection of Big Data, AI, Social Good and Health
Panel at Web Intelligence 2018
Prof. Amit Sheth
LexisNexis Ohio Eminent Scholar
Executive Director, Kno.e.sis - Ohio Center of Excellence in
Knowledge-enabled Computing & BioHealth Innovation
Presentation template by SlidesCarnival
Photographs by Unsplash
Icons by thenounproject
Big Data | Social Media | AI
2
Harnessing Twitter ‘Big Data’ for
Automatic Emotion Identification
2.5 M Tweets with Machine
Learning algorithms
Trends
Emotions
eDrugTrends - Identify emerging trends in
cannabis and synthetic cannabinoid use in the
U.S.
Web Forum Data & Tweets with
NLP, ML & Semantic Web
Technologies
Intents
Sentiments
Hazards SEES - Cross-modal aggregation
of Multi-modal & Multi-disciplinary
Data to support human efforts in disaster
management
Extracting Diverse Sentiment Expressions
with Target-Dependent Polarity from
Twitter
Opinions
400 000 Tweets with an
Optimization Model
People
Places
Times
Gender-Based Violence in
140 Characters or Fewer: A
#BigData Case Study of
Twitter
14 million tweets
collected from Twitter
over a period of 10
months
3
1. Gender-based violence in 140 characters or fewer: A #BigData case study of Twitter, Hemant Purohit, Tanvi Banerjee, Andrew Hampton, Valerie L. Shalin, Nayanesh Bhandutia, and Amit Sheth, First Monday, Volume 21, Number 1, 4 January 2016
Outcomes of Analysis
◎ Trends of GBV tweets across 5 countries: USA, India, Philippines, Nigeria, South Africa.
◎ Three thematic groups of GBV tweets: physical violence, sexual violence, and harmful practices.
◎ Nigeria has the highest percentage of tweets with URLs in comparison to other countries.
◎ Numerous possible explanations:
○ Literacy
○ Credibility of the public press
○ The possibility that reliance on external resources somehow reduces the threat of being identified as the responsible party
Context-Aware
Harassment Detection
on Social Media
24 000 tweets collected
Supervised ML methods
used
5
1. Mohammadreza Rezvan, Saeedeh Shekarpour, Lakshika Balasuriya, Krishnaprasad Thirunarayan, Valerie L. Shalin, Amit Sheth. A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research. Web Science, WebSci 2018, Amsterdam, The Netherlands, May 27-30, 2018
2. Mohammadreza Rezvan, Saeedeh Shekarpour, Thirunarayan, K., Valerie L. Shalin, Sheth, A. (2018). Analyzing and learning the language for different types of harassment
Knoesis wiki for Context-Aware Harassment Detection on Social
Media
Outcomes and Insights
Lexicon
Covering different types of harassment content
● Sexual
● Political
● Racial
● Intellectual
● Appearance-related
● General
Tweets
24 000 non-redundant annotated tweets, of which 3000 are labeled as harassing
Features
Combination of features resulted in the best accuracy
○ TF-IDF
○ word2vec
○ paragraph2vec
○ LIWC vector
ML Methods
Gradient Boosting Machine (GBM) outperformed SVM, KNN and NB
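The idea of combining feature types can be illustrated with a minimal sketch. It assumes a made-up three-sentence corpus and a hypothetical two-word lexicon, and it only concatenates a plain TF-IDF vector with a lexicon-hit count. The word2vec, paragraph2vec and LIWC features and the GBM classifier named on the slide are not reproduced here.

```python
import math
from collections import Counter

# Toy, made-up examples; the annotated corpus from the slides is not reproduced.
DOCS = [
    "you are so stupid and ugly",
    "great game last night",
    "nobody likes you stupid",
]
# Hypothetical lexicon entries, for illustration only.
LEXICON = {"stupid", "ugly"}


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) using tf = count / length, idf = ln(N / df)."""
    tokenized = [d.split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each token occurs.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append(
            [(counts[t] / len(toks)) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def lexicon_hits(doc, lexicon):
    """Count the tokens of a document that appear in the lexicon."""
    return sum(1 for t in doc.split() if t in lexicon)


def combined_features(docs, lexicon):
    """Concatenate each TF-IDF vector with its lexicon-hit count."""
    vocab, vectors = tfidf_vectors(docs)
    rows = [v + [float(lexicon_hits(d, lexicon))] for v, d in zip(vectors, docs)]
    return vocab, rows


vocab, features = combined_features(DOCS, LEXICON)
```

Each row of `features` has one entry per vocabulary term plus a final lexicon count, so any classifier that accepts fixed-length vectors could consume it.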
7
1. Gaur, Manas, Ugur Kursuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "Let Me Tell You About Your
Mental Health!: Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." In Proceedings of the 27th ACM CIKM 2018.
Patient
Clinician
EMR
Insight
DSM-5 & Drug Abuse
Ontology
Improved
Healthcare
Classification of Reddit
Content to DSM-5 for
Web-based
Intervention
3 Million Posts from 270K
Reddit Users collected from
2005-2015 with zero-shot
learning
Provide clinicians with insights into their patients
Knoesis wiki for Modeling Social Behavior for Healthcare
Utilization in Depression
Outcomes & Insights
9
Our sophisticated methods have
reduced the false alarm rate to 3%
- 5% by incorporating domain
knowledge and slang terms in
social media data
Views: People - Content - Network
Information in tweets by a user displays
an intent based on the user type:
Personal accounts share opinions, Retail
accounts promote related products for
sale, Media accounts disseminate
information.
Proper incorporation
of each view is
essential to
better represent
characteristics
of users.
User Modeling in Marijuana-related Communications
11
Multimodality
- The information shared in different
formats contributes to the meaning:
Text, Image, Emoji, Interactions
- Translation of image and emoji to textual
representation using state-of-the-art tools
such as EmojiNet.
People: user description, emoji,
profile pictures.
Content: text, emoji
Network: interactions with other
users: retweets and mentions.
🏈
😉
🍔
1. Ugur Kursuncu, Manas Gaur, Usha Lokala, Anurag Illendula, Krishnaprasad Thirunarayan, Raminta Daniulaityte, Amit Sheth, and I. Budak Arpinar. "'What's ur type?'
Contextualized Classification of User Types in Marijuana-related Communications using Compositional Multiview Embedding." In Proceedings of IEEE International
Conference on Web Intelligence, 2018
Knoesis wiki for eDrugTrends
Outcomes & Insights
◎ Incorporating multimodal data, specifically profile pictures and network interactions, contributes significantly to the classification of users.
◎ Multimodality significantly improves classification performance in the case of an imbalanced dataset, e.g., profile pictures of users.
◎ Composing the embeddings of views (e.g., person, content, network) provides a more coherent representation of users.
12
#  Features        Personal  Media  Retail
1  Tweet + Desc    0.95      0.42   0.73
2  w/ Composition  0.94      0.18   0.71
3  w/ Metadata     0.94      0.17   0.72
4  w/ Image        0.97      0.72   0.87
5  w/ Network      0.98      0.73   0.91
F-scores for each user type
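To compare the feature sets in the table above with a single number, one option is the macro-average (unweighted mean) of the three per-type F-scores. The sketch below copies the values from the table; the macro-average itself is derived here for illustration and is not a figure reported on the slides.

```python
# F-scores per user type, copied from the table above.
F_SCORES = {
    "Tweet + Desc":   {"Personal": 0.95, "Media": 0.42, "Retail": 0.73},
    "w/ Composition": {"Personal": 0.94, "Media": 0.18, "Retail": 0.71},
    "w/ Metadata":    {"Personal": 0.94, "Media": 0.17, "Retail": 0.72},
    "w/ Image":       {"Personal": 0.97, "Media": 0.72, "Retail": 0.87},
    "w/ Network":     {"Personal": 0.98, "Media": 0.73, "Retail": 0.91},
}


def macro_f(scores):
    """Unweighted mean of the per-class F-scores (a derived summary, not from the slides)."""
    return sum(scores.values()) / len(scores)


macro = {name: macro_f(scores) for name, scores in F_SCORES.items()}

# Feature set with the highest macro-averaged F-score.
best = max(macro, key=macro.get)

# Change in the Media F-score between the first and last rows of the table.
media_gain = F_SCORES["w/ Network"]["Media"] - F_SCORES["Tweet + Desc"]["Media"]

for name, value in macro.items():
    print(f"{name:<15} macro-F = {value:.3f}")
```

Because the macro-average weights each user type equally, the large swings in the Media column dominate the differences between rows, which is consistent with the slide's point about imbalanced classes.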
Fusing Visual, Textual and
Connectivity Clues for Studying
Mental Health
Knoesis wiki for Modeling Social Behavior for Healthcare Utilization in Depression
Develop a multimodal framework that employs statistical techniques to fuse heterogeneous sets of features, obtained by processing visual, textual and user interaction data, in order to identify depressive behavior and infer demographics.
13
1. Amir Hossein Yazdavar, Mohammad Saied Mahdavinejad, Goonmeet Bajaj, Krishnaprasad Thirunarayan, Jyotishman Pathak and Amit Sheth. Fusing Visual, Textual and
Connectivity Clues for Studying Mental Health in Population. In: 30th International Conference on World Wide Web (Submitted WWW-2019)
◎ How well does the content of posted images (colors, aesthetic and facial presentation) reflect depressive behavior?
◎ Does the choice of profile picture show any psychological traits of a depressed online persona? Are profile pictures reliable enough to represent demographic information such as age and gender?
◎ Are there any underlying common themes among depressed individuals, generated using multimodal content, that can be used to detect depression reliably?
Outcomes & Insights
14
Characterizing Linguistic Patterns in two aspects: Depressive Behavior and Age Distribution
Figure: Gender Biases and Depressive Behavior Association (Chi-square test; color code: blue = association, red = repulsion; size = amount of each cell's contribution)
Figure: The age distribution for depressed and control users in the ground-truth dataset
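The chi-square association figure can be read with the help of a small worked example. The sketch below assumes the plot is based on Pearson residuals, which is a common choice for this kind of figure but is not stated on the slide, and it uses made-up counts. A positive residual corresponds to association (blue), a negative one to repulsion (red), and each cell's share of the chi-square statistic corresponds to the marker size.

```python
import math

# Hypothetical counts, for illustration only: rows are gender, columns are class.
ROWS = ["female", "male"]
COLS = ["depressed", "control"]
OBSERVED = [
    [60, 40],
    [30, 70],
]


def pearson_residuals(observed):
    """Return (observed - expected) / sqrt(expected) for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        res_row = []
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            res_row.append((obs - expected) / math.sqrt(expected))
        residuals.append(res_row)
    return residuals


residuals = pearson_residuals(OBSERVED)

# The chi-square statistic is the sum of squared Pearson residuals.
chi_square = sum(r * r for row in residuals for r in row)

# Share of the statistic contributed by each cell (fractions summing to 1).
contributions = [[r * r / chi_square for r in row] for row in residuals]

for i, row_name in enumerate(ROWS):
    for j, col_name in enumerate(COLS):
        kind = "association" if residuals[i][j] > 0 else "repulsion"
        print(f"{row_name}/{col_name}: {kind}, share = {contributions[i][j]:.2f}")
```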
Outcomes & Insights
15
The explanation of the log-odds prediction of outcome (0.31) for
a sample user (y-axis shows the outcome probability (depressed
or control), the bar labels indicate the log-odds impact of each
feature)
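The relation between per-feature log-odds impacts and an outcome probability can be sketched as follows. The caption does not make clear whether 0.31 is a log-odds value or a probability; this sketch assumes it is the total log-odds. The base value and the feature contributions are hypothetical numbers chosen only so that they sum to 0.31.

```python
import math


def log_odds_to_probability(log_odds):
    """Logistic (sigmoid) function: maps log-odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def probability_to_log_odds(p):
    """Logit function: the inverse of the logistic function."""
    return math.log(p / (1.0 - p))


# Hypothetical base value and per-feature impacts, all in log-odds units.
BASE_LOG_ODDS = -0.20
CONTRIBUTIONS = {"feature_a": 0.45, "feature_b": 0.16, "feature_c": -0.10}

# In an additive explanation, the prediction is the base value plus all impacts.
total_log_odds = BASE_LOG_ODDS + sum(CONTRIBUTIONS.values())
probability = log_odds_to_probability(total_log_odds)

print(f"total log-odds = {total_log_odds:.2f}, probability = {probability:.3f}")
```

Under this assumption, a positive total log-odds means the predicted probability is above 0.5, and each bar label shows how far one feature pushes the total up or down.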
Ranking Features obtained from Different Modalities with
Boruta Algorithm
Create value from data that supports action
Big Data & AI
16
What can we do that
is unique?
Emotions
Sentiments
Intentions Derive Insights
Scale to identify issues that are important &
relevant to humankind
Floods Earthquake
Wildfires Tsunami
Derive insights from data
Do more exercise
Reduce sugar intake
Increase water intake
More at: http://knoesis.org/projects, http://bit.ly/Kapproach
Thank You!
17

  • 3. Gender-Based Violence in 140 Characters or Fewer: A #BigData Case Study of Twitter 14 million tweets collected from Twitter over a period of 10 months 3 1. Gender-based violence in 140 characters or fewer: A #BigData case study of Twitter, Hemant Purohit, Tanvi Banerjee, Andrew Hampton, Valerie L. Shalin, Nayanesh Bhandutia, and Amit Sheth, First Monday, Volume 21, Number 1 - 4 January 2016
  • 4. Outcomes of Analysis ◎ Trends of GBV tweets across 5 countries: USA, India, Philippines, Nigeria, South Africa. ◎ Three thematic groups of GBV tweets: physical violence, sexual violence, and harmful practices. ◎ Nigeria has the highest percentage of tweets with URLs in comparison to other countries. ◎ Numerous explanations: ○ Literacy ○ Credibility of the public press ○ Possibility that reliance on external resources somehow reduces the threat of being identified as the responsible party.
  • 5. Context-Aware Harassment Detection on Social Media 24 000 tweets collected Supervised ML methods used 1. Mohammadreza Rezvan, Saeedeh Shekarpour, Lakshika Balasuriya, Krishnaprasad Thirunarayan, Valerie L. Shalin, Amit Sheth. A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research. Web Science, WebSci 2018, Amsterdam, The Netherlands, May 27-30, 2018 2. Mohammadreza Rezvan, Saeedeh Shekarpour, Thirunarayan, K., Valerie L. Shalin, Sheth, A. (2018). Analyzing and learning the language for different types of harassment. Knoesis wiki for Context-Aware Harassment Detection on Social Media
  • 6. Outcomes and Insights Lexicon: covering different types of harassment content ● Sexual ● Political ● Racial ● Intellectual ● Appearance-related ● General Tweets: 24 000 non-redundant annotated tweets, of which 3000 are labeled as harassing Features: combination of features resulted in best accuracy ○ TFIDF ○ word2vec ○ paragraph2vec ○ LIWC vector ML Methods: Gradient Boosting Machine (GBM) outperformed SVM, KNN and NB
  • 7. Classification of Reddit Content to DSM-5 for Web-based Intervention 3 Million Posts from 270K Reddit Users collected from 2005-2015 with zero-shot learning Provide clinicians with insights into their patients Patient Clinician EMR Insight DSM-5 & Drug Abuse Ontology Improved Healthcare 1. Gaur, Manas, Ugur Kursuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "Let Me Tell You About Your Mental Health!: Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." In Proceedings of the 27th ACM CIKM 2018. Knoesis wiki for Modeling Social Behavior for Healthcare Utilization in Depression
  • 8. Outcomes & Insights Our methods have reduced the false alarm rate to 3%-5% by incorporating domain knowledge and slang terms in social media data
  • 9. User Modeling in Marijuana-related Communications Views: People - Content - Network. Information in tweets by a user displays an intent based on the user type: personal accounts share opinions, retail accounts promote related products for sale, media accounts disseminate information. Proper incorporation of each view is essential to better represent characteristics of users. Multimodality: information shared in different formats contributes to the meaning: text, image, emoji, interactions. Image and emoji are translated to textual representation using state-of-the-art tools such as EmojiNet. People: user description, emoji, profile pictures. Content: text, emoji. Network: interactions with other users: retweets and mentions. 🏈 😉 🍔 1. Ugur Kursuncu, Manas Gaur, Usha Lokala, Anurag Illendula, Krishnaprasad Thirunarayan, Raminta Daniulaityte, Amit Sheth, and I. Budak Arpinar. "'What's ur type?' Contextualized Classification of User Types in Marijuana-related Communications using Compositional Multiview Embedding." In Proceedings of IEEE International Conference on Web Intelligence, 2018. Knoesis wiki for eDrugTrends
  • 10. Outcomes & Insights ◎ Incorporation of multimodal data, specifically profile pictures and network interactions, contributes significantly to the classification of users. ◎ Multimodality (e.g., profile pictures of users) significantly improves classification performance on an imbalanced dataset. ◎ Composition of the embeddings of views (e.g., person, content, network) provides a more coherent representation of users. F-scores for each user type:
      Features          Personal  Media  Retail
      1 Tweet + Desc    0.95      0.42   0.73
      2 w/ Composition  0.94      0.18   0.71
      3 w/ Metadata     0.94      0.17   0.72
      4 w/ Image        0.97      0.72   0.87
      5 w/ Network      0.98      0.73   0.91
  • 11. Fusing Visual, Textual and Connectivity Clues for Studying Mental Health Develop a multimodal framework and employ statistical techniques for fusing heterogeneous sets of features, obtained by processing visual, textual and user interaction data, to identify depressive behavior and infer demographics. ◎ How well does the content of posted images (colors, aesthetics and facial presentation) reflect depressive behavior? ◎ Does the choice of profile picture show any psychological traits of the depressed online persona? Are profile pictures reliable enough to represent demographic information such as age and gender? ◎ Are there any underlying common themes among depressed individuals, generated using multimodal content, that can be used to detect depression reliably? 1. Amir Hossein Yazdavar, Mohammad Saied Mahdavinejad, Goonmeet Bajaj, Krishnaprasad Thirunarayan, Jyotishman Pathak and Amit Sheth. Fusing Visual, Textual and Connectivity Clues for Studying Mental Health in Population. In: 30th International Conference on World Wide Web (Submitted WWW-2019) Knoesis wiki for Modeling Social Behavior for Healthcare Utilization in Depression
  • 12. Outcomes & Insights Characterizing linguistic patterns in two aspects: depressive behavior and age distribution. Gender biases and depressive behavior association (chi-square test; color code: blue = association, red = repulsion; size = amount of each cell's contribution). The age distribution for depressed and control users in the ground-truth dataset.
  • 13. Outcomes & Insights The explanation of the log-odds prediction of outcome (0.31) for a sample user (the y-axis shows the outcome probability, depressed or control; the bar labels indicate the log-odds impact of each feature). Ranking features obtained from different modalities with the Boruta algorithm.
  • 14. Create value from data that supports action Big Data & AI What can we do that is unique? Emotions Sentiments Intentions Derive Insights Scale to identify important & relevant issues to humankind Floods Earthquake Wildfires Tsunami Derive insights from data Do more exercise Reduce sugar intake Increase water intake More at: http://knoesis.org/projects, http://bit.ly/Kapproach
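
The evaluation figures quoted on slides 8 and 10 (false alarm rate, per-class F-scores) can be made concrete with a small sketch. It assumes the standard definitions, F1 as the harmonic mean of precision and recall and false alarm rate as FP / (FP + TN); the slides do not state which definitions were used, and the confusion counts below are hypothetical, not taken from any of the studies.

```python
def f_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall for one class."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def false_alarm_rate(fp, tn):
    """Fraction of truly negative items flagged as positive: FP / (FP + TN)."""
    if fp + tn == 0:
        return 0.0
    return fp / (fp + tn)


# Hypothetical confusion counts for a single class (illustration only).
tp, fp, fn, tn = 80, 20, 30, 870

print(f"F-score: {f_score(tp, fp, fn):.3f}")
print(f"False alarm rate: {false_alarm_rate(fp, tn):.3%}")
```

On an imbalanced dataset a per-class F-score is more informative than overall accuracy, which may be why the slide reports one column per user type.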

Editor's notes

  1. Opinions - "Time for dabs": Analyzing Twitter data on butane hash oil use.
  2. Sharing behavior analysis. Social media provide the opportunity to distribute information, potentially reflecting both the senders' judgment of information importance and their reliance on the voice of others. Sharing functions as an amplification of these voices, often through the voices of influential celebrities. We analyze two types of sharing behavior in the social media community surrounding GBV events: direct content resharing as a retweet (RT), and indirect sharing via URL references to external resources such as news, blogs, articles, and multimedia. The low retweeting frequency in Nigeria is particularly remarkable (see Table 5). One might hypothesize that a low-literacy country such as Nigeria, in which senders are less able to compose messages, would have the highest retweet ratio. The adjacent analysis of the proportion of URL references with respect to the total corpus suggests a different sociocultural phenomenon at work concerning the identifiability of the responsible party. Nigeria has the highest percentage of GBV tweets with URLs in comparison to other countries. Numerous explanations can be tested, including literacy, credibility of the public press, and the possibility that reliance on external resources somehow reduces the threat of being identified as the responsible party.
  3. Goal: understanding an individual's mental health situation. Provide clinicians with insights into their patients.
  4. Not all the Reddit content types (main posts, comments, and replies) are informative. Identification of features that represent users on Reddit: vertical linguistic features (e.g., inter-subreddit similarity); horizontal linguistic features (e.g., subordinate conjunction); fine-grained features (e.g., readability scores); word embedding with/without modulation. Coherence-based topic selection that associates subreddits with DSM-5. Enrichment of the DAO ontology with the DSM-5 lexicon and slang terms: DSM-5 knowledge hierarchy; DAO (which we created).
  5. A sophisticated method allowed us to hugely reduce the false alarm rate - explain the optimization effort in one sentence. 25% reduction in the false alarm rate (2-5%), while the other methods have higher false alarm rates (). Takeaway: incorporation of domain knowledge and slang terms in social media data.
  6. 1) Analysis of the content of posted images in terms of colors, aesthetics and facial presentation, and their associations with depressive behavior; 2) Uncovering the underlying relationships between the visual and contextual content of likely depressed profiles, obtained using a demographic inference process, which can facilitate community-level management of depression.
  7. Top left: Our findings from social media are consistent with the medical literature: according to the third National Health and Nutrition Examination Survey [29], more women than men were given a diagnosis of depression. Bottom left: shows that young people aged below 24 tend to be more depressed, suggesting that either the likely depressed user population is younger, or youngsters are more likely to disclose their age, say with the intention of connecting to their peers (social homophily).
  8. Right: The waterfall charts represent how the probability of being depressed changes with the addition of each feature variable. Left: illustrates feature importance obtained by the Boruta algorithm.
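
The waterfall reading described in note 8 can be sketched as follows: per-feature contributions are added in log-odds space, and each running total is mapped to a probability with the logistic function. This is only an illustration under stated assumptions. The base value, the feature names and the individual contributions are hypothetical; they were chosen so that the total matches the 0.31 quoted on slide 13, and treating 0.31 as a log-odds value is one reading of that caption, not something the slides confirm.

```python
import math


def sigmoid(log_odds):
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def waterfall(base_log_odds, contributions):
    """Accumulate contributions; return (feature, running log-odds, probability)."""
    steps = []
    running = base_log_odds
    for name, delta in contributions:
        running += delta
        steps.append((name, running, sigmoid(running)))
    return steps


# Hypothetical base value and per-feature log-odds contributions.
base_log_odds = -0.20
contributions = [("feature_a", 0.45), ("feature_b", -0.10), ("feature_c", 0.16)]

steps = waterfall(base_log_odds, contributions)
for name, log_odds, prob in steps:
    print(f"{name}: log-odds={log_odds:+.2f}, probability={prob:.3f}")

final_log_odds = steps[-1][1]
final_probability = steps[-1][2]
```

Because the contributions are additive only in log-odds space, the same feature shifts the probability by different amounts depending on where the running total sits on the logistic curve.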