Social Network Analysis
Approach and Applications
Joshua S. White
PhD Candidate, Engineering Science
April 22, 2014
Committee Members:
Jeanna N. Matthews, PhD (Advisor)
John S. Bay, PhD (External Examiner)
Chris Lynch, PhD
Chen Liu, PhD
Stephanie C. Schuckers, PhD
| Clarkson University 1/42
Outline
Motivation
Problem Questions
Method & Publications
Coalmine
PySNAP
Established Dataset
Insights into the Data
Botnet Command & Control Detection
Phishing Website Detection
Phishing Website Detection Continuum: ML based detection
Malware Infection Vector Detection
Actor Identification
Event Identification
Conclusions
Future Work
Acknowledgements
References
Contact
Questions
Supplemental Material
Motivation
This work was partially inspired by Gladwell’s book, The Tipping Point [1], in which he discusses how the spread of ideas and behaviors can be thought of as an epidemic. Gladwell’s rigor has been criticized; for our purposes, however, the book serves as inspiration and motivation rather than as an authoritative source.
The Book’s Key Points (for our purposes)
• Actors (Connectors, Mavens, Salesmen).
• Information spreads like disease.
• Ideas reach a tipping point (critical mass).
Let’s Face It - Social Networks Are Fun
• We are a social species that enjoys communicating and self-adulation.
Problem Questions
• Can we come up with a way of classifying users based on actor types?
• Can we determine who the opinion leaders or influencers are?
• Can we determine how information spreads on these networks?
• Can we detect malicious social network use?
• Are there information security applications for social network data-mining?
Method & Publications
• Establish a reliable collection mechanism.
• Establish a large dataset that can be utilized to answer each question.
• Use a case study approach, whereby each case feeds the next.
• Produce each case study as an individual publication or presentation.
– 3 x Published Proceedings
– 2 x Pending Proceedings
– 3 x Invited Presentations
Coalmine
• Scales well based on initial tests
• Useful for both manual and automated detection
• Allowed us to refine our data collection capabilities
At the Time (Future Work)
• Rebuild of the tool to fix scaling limitations
• More extensible Map/Reduce method
• Inclusion of native multi-threading capability
• New storage and distribution method
• New algorithms for automated opinion leader detection
PySNAP
• Fixes all of the previous issues with Coalmine
• Completely reimplemented in Python with a few supporting Bash scripts
• Utilizes the DISCO MapReduce framework, also built on Python
• Included a better method for data capture that was previously bolted on to Coalmine
• Allowed us to establish a large dataset for future work
Established Dataset
• Over the course of 2012 we collected 165 TB of Twitter Data (Uncompressed)
– 175 Days Collected, 147 Full Days
∗ Estimated 45 Billion Tweets
– Recently released estimates place total Twitter traffic at 175 million tweets per day in 2012
– Thus our daily collection rates varied between 50% and 80% of total Twitter traffic.
– We captured complete tweet data in JSON format using Twitter’s REST API.
∗ This data includes a large number of additional fields other than the message text, all of which can be taken into account when doing measurements.
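As a rough illustration of working with the extra fields, the sketch below parses one JSON-encoded tweet and pulls out a handful of values besides the message text. The function name summarize_tweet and the sample record are hypothetical and invented for this example; the key names follow the field list in the supplemental material, and the real capture format may differ.

```python
import json


def summarize_tweet(raw_line):
    """Parse one JSON-encoded tweet and return a few commonly used fields.

    Missing keys fall back to None / empty values so that partial records
    do not raise.
    """
    tweet = json.loads(raw_line)
    user = tweet.get("user", {})
    return {
        "id": tweet.get("id_str"),
        "created_at": tweet.get("created_at"),
        "text": tweet.get("text", ""),
        "screen_name": user.get("screen_name"),
        "followers": user.get("followers_count", 0),
        "retweets": tweet.get("retweet_count", 0),
    }


# Hypothetical record, invented for illustration (not taken from the dataset).
sample_line = json.dumps({
    "id_str": "1",
    "created_at": "Sun Mar 20 15:27:02 +0000 2011",
    "text": "hello world",
    "retweet_count": 3,
    "user": {"screen_name": "example_user", "followers_count": 42},
})

summary = summarize_tweet(sample_line)
print(summary["screen_name"], summary["followers"], summary["retweets"])
```

Keeping the parsing in one small function makes it easy to map over a file that stores one tweet per line.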
Insights into the Data
Botnet Command & Control Detection
• Joshua S White, Jeanna N Matthews, and John L Stacy. Coalmine: an experience in building a system for social
media analytics. In SPIE Defense, Security, and Sensing, pages 84080A-84080A. International Society for Optics
and Photonics, 2012.
Botnet Command & Control Detection Continued
| Date/Time | UID | Text | MSG Entropy | Source |
|---|---|---|---|---|
| Sun Mar 20 15:27:02 +0000 2011 | 49492150668365824 | Shutdown -r now | 3.37355726227518 | http://twitter.com/Ebastos |
| Sun Mar 20 01:25:20 +0000 2011 | 49280326475853825 | # shutdown -h now | 3.37355726227518 | http://twitter.com/ohdediku |
| Sun Mar 20 21:40:53 +0000 2011 | 49586229964062720 | $ sudo shutdown -h now | 3.37355726227518 | http://twitter.com/souzabruno |
| Sun Mar 20 19:38:41 +0000 2011 | 49555476769280000 | Text: sudo shutdown -h now | 3.37355726227518 | http://twitter.com/stormyblack |
| Sun Mar 20 18:51:51 +0000 2011 | 49543693820116992 | shutdown -now | 3.37355726227518 | http://twitter.com/godzilla2k9 |
| Sun Mar 20 18:52:30 +0000 2011 | 49543856840126464 | shutdown -h now !: | 3.37355726227518 | http://twitter.com/ph3nagen |
| Sun Mar 20 18:52:30 +0000 2011 | 49600582113177600 | shutdown -H now. | 3.37355726227518 | http://twitter.com/willybistuer |
| Sun Mar 20 22:37:54 +0000 2011 | 49597117039251457 | elmenda: su shutdown -h now | 3.37355726227518 | http://twitter.com/NeoVasili |
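The slides do not say how the MSG Entropy column was computed. As one plausible reading, the sketch below computes character-level Shannon entropy in bits per character; this is an assumption, and it will not necessarily reproduce the values in the table, which may have been computed over a normalized form of the message.

```python
import math
from collections import Counter


def shannon_entropy(message):
    """Character-level Shannon entropy of a string, in bits per character.

    Assumption: entropy is taken over the raw characters of the message.
    The original tool may have tokenized or normalized the text first.
    """
    if not message:
        return 0.0
    counts = Counter(message)
    total = len(message)
    # Sum p * log2(1/p) over the distinct characters.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


for text in ("shutdown -h now", "sudo shutdown -h now"):
    print(text, round(shannon_entropy(text), 4))
```

Short shell-style commands draw on a small character set, so their entropy tends to cluster in a narrow band, which is what makes the measure useful as a coarse filter for command-like tweets.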
Phishing Website Detection
• Joshua S White, Jeanna N Matthews, and John L Stacy. A method for the automated detection phishing websites
through both site characteristics and image analysis. In SPIE Defense, Security, and Sensing, pages 84080B-84080B.
International Society for Optics and Photonics, 2012.
Phishing Website Detection Continued
| (F)raud / (L)egit | URL | Structural Fingerprint | Page Title | pHash Value | Hamming Score |
|---|---|---|---|---|---|
| Paypal Fraudulent | http://si4r.com/_paypal.co.uk/webscr.html?cmd=SignIn&co_partnerId=2&pUserId=&siteid=0&pageType=&pa1=&i1=&bshowgif=&UsingSSL=&ru=&pp=&pa2=&errmsg=&runame= | 0,7,1,0,2 | RETURNED NOTHING | 16716169687489800000 | 1 |
| Paypal Legitimate | https://www.paypal.com/cgi-bin/webscr?cmd=_login-submit&dispatch=5885d80a13c0db1f8e263663d3faee8d1e83f46a36995b3856cef1e18897ad75 | 27,3,0,0,2 | Redirecting - Paypal | 18439707190431800000 | 0 |
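For readers unfamiliar with the comparison step, the sketch below shows how two 64-bit perceptual hashes can be compared by Hamming distance. The helper names and the max_bits threshold are hypothetical; the slides do not state the threshold used, and the Hamming Score column above appears to be a derived score rather than a raw bit count.

```python
def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which two integer hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


def looks_similar(hash_a, hash_b, max_bits=10):
    """Treat two page screenshots as visually similar when few bits differ.

    max_bits=10 is an illustrative threshold, not a value from the paper.
    """
    return hamming_distance(hash_a, hash_b) <= max_bits


print(hamming_distance(0b1010, 0b0110))
print(looks_similar(0x0, 0xFFFF))
```

A phishing page that copies a legitimate page's look closely would yield a small distance to the legitimate page's hash while being served from a different URL.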
Phishing Website Detection Continuum: ML based
detection
• Title: An Image-based Feature Extraction Approach for Phishing Website Detection
• Authors: Hao Jiang, Joshua White, Jeanna Matthews
• Builds on our previous work in phishing website detection, specifically the image analysis approach
• Uses a machine-learning-based approach to identify the most prominent image on a webpage, usually the site’s logo
• Is able to detect phishing sites that the pHash/Hamming distance method classifies as not similar.
– These are the “poor quality” phishing sites
Malware Infection Vector Detection
• BEK (The Blackhole Exploit Kit) was the predominant MaaS (Malware as a Service)
in 2012.
• It accounted for an estimated 29% of all malicious URLs.
• BEK licenses went for around $1,500 USD
• BEK used Twitter as its primary means of spreading infectious URLs
• Our method detects these malicious URLs and infectious accounts on a large scale
Malware Infection Vector Detection Continued
• Joshua S. White and Jeanna N. Matthews, “It’s you on photo?: Automatic detection of Twitter accounts infected with the Blackhole Exploit Kit,” 2013 8th International Conference on Malicious and Unwanted Software: “The Americas” (MALWARE), pp. 51-58, 22-24 Oct. 2013. doi: 10.1109/MALWARE.2013.6703685
Actor Identification
• Title: Connectors, Mavens, Salesmen and More: Actor Based Online Social Network
(OSN) Analysis Method Using Tensed Predicate Logic
• Authors: Joshua White and Jeanna Matthews
• Submitted to KDD2014 (Knowledge Discovery and Data Mining) Conference “Data
Mining for Social Good”
• Utilized multiple definitions of actor types to create tensed predicate logic descriptions
• Translated these logics into semantic queries
• Tested the queries against a known dataset
Actor Identification Continued
• Time is important
• Previous methods did not take event sequence into account
• Liaison Example:
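To make the role of event order concrete, here is a minimal sketch of a time-aware liaison check. It assumes the common brokerage-role definition of a liaison (the broker and the two parties it connects all belong to different groups) and adds the requirement that the incoming message precede the outgoing one. The Message tuple, the group_of mapping, and the function name are hypothetical; the tensed predicate logic formulation from the paper is not reproduced here.

```python
from collections import namedtuple

# One directed message event: who sent it, who received it, and when.
Message = namedtuple("Message", "time sender receiver")


def is_liaison(broker, messages, group_of):
    """Return True if broker relays between two outside groups, in order.

    Assumed definition: broker receives from a member of one group and
    LATER sends to a member of another group, and the three actors belong
    to three different groups.
    """
    incoming = [m for m in messages if m.receiver == broker]
    outgoing = [m for m in messages if m.sender == broker]
    for m_in in incoming:
        for m_out in outgoing:
            groups = {
                group_of[m_in.sender],
                group_of[broker],
                group_of[m_out.receiver],
            }
            if m_in.time < m_out.time and len(groups) == 3:
                return True
    return False


group_of = {"a": "g1", "b": "g2", "c": "g3"}
in_order = [Message(1, "a", "b"), Message(2, "b", "c")]
out_of_order = [Message(2, "a", "b"), Message(1, "b", "c")]
print(is_liaison("b", in_order, group_of))
print(is_liaison("b", out_of_order, group_of))
```

A purely structural check would treat both message sets the same, since the edges are identical; only the time-aware version distinguishes relaying from coincidence.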
Event Identification
• Still in the initial stages of this part of our work
• Given a general topic (a search term or hashtag), we can identify most of the related content from the dataset
• We have a means for alerting on all new posts regarding that term
• We can dig historically through the data and trace the path that an idea took
• We can identify the influential individuals (accounts) that played a part in the information spread
• Our test case was the KONY2012 Event
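The historical tracing step can be sketched as a filter followed by a chronological sort. The function name trace_term and the sample tweets are hypothetical, and the timestamp format is assumed to match the Date/Time strings shown in the botnet table; weekday and month names are parsed under the default English locale.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. "Sun Mar 20 15:27:02 +0000 2011".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def trace_term(tweets, term):
    """Return the tweets whose text mentions term, oldest first."""
    needle = term.lower()
    matches = [t for t in tweets if needle in t.get("text", "").lower()]
    return sorted(
        matches,
        key=lambda t: datetime.strptime(t["created_at"], TWITTER_TIME_FORMAT),
    )


# Hypothetical tweets, invented for illustration.
tweets = [
    {"text": "Watch this #KONY2012", "created_at": "Tue Mar 06 09:30:00 +0000 2012"},
    {"text": "unrelated post", "created_at": "Mon Mar 05 12:00:00 +0000 2012"},
    {"text": "#kony2012 just launched", "created_at": "Mon Mar 05 18:00:00 +0000 2012"},
]

path = trace_term(tweets, "#KONY2012")
print([t["text"] for t in path])
```

The first element of the returned list is the earliest matching post in the sample, which is the natural starting point when reconstructing how a term spread.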
Event Identification Continued
• Top 10 Twitter Accounts, sending and receiving KONY2012 related Tweets
| Directed @ Account Names | In-Degree | Origin Account Names | Out-Degree |
|---|---|---|---|
| tothekidswho | 625 | twittonpeace | 47 |
| Invisible | 125 | interhabernet | 44 |
| youtube | 118 | DailyisOut | 44 |
| helpspreadthis | 95 | MEDYA_TURK | 42 |
| justinbieber | 83 | haber_42 | 35 |
| prettypinkprobz | 48 | gundem_haber | 30 |
| ninadobrev | 48 | twittofpeace | 22 |
| MeekMill | 47 | korkmazhaber | 19 |
| ladygaga | 43 | tarafsiz_haber | 14 |
| KendallJenner | 39 | Son_DakikaHaber | 13 |
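Rankings like the one above can be produced by counting directed edges. The sketch below assumes each mention has already been reduced to a (source account, mentioned account) pair; the function name degree_tables and the sample edges are hypothetical and are not the KONY2012 data.

```python
from collections import Counter


def degree_tables(edges, top_n=10):
    """Rank accounts by in-degree and out-degree.

    edges is an iterable of (source, target) pairs, one per directed mention.
    Returns two lists of (account, count) tuples, highest counts first.
    """
    edges = list(edges)
    in_degree = Counter(target for _, target in edges)
    out_degree = Counter(source for source, _ in edges)
    return in_degree.most_common(top_n), out_degree.most_common(top_n)


# Hypothetical mention edges, invented for illustration.
edges = [("a", "x"), ("b", "x"), ("a", "y")]
top_in, top_out = degree_tables(edges)
print(top_in)
print(top_out)
```

High in-degree marks accounts that many others direct messages at, while high out-degree marks accounts that originate many directed messages.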
Event Identification Continued
• Top 10 Twitter Accounts, retweeting and being retweeted regarding KONY2012
| Retweeting Accounts | In-Degree | Message Source | Out-Degree |
|---|---|---|---|
| MedyaKonya | 8 | Stop____Kony | 2642 |
| twittonpeace | 8 | tothekidswho | 753 |
| haber_42 | 7 | konyfamous2012 | 716 |
| gundem_haber | 7 | Kony2012Help | 615 |
| korkmazhaber | 7 | stop______kony | 353 |
| DailyisOut | 7 | WESTOPKONY | 225 |
| interhabernet | 6 | zaynmalik | 221 |
| KONYA_ZAMAN | 6 | iSayStopKony | 127 |
| konya_time | 6 | Stop_2012_Kony | 80 |
| konyagazetesi | 5 | Kony_Awareness | 72 |
Conclusions
• We aimed to answer the following questions when we started this work:
– Can we come up with a way of classifying users based on actor types?
– Can we determine who the opinion leaders or influencers are?
– Can we determine how information spreads on these networks?
– Can we detect malicious social network use?
– Are there information security applications for social network data-mining?
• I think we did a good job of providing at least some cursory answers to these questions
Future Work
• We have applied for a data grant from Twitter
• We are in the process of moving our entire dataset to the lab at Clarkson and building up a new capture/analysis system
• I am planning on pursuing the semantic side of social network analysis
– Currently only one SNA semantic ontology exists, and it exists only on paper.
– I am planning on rolling both the actor and event analysis into one approach
which will be part of a new ontology
Acknowledgements
• I would like to thank:
– Dr. Matthews
– Dr. Bay
– Dr. Lynch
– Dr. Schuckers
– Dr. Liu
References
[1] Gladwell, M. (2000). The tipping point. Boston: Little, Brown and Company
Contact
whitejs@clarkson.edu
Questions
Questions?
Supplemental Material
• DDFS
• Twitter JSON Key Fields
profile_link_color Coordinates verified
In_reply_to_screen_name Geo time_zone
In_reply_to_status_id text statuses_count
In_reply_to_status_id_str entities Contributors
In_reply_to_user_id place protected
profile_background_color contributors_enabled truncated
profile_background_tile default_profile retweeted
default_profile_image description is_translator
follow_request_sent followers_count location
friends_count geo_enabled favorites_count
profile_image_url_https listed_count following
profile_background_image_url notifications retweet_count
background_image_url_https name created_at
profile_image_url lang Favorited
sidebar_border_color use_background_image Id_str
sidebar_fill_color screen_name Created_at
profile_text_color show_all_inline_media Id
url utc_offset
• BEK Infectious Account Visualization
• Tensed Predicate Logic Key
• Coalmine User Interface
| Clarkson University 42/42

Contenu connexe

Tendances

12 Network Experiments and Interventions: Studying Information Diffusion and ...
12 Network Experiments and Interventions: Studying Information Diffusion and ...12 Network Experiments and Interventions: Studying Information Diffusion and ...
12 Network Experiments and Interventions: Studying Information Diffusion and ...dnac
 
Virtual Assisted Self Interview Research
Virtual Assisted Self Interview ResearchVirtual Assisted Self Interview Research
Virtual Assisted Self Interview ResearchMark Bell
 
Cognitive Models in Recommender Systems
Cognitive Models in Recommender SystemsCognitive Models in Recommender Systems
Cognitive Models in Recommender SystemsChristoph Trattner
 
Opinion Dynamics on Networks
Opinion Dynamics on NetworksOpinion Dynamics on Networks
Opinion Dynamics on NetworksMason Porter
 
Social Computing in the area of Big Data at the Know-Center Austria's leading...
Social Computing in the area of Big Data at the Know-Center Austria's leading...Social Computing in the area of Big Data at the Know-Center Austria's leading...
Social Computing in the area of Big Data at the Know-Center Austria's leading...Christoph Trattner
 
Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...
Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...
Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...Denis Parra Santander
 
MIT Program on Information Science Talk -- Julia Flanders on Jobs, Roles, Ski...
MIT Program on Information Science Talk -- Julia Flanders on Jobs, Roles, Ski...MIT Program on Information Science Talk -- Julia Flanders on Jobs, Roles, Ski...
MIT Program on Information Science Talk -- Julia Flanders on Jobs, Roles, Ski...Micah Altman
 
Social Network Analysis in Two Parts
Social Network Analysis in Two PartsSocial Network Analysis in Two Parts
Social Network Analysis in Two PartsPatti Anklam
 
01 Introduction to Networks Methods and Measures
01 Introduction to Networks Methods and Measures01 Introduction to Networks Methods and Measures
01 Introduction to Networks Methods and Measuresdnac
 
Social network analysis intro part I
Social network analysis intro part ISocial network analysis intro part I
Social network analysis intro part ITHomas Plotkowiak
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network AnalysisSujoy Bag
 
From research to reality: Transforming libraries for a global information world.
From research to reality: Transforming libraries for a global information world.From research to reality: Transforming libraries for a global information world.
From research to reality: Transforming libraries for a global information world.Lynn Connaway
 
Computational Social Science:The Collaborative Futures of Big Data, Computer ...
Computational Social Science:The Collaborative Futures of Big Data, Computer ...Computational Social Science:The Collaborative Futures of Big Data, Computer ...
Computational Social Science:The Collaborative Futures of Big Data, Computer ...Academia Sinica
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Xiaohan Zeng
 
Social machines: theory design and incentives
Social machines: theory design and incentivesSocial machines: theory design and incentives
Social machines: theory design and incentivesElena Simperl
 
11 Network Experiments and Interventions
11 Network Experiments and Interventions11 Network Experiments and Interventions
11 Network Experiments and Interventionsdnac
 
Recommending Items in Social Tagging Systems Using Tag and Time Information
Recommending Items in Social Tagging Systems Using Tag and Time InformationRecommending Items in Social Tagging Systems Using Tag and Time Information
Recommending Items in Social Tagging Systems Using Tag and Time InformationChristoph Trattner
 
Paper Writing in Applied Mathematics (slightly updated slides)
Paper Writing in Applied Mathematics (slightly updated slides)Paper Writing in Applied Mathematics (slightly updated slides)
Paper Writing in Applied Mathematics (slightly updated slides)Mason Porter
 
Recommending Tags with a Model of Human Categorization
Recommending Tags with a Model of Human CategorizationRecommending Tags with a Model of Human Categorization
Recommending Tags with a Model of Human CategorizationChristoph Trattner
 
Social Network Analysis (Part 1)
Social Network Analysis (Part 1)Social Network Analysis (Part 1)
Social Network Analysis (Part 1)Vala Ali Rohani
 

Tendances (20)

12 Network Experiments and Interventions: Studying Information Diffusion and ...
12 Network Experiments and Interventions: Studying Information Diffusion and ...12 Network Experiments and Interventions: Studying Information Diffusion and ...
12 Network Experiments and Interventions: Studying Information Diffusion and ...
 
Virtual Assisted Self Interview Research
Virtual Assisted Self Interview ResearchVirtual Assisted Self Interview Research
Virtual Assisted Self Interview Research
 
Cognitive Models in Recommender Systems
Cognitive Models in Recommender SystemsCognitive Models in Recommender Systems
Cognitive Models in Recommender Systems
 
Opinion Dynamics on Networks
Opinion Dynamics on NetworksOpinion Dynamics on Networks
Opinion Dynamics on Networks
 
Social Computing in the area of Big Data at the Know-Center Austria's leading...
Social Computing in the area of Big Data at the Know-Center Austria's leading...Social Computing in the area of Big Data at the Know-Center Austria's leading...
Social Computing in the area of Big Data at the Know-Center Austria's leading...
 
Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...
Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...
Network Visualization guest lecture at #DataVizQMSS at @Columbia / #SNA at PU...
 
MIT Program on Information Science Talk -- Julia Flanders on Jobs, Roles, Ski...
MIT Program on Information Science Talk -- Julia Flanders on Jobs, Roles, Ski...MIT Program on Information Science Talk -- Julia Flanders on Jobs, Roles, Ski...
MIT Program on Information Science Talk -- Julia Flanders on Jobs, Roles, Ski...
 
Social Network Analysis in Two Parts
Social Network Analysis in Two PartsSocial Network Analysis in Two Parts
Social Network Analysis in Two Parts
 
01 Introduction to Networks Methods and Measures
01 Introduction to Networks Methods and Measures01 Introduction to Networks Methods and Measures
01 Introduction to Networks Methods and Measures
 
Social network analysis intro part I
Social network analysis intro part ISocial network analysis intro part I
Social network analysis intro part I
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network Analysis
 
From research to reality: Transforming libraries for a global information world.
From research to reality: Transforming libraries for a global information world.From research to reality: Transforming libraries for a global information world.
From research to reality: Transforming libraries for a global information world.
 
Computational Social Science:The Collaborative Futures of Big Data, Computer ...
Computational Social Science:The Collaborative Futures of Big Data, Computer ...Computational Social Science:The Collaborative Futures of Big Data, Computer ...
Computational Social Science:The Collaborative Futures of Big Data, Computer ...
 
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear...
 
Social machines: theory design and incentives
Social machines: theory design and incentivesSocial machines: theory design and incentives
Social machines: theory design and incentives
 
11 Network Experiments and Interventions
11 Network Experiments and Interventions11 Network Experiments and Interventions
11 Network Experiments and Interventions
 
Recommending Items in Social Tagging Systems Using Tag and Time Information
Recommending Items in Social Tagging Systems Using Tag and Time InformationRecommending Items in Social Tagging Systems Using Tag and Time Information
Recommending Items in Social Tagging Systems Using Tag and Time Information
 
Paper Writing in Applied Mathematics (slightly updated slides)
Paper Writing in Applied Mathematics (slightly updated slides)Paper Writing in Applied Mathematics (slightly updated slides)
Paper Writing in Applied Mathematics (slightly updated slides)
 
Recommending Tags with a Model of Human Categorization
Recommending Tags with a Model of Human CategorizationRecommending Tags with a Model of Human Categorization
Recommending Tags with a Model of Human Categorization
 
Social Network Analysis (Part 1)
Social Network Analysis (Part 1)Social Network Analysis (Part 1)
Social Network Analysis (Part 1)
 

Similaire à Social Network Analysis Applications and Approach

The analysis of qualitative data 22nd Oct 2015
The analysis of qualitative data 22nd Oct 2015The analysis of qualitative data 22nd Oct 2015
The analysis of qualitative data 22nd Oct 2015Matthew Maycock
 
User behavior model & recommendation on basis of social networks
User behavior model & recommendation on basis of social networks User behavior model & recommendation on basis of social networks
User behavior model & recommendation on basis of social networks Shah Alam Sabuj
 
NCME Big Data in Education
NCME Big Data  in EducationNCME Big Data  in Education
NCME Big Data in EducationPhilip Piety
 
Abdulwahaab Saif S Alsaif Investigate The Impact Of Social Media On Students
Abdulwahaab Saif S Alsaif Investigate The Impact Of Social Media On StudentsAbdulwahaab Saif S Alsaif Investigate The Impact Of Social Media On Students
Abdulwahaab Saif S Alsaif Investigate The Impact Of Social Media On StudentsLisa Garcia
 
Working with Social Media Data: Ethics & good practice around collecting, usi...
Working with Social Media Data: Ethics & good practice around collecting, usi...Working with Social Media Data: Ethics & good practice around collecting, usi...
Working with Social Media Data: Ethics & good practice around collecting, usi...Nicola Osborne
 
Panel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still WorkPanel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still WorkDaniel S. Katz
 
[DSC Europe 22] Machine learning algorithms as tools for student success pred...
[DSC Europe 22] Machine learning algorithms as tools for student success pred...[DSC Europe 22] Machine learning algorithms as tools for student success pred...
[DSC Europe 22] Machine learning algorithms as tools for student success pred...DataScienceConferenc1
 
Social Network Analysis & an Introduction to Tools
Social Network Analysis & an Introduction to ToolsSocial Network Analysis & an Introduction to Tools
Social Network Analysis & an Introduction to ToolsPatti Anklam
 
Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using RedditDigital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using RedditSC CTSI at USC and CHLA
 
A brief introduction to crowdsourcing for data collection
A brief introduction to crowdsourcing for data collectionA brief introduction to crowdsourcing for data collection
A brief introduction to crowdsourcing for data collectionElena Simperl
 
Practical Applications for Social Network Analysis in Public Sector Marketing...
Practical Applications for Social Network Analysis in Public Sector Marketing...Practical Applications for Social Network Analysis in Public Sector Marketing...
Practical Applications for Social Network Analysis in Public Sector Marketing...Mike Kujawski
 
Learning Analytics: Seeking new insights from educational data
Learning Analytics: Seeking new insights from educational dataLearning Analytics: Seeking new insights from educational data
Learning Analytics: Seeking new insights from educational dataAndrew Deacon
 
Aligning Learning Analytics with Classroom Practices & Needs
Aligning Learning Analytics with Classroom Practices & NeedsAligning Learning Analytics with Classroom Practices & Needs
Aligning Learning Analytics with Classroom Practices & NeedsSimon Knight
 
Master Thesis: The Design of a Rich Internet Application for Exploratory Sear...
Master Thesis: The Design of a Rich Internet Application for Exploratory Sear...Master Thesis: The Design of a Rich Internet Application for Exploratory Sear...
Master Thesis: The Design of a Rich Internet Application for Exploratory Sear...Roman Atachiants
 
Lecture_1_Intro.pdf
Lecture_1_Intro.pdfLecture_1_Intro.pdf
Lecture_1_Intro.pdfpaijitk
 
ONA and the tools landscape
ONA and the tools landscapeONA and the tools landscape
ONA and the tools landscapePatti Anklam
 
Online Research: New Challenges and Opportunities
Online Research: New Challenges and OpportunitiesOnline Research: New Challenges and Opportunities
Online Research: New Challenges and OpportunitiesGlen Farrelly
 
Survey Research Methods with Lynn Silipigni Connaway
Survey Research Methods with Lynn Silipigni ConnawaySurvey Research Methods with Lynn Silipigni Connaway
Survey Research Methods with Lynn Silipigni ConnawayLynn Connaway
 

Similaire à Social Network Analysis Applications and Approach (20)

ase-social-informatics (6)
ase-social-informatics (6)ase-social-informatics (6)
ase-social-informatics (6)
 
The analysis of qualitative data 22nd Oct 2015
The analysis of qualitative data 22nd Oct 2015The analysis of qualitative data 22nd Oct 2015
The analysis of qualitative data 22nd Oct 2015
 
User behavior model & recommendation on basis of social networks
User behavior model & recommendation on basis of social networks User behavior model & recommendation on basis of social networks
User behavior model & recommendation on basis of social networks
 
NCME Big Data in Education
NCME Big Data  in EducationNCME Big Data  in Education
NCME Big Data in Education
 
Abdulwahaab Saif S Alsaif Investigate The Impact Of Social Media On Students
Abdulwahaab Saif S Alsaif Investigate The Impact Of Social Media On StudentsAbdulwahaab Saif S Alsaif Investigate The Impact Of Social Media On Students
Abdulwahaab Saif S Alsaif Investigate The Impact Of Social Media On Students
 
Working with Social Media Data: Ethics & good practice around collecting, usi...
Working with Social Media Data: Ethics & good practice around collecting, usi...Working with Social Media Data: Ethics & good practice around collecting, usi...
Working with Social Media Data: Ethics & good practice around collecting, usi...
 
Panel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still WorkPanel: Our Scholarly Recognition System Doesn’t Still Work
Panel: Our Scholarly Recognition System Doesn’t Still Work
 
[DSC Europe 22] Machine learning algorithms as tools for student success pred...
[DSC Europe 22] Machine learning algorithms as tools for student success pred...[DSC Europe 22] Machine learning algorithms as tools for student success pred...
[DSC Europe 22] Machine learning algorithms as tools for student success pred...
 
Social Network Analysis & an Introduction to Tools
Social Network Analysis & an Introduction to ToolsSocial Network Analysis & an Introduction to Tools
Social Network Analysis & an Introduction to Tools
 
Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using RedditDigital Scholar Webinar: Recruiting Research Participants Online Using Reddit
Digital Scholar Webinar: Recruiting Research Participants Online Using Reddit
 
A brief introduction to crowdsourcing for data collection
A brief introduction to crowdsourcing for data collectionA brief introduction to crowdsourcing for data collection
A brief introduction to crowdsourcing for data collection
 
Practical Applications for Social Network Analysis in Public Sector Marketing...
Practical Applications for Social Network Analysis in Public Sector Marketing...Practical Applications for Social Network Analysis in Public Sector Marketing...
Practical Applications for Social Network Analysis in Public Sector Marketing...
 
Learning Analytics: Seeking new insights from educational data
Learning Analytics: Seeking new insights from educational dataLearning Analytics: Seeking new insights from educational data
Learning Analytics: Seeking new insights from educational data
 
Aligning Learning Analytics with Classroom Practices & Needs
Aligning Learning Analytics with Classroom Practices & NeedsAligning Learning Analytics with Classroom Practices & Needs
Aligning Learning Analytics with Classroom Practices & Needs
 
Master Thesis: The Design of a Rich Internet Application for Exploratory Sear...
Master Thesis: The Design of a Rich Internet Application for Exploratory Sear...Master Thesis: The Design of a Rich Internet Application for Exploratory Sear...
Master Thesis: The Design of a Rich Internet Application for Exploratory Sear...
 
Lecture_1_Intro.pdf
Lecture_1_Intro.pdfLecture_1_Intro.pdf
Lecture_1_Intro.pdf
 
ONA and the tools landscape
ONA and the tools landscapeONA and the tools landscape
ONA and the tools landscape
 
NCCMT Spotlight Webinar: MetaQAT
NCCMT Spotlight Webinar: MetaQATNCCMT Spotlight Webinar: MetaQAT
NCCMT Spotlight Webinar: MetaQAT
 
Online Research: New Challenges and Opportunities
Online Research: New Challenges and OpportunitiesOnline Research: New Challenges and Opportunities
Online Research: New Challenges and Opportunities
 
Survey Research Methods with Lynn Silipigni Connaway
Survey Research Methods with Lynn Silipigni ConnawaySurvey Research Methods with Lynn Silipigni Connaway
Survey Research Methods with Lynn Silipigni Connaway
 



Social Network Analysis Applications and Approach

  • 1. Social Network Analysis
    Approach and Applications
    Joshua S. White
    PhD Candidate, Engineering Science
    April 22, 2014
    Committee Members:
    Jeanna N. Matthews, PhD (Advisor)
    John S. Bay, PhD (External Examiner)
    Chris Lynch, PhD
    Chen Liu, PhD
    Stephanie C. Schuckers, PhD
    | Clarkson University 1/42
  • 2. Outline
    Motivation (3)
      Problem Questions (4)
    Method & Publications (5)
    Coalmine (6)
    PySNAP (7)
    Established Dataset (8)
      Insights into the Data (9)
    Botnet Command & Control Detection (10)
    Phishing Website Detection (12)
      Phishing Website Detection Continuum: ML based detection (14)
    Malware Infection Vector Detection (15)
    Actor Identification (19)
    Event Identification (24)
    Conclusions (30)
    Future Work (31)
    Acknowledgements (32)
    References (33)
    Contact (34)
    Questions (35)
    Supplemental Material (36)
  • 3. Motivation
    Partially inspired by Gladwell’s book, The Tipping Point [1], in which he discusses how life can be thought of as an epidemic. Gladwell’s rigor has drawn some criticism; for our use, however, the book is about inspiration and motivation, not accuracy.
    The Book’s Key Points “for our purposes”
    • Actors (Connectors, Mavens, Salesmen).
    • Information spreads like disease.
    • Ideas reach a tipping point (critical mass).
    Let’s Face It - Social Networks Are Fun
    • We are a social species that enjoys communicating and self-adulation.
  • 4. Problem Questions
    • Can we come up with a way of classifying users based on actor types?
    • Can we determine who the opinion leaders or influencers are?
    • Can we determine how information spreads on these networks?
    • Can we detect malicious social network use?
    • Are there information security applications for social network data-mining?
  • 5. Method & Publications
    • Establish a reliable collection mechanism.
    • Establish a large dataset that can be utilized to answer each question.
    • Use a case study approach, whereby each case feeds the next.
    • Produce each case study as an individual publication or presentation.
      – 3 x Published Proceedings
      – 2 x Pending Proceedings
      – 3 x Invited Presentations
  • 6. Coalmine
    • Scales well based on initial tests
    • Useful for both manual and automated detection
    • Allowed us to refine our data collection capabilities
    At the Time (Future Work)
    • Rebuild of the tool to fix scaling limitations
    • More extensible Map/Reduce method
    • Inclusion of native multi-threading capability
    • New storage and distribution method
    • New algorithms for automated opinion leader detection
  • 7. PySNAP
    • Fixes all of the previous issues with Coalmine
    • Completely reimplemented in Python with a few supporting Bash scripts
    • Utilizes the DISCO MapReduce framework, also built on Python
    • Includes a better method for data capture, which had previously been bolted on to Coalmine
    • Allowed us to establish a large dataset for future work
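The map/reduce pattern mentioned on this slide can be illustrated without Disco itself. The following is a minimal sketch in plain Python; it is not PySNAP's code and does not use the Disco API. The function names (`map_hashtags`, `reduce_counts`, `run_job`) and the sample tweets are hypothetical, chosen only to show how a per-tweet map step feeds a summing reduce step.

```python
from collections import defaultdict
from itertools import chain

def map_hashtags(tweet_text):
    """Map step: emit a (hashtag, 1) pair for every hashtag in one tweet."""
    for token in tweet_text.split():
        if token.startswith("#") and len(token) > 1:
            # Normalize case and strip trailing punctuation before emitting.
            yield token.lower().rstrip(".,!?:;"), 1

def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def run_job(tweets):
    """Run the map step over every tweet, then reduce the combined output."""
    return reduce_counts(chain.from_iterable(map_hashtags(t) for t in tweets))

# Hypothetical sample input, for illustration only.
sample = ["Watch this #KONY2012 video", "#kony2012 is trending!", "no tags here"]
counts = run_job(sample)
```

In a real MapReduce framework the map calls would run in parallel across workers and the pairs would be grouped by key before reduction; here both steps run in a single process to keep the sketch self-contained.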
  • 8. Established Dataset
    • Over the course of 2012 we collected 165 TB of Twitter data (uncompressed)
      – 175 days collected, 147 full days
        ∗ Estimated 45 billion tweets
      – Recently released estimates place total Twitter traffic at 175 million tweets per day in 2012.
      – Thus our daily collection rates varied between 50% and 80% of total Twitter traffic.
      – We captured complete tweet data in JSON format using Twitter’s REST API.
        ∗ This data includes a large number of additional fields other than the message text, all of which can be taken into account when doing measurements.
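Because each tweet was captured as a complete JSON record, fields beyond the message text are available to any measurement. Below is a minimal sketch of reading such records, assuming (this is not stated on the slide) that they are stored one JSON object per line. The helper names `iter_tweets` and `summarize` and the sample record are hypothetical; the keys used (`id_str`, `created_at`, `text`, `retweet_count`, `user.screen_name`) are standard tweet fields.

```python
import json

def iter_tweets(lines):
    """Yield one parsed tweet per non-empty line, skipping malformed records."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue

def summarize(tweet):
    """Pull a few fields out of a tweet dict, tolerating missing keys."""
    user = tweet.get("user") or {}
    return {
        "id_str": tweet.get("id_str"),
        "created_at": tweet.get("created_at"),
        "screen_name": user.get("screen_name"),
        "text": tweet.get("text", ""),
        "retweet_count": tweet.get("retweet_count", 0),
    }

# Hypothetical records, not taken from the dataset.
raw = [
    '{"id_str": "1", "created_at": "Sun Mar 20 15:27:02 +0000 2011", '
    '"text": "hello", "retweet_count": 2, "user": {"screen_name": "example_user"}}',
    "not json",
    "",
]
rows = [summarize(t) for t in iter_tweets(raw)]
```

Skipping malformed lines rather than raising keeps a long batch run from stopping on a single truncated record, which seems a reasonable default for a capture of this size.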
  • 9. Insights into the Data
  • 10. Botnet Command & Control Detection
    • Joshua S. White, Jeanna N. Matthews, and John L. Stacy. Coalmine: an experience in building a system for social media analytics. In SPIE Defense, Security, and Sensing, pages 84080A-84080A. International Society for Optics and Photonics, 2012.
  • 11. Botnet Command & Control Detection Continued
    Date/Time | UID | Text MSG | Entropy | Source
    Sun Mar 20 15:27:02 +0000 2011 | 49492150668365824 | Shutdown -r now | 3.37355726227518 | http://twitter.com/Ebastos
    Sun Mar 20 01:25:20 +0000 2011 | 49280326475853825 | # shutdown -h now | 3.37355726227518 | http://twitter.com/ohdediku
    Sun Mar 20 21:40:53 +0000 2011 | 49586229964062720 | $ sudo shutdown -h now | 3.37355726227518 | http://twitter.com/souzabruno
    Sun Mar 20 19:38:41 +0000 2011 | 49555476769280000 | Text: sudo shutdown -h now | 3.37355726227518 | http://twitter.com/stormyblack
    Sun Mar 20 18:51:51 +0000 2011 | 49543693820116992 | shutdown -now | 3.37355726227518 | http://twitter.com/godzilla2k9
    Sun Mar 20 18:52:30 +0000 2011 | 49543856840126464 | shutdown -h now !: | 3.37355726227518 | http://twitter.com/ph3nagen
    Sun Mar 20 18:52:30 +0000 2011 | 49600582113177600 | shutdown -H now. | 3.37355726227518 | http://twitter.com/willybistuer
    Sun Mar 20 22:37:54 +0000 2011 | 49597117039251457 | elmenda: su shutdown -h now | 3.37355726227518 | http://twitter.com/NeoVasili
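The table reports an entropy value for each message. The slide does not say how that value was computed, so the following is only a sketch of one common choice, character-level Shannon entropy in bits per character; the function name `shannon_entropy` is an assumption, and the sketch is not expected to reproduce the figures in the table.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Return the character-level Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # H = -sum(p * log2(p)) over the relative frequency p of each character.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Message texts taken from the table above; the scores printed here come from
# this sketch, not from the original tool.
for message in ["shutdown -h now", "$ sudo shutdown -h now"]:
    print(message, round(shannon_entropy(message), 3))
```

A score like this could be used to group short, command-like messages that share a similar character distribution, which is one plausible way to surface candidate command-and-control traffic for manual review.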
  • 12. Phishing Website Detection
    • Joshua S. White, Jeanna N. Matthews, and John L. Stacy. A method for the automated detection phishing websites through both site characteristics and image analysis. In SPIE Defense, Security, and Sensing, pages 84080B-84080B. International Society for Optics and Photonics, 2012.
  • 13. Phishing Website Detection Continued
    (F)raud / (L)egit | URL | Structural Fingerprint | Page Title | pHash Value | Hamming Score
    Paypal Fraudulent | http://si4r.com/_paypal.co.uk/webscr.html?cmd=SignIn&co_partnerId=2&pUserId=&siteid=0&pageType=&pa1=&i1=&bshowgif=&UsingSSL=&ru=&pp=&pa2=&errmsg=&runame= | 0,7,1,0,2 | RETURNED NOTHING | 16716169687489800000 | 1
    Paypal Legitimate | https://www.paypal.com/cgi-bin/webscr?cmd=_login-submit&dispatch=5885d80a13c0db1f8e263663d3faee8d1e83f46a36995b3856cef1e18897ad75 | 27,3,0,0,2 | Redirecting - Paypal | 18439707190431800000 | 0
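The table pairs a perceptual hash (pHash) with a Hamming score. As a rough illustration of the comparison step only, the sketch below counts differing bits between two integer hashes. The hash values, the `max_bits` threshold, and the function names are hypothetical, and the slide's "Hamming Score" column may be scaled or thresholded differently from the raw bit count shown here.

```python
def hamming_distance(hash_a, hash_b):
    """Count the bit positions at which two non-negative integer hashes differ."""
    return bin(hash_a ^ hash_b).count("1")

def is_similar(hash_a, hash_b, max_bits=8):
    """Treat two hashes as visually similar when few bits differ (threshold assumed)."""
    return hamming_distance(hash_a, hash_b) <= max_bits

# Hypothetical 64-bit hash values, for illustration only.
known_logo = 0xF0F0F0F0F0F0F0F0
candidate = 0xF0F0F0F0F0F0F0F1
distance = hamming_distance(known_logo, candidate)
```

Perceptual hashes are designed so that visually similar images differ in only a few bits, which is why a small bit distance between a candidate page screenshot and a known brand page is a useful signal.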
  • 14. Phishing Website Detection Continuum: ML based detection
    • Title: An Image-based Feature Extraction Approach for Phishing Website Detection
    • Authors: Hao Jiang, Joshua White, Jeanna Matthews
    • Builds on our previous work in phishing website detection, specifically the image analysis approach
    • Uses a machine learning based approach to identify the most prominent images on a webpage, usually the site’s logo
    • Can detect phishing sites that the pHash/Hamming distance method judges to be not similar
      – These are the “poor quality” phishing sites
  • 15. Malware Infection Vector Detection
    • BEK (the Blackhole Exploit Kit) was the predominant MaaS (Malware as a Service) in 2012.
    • It accounted for an estimated 29% of all malicious URLs.
    • BEK licenses went for around $1,500 USD.
    • BEK used Twitter as its primary means of spreading infectious URLs.
    • Our method detects these malicious URLs and infectious accounts on a large scale.
  • 16. Malware Infection Vector Detection Continued
    • Joshua S. White and Jeanna N. Matthews, “It’s you on photo?: Automatic detection of Twitter accounts infected with the Blackhole Exploit Kit,” Malicious and Unwanted Software: "The Americas" (MALWARE), 2013 8th International Conference on, pp. 51-58, 22-24 Oct. 2013. doi: 10.1109/MALWARE.2013.6703685
  • 17. Malware Infection Vector Detection Continued
  • 18. Malware Infection Vector Detection Continued
  • 19. Actor Identification
    • Title: Connectors, Mavens, Salesmen and More: Actor Based Online Social Network (OSN) Analysis Method Using Tensed Predicate Logic
    • Authors: Joshua White and Jeanna Matthews
    • Submitted to KDD2014 (Knowledge Discovery and Data Mining) Conference “Data Mining for Social Good”
    • Used multiple definitions of actor types to create tensed predicate logic descriptions
    • Translated these logics into semantic queries
    • Tested the queries against a known dataset
  • 20. Actor Identification Continued
  • 21. Actor Identification Continued
    • Time is important
    • Previous methods did not take event sequence into account
    • Liaison Example:
  • 22. Actor Identification Continued
  • 23. Actor Identification Continued
  • 24. Event Identification
    • Still in the initial stages of this part of our work
    • Given a general topic (a search term or hashtag), we can identify most of the related content in the dataset
    • We have a means of alerting on all new posts regarding that term
    • We can dig historically through the data and trace the path that an idea took
    • We can identify the influential individuals (accounts) that played a part in the information spread
    • Our test case was the KONY2012 event
  • 25. Event Identification Continued
  • 26. Event Identification Continued
    • Top 10 Twitter accounts sending and receiving KONY2012 related tweets
    Directed @ Account Names | In-Degree | Origin Account Names | Out-Degree
    tothekidswho | 625 | twittonpeace | 47
    Invisible | 125 | interhabernet | 44
    youtube | 118 | DailyisOut | 44
    helpspreadthis | 95 | MEDYA_TURK | 42
    justinbieber | 83 | haber_42 | 35
    prettypinkprobz | 48 | gundem_haber | 30
    ninadobrev | 48 | twittofpeace | 22
    MeekMill | 47 | korkmazhaber | 19
    ladygaga | 43 | tarafsiz_haber | 14
    KendallJenner | 39 | Son_DakikaHaber | 13
  • 27. Event Identification Continued
    • Top 10 Twitter accounts retweeting and being retweeted regarding KONY2012
    Retweeting Accounts | In-Degree | Message Source | Out-Degree
    MedyaKonya | 8 | Stop____Kony | 2642
    twittonpeace | 8 | tothekidswho | 753
    haber_42 | 7 | konyfamous2012 | 716
    gundem_haber | 7 | Kony2012Help | 615
    korkmazhaber | 7 | stop______kony | 353
    DailyisOut | 7 | WESTOPKONY | 225
    interhabernet | 6 | zaynmalik | 221
    KONYA_ZAMAN | 6 | iSayStopKony | 127
    konya_time | 6 | Stop_2012_Kony | 80
    konyagazetesi | 5 | Kony_Awareness | 72
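In-degree and out-degree rankings like those in the mention table can be derived by treating each @mention as a directed edge from the sender to the mentioned account. The following is a minimal sketch under that assumption; the function name `mention_degrees`, the regular expression, and the sample tweets are hypothetical and are not taken from the KONY2012 data or from the authors' tooling.

```python
import re
from collections import Counter

# Matches @screen_name; screen names consist of letters, digits, and underscores.
MENTION = re.compile(r"@(\w+)")

def mention_degrees(tweets):
    """Count out-degree per sender and in-degree per mentioned account.

    Each tweet is a (sender_screen_name, text) pair; every @mention in the
    text is treated as one directed edge from the sender to that account.
    """
    in_degree, out_degree = Counter(), Counter()
    for sender, text in tweets:
        for target in MENTION.findall(text):
            out_degree[sender] += 1
            in_degree[target] += 1
    return in_degree, out_degree

# Hypothetical tweets, for illustration only.
sample = [
    ("alice", "@bob have you seen this? #KONY2012"),
    ("carol", "@bob @dave #KONY2012"),
    ("alice", "no mentions here"),
]
in_deg, out_deg = mention_degrees(sample)
top_in = in_deg.most_common(10)
```

Screen names are compared here exactly as written; since Twitter treats them case-insensitively, a fuller version would probably lowercase both sender and target before counting. A retweet ranking could be built the same way by extracting the retweeted account instead of the mentions.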
  • 28. Event Identification Continued
  • 29. Event Identification Continued
  • 30. Conclusions
    • We aimed to answer the following questions when we started this work:
      – Can we come up with a way of classifying users based on actor types?
      – Can we determine who the opinion leaders or influencers are?
      – Can we determine how information spreads on these networks?
      – Can we detect malicious social network use?
      – Are there information security applications for social network data-mining?
    • I think we did a good job of providing at least some cursory answers to these questions
  • 31. Future Work
    • We have applied for a data grant from Twitter
    • We are in the process of moving our entire dataset to the lab at Clarkson and building up a new capture/analysis system
    • I am planning on pursuing the semantic side of social network analysis
      – Currently only one SNA semantic ontology exists, and it exists only on paper.
      – I am planning on rolling both the actor and event analysis into one approach, which will be part of a new ontology
  • 32. Acknowledgements
    • I would like to thank:
      – Dr. Matthews
      – Dr. Bay
      – Dr. Lynch
      – Dr. Schuckers
      – Dr. Liu
  • 33. References
    [1] Gladwell, M. (2000). The Tipping Point. Boston: Little, Brown and Company.
  • 37. DDFS
  • 39. Twitter JSON Key Fields
    profile_link_color, Coordinates, verified, In_reply_to_screen_name, Geo, time_zone,
    In_reply_to_status_id, text, statuses_count, In_reply_to_status_id_str, entities, Contributors,
    In_reply_to_user_id, place, protected, profile_background_color, contributors_enabled, truncated,
    profile_background_title, default_profile, retweeted, default_profile_image, description, id_translator,
    follow_request_sent, followers_count, location, friends_count, geo_enabled, favorites_count,
    profile_image_url_https, listed_count, following, profile_background_image_url, notifications, retweet_count,
    background_image_url_https, name, created_at, profile_image_url, lang, Favorited,
    sidebar_border_color, use_background_image, Id_str, sidebar_fill_color, screen_name, Created_at,
    profile_text_color, show_all_inline_media, Id, url, utc_offset
  • 40. BEK Infectious Account Visualization
  • 41. Tensed Predicate Logic Key
  • 42. Coalmine User Interface