Duplicate Detection on Hoaxy Dataset
Team: Spoilers
Social Media Mining Hackathon
Essa, Yasanka and Sreeja
Hoaxy
- Collects low-credibility documents from the web, e.g., fake news, hoaxes, rumours, conspiracy theories, etc.
- Visualizes the spread of a document URL in Twitter
- Temporal trends
- # shares in Twitter
- Diffusion network
- Retweets, replies, mentions, etc.
- Includes bot scores, details of fact checks, etc.
https://hoaxy.iuni.iu.edu
Hackathon Objectives
Given two low-credibility claims, determine whether they are duplicates
- Based on the linguistic features of the claims
- looking at the content of the claims (obvious!)
- Based on temporal dynamics in Twitter
- looking at the diffusion networks of the two claim URLs (wow!)
In a nutshell
[Diagram: Linguistic Features and Propagation Dynamics feed Duplicate Detection, which answers Yes / No]
Overall Architecture
[Diagram: Hoaxy Database, Hoaxy API, and Web Crawler supply document URLs, Documents, and Diffusion Networks. Documents go to Unsupervised Learning (Document Clustering: LDA, LSI, HDP), which yields Duplicates. Diffusion Networks go to Supervised Learning (Random Forest), which yields Duplicates.]
Hoaxy API
● It provides seven types of queries:
○ GET Articles, GET LatestArticles, GET Network, GET Timeline, GET TopArticles, GET TopUsers, and GET Tweets.
● GET Articles query
○ It requires a keyword and a publication date range.
○ It returns the source URL, publication date, article/document id, domain, site type, total tweets, title, and score.
● GET Network query
○ It requires the article/document id, the maximum number of edges and nodes, and whether mentions are included.
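The two queries described above can be sketched as plain parameter builders. This is a minimal sketch only: the function names and the parameter keys (`query`, `start_date`, `end_date`, `article_id`, `max_nodes`, `max_edges`, `include_mentions`) are hypothetical placeholders that mirror the slide, not the documented names of the Hoaxy API, and no request is sent.

```python
from datetime import date


def articles_query(keyword, start, end):
    """Parameters for a GET Articles query: a keyword plus a publication date range.

    The key names are hypothetical; consult the Hoaxy API documentation for the real ones.
    """
    if start > end:
        raise ValueError("start date must not be after end date")
    return {
        "query": keyword,
        "start_date": start.isoformat(),
        "end_date": end.isoformat(),
    }


def network_query(article_id, max_nodes, max_edges, include_mentions):
    """Parameters for a GET Network query on one article/document id (hypothetical keys)."""
    if max_nodes <= 0 or max_edges <= 0:
        raise ValueError("max_nodes and max_edges must be positive")
    return {
        "article_id": article_id,
        "max_nodes": max_nodes,
        "max_edges": max_edges,
        "include_mentions": include_mentions,
    }


# Example values only; the keyword and id are made up for illustration.
articles_params = articles_query("election", date(2018, 9, 16), date(2018, 10, 16))
network_params = network_query(42, max_nodes=1000, max_edges=9000, include_mentions=True)
```

Keeping the parameters in plain dictionaries makes it easy to pass them to whichever HTTP client is used for the actual call.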
Data Collection Process
● Collecting Documents
○ Keyword list: 21 general terms from business, politics, and trending news.
○ Hoaxy API article queries.
○ Articles collected for one month, from Sep 16, 2018 to Oct 16, 2018.
● Collecting Star Networks
○ Input: the collected article/document ids.
○ Hoaxy API diffusion network queries.
○ Star networks collected for each article.
Data Characterization
● Total documents: ~7k
● Total star networks: ~82k
● Maximum number of star networks in one document: 627
● Largest star network size (adoptions): ~9k (specified in the API)
● Total nodes: ~172k
● Total adoptions: ~522k
Tweet Adoption
● The longest adoption delay is 28 days.
● The majority of adoptions happen in less than an hour.
● A user may adopt tweets of another user in more than one document.
● A single user pair appears in adoptions across 591 documents.
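The delay statistics above can be computed along these lines. This is a sketch under an assumption: each adoption is represented as a pair (time of the source tweet, time of the adopting tweet), a layout chosen for illustration; the timestamps are toy data, not values from the Hoaxy dataset.

```python
from datetime import datetime, timedelta


def adoption_delays(adoptions):
    """Return the delay of each adoption as a timedelta.

    `adoptions` is an iterable of (source_time, adoption_time) pairs; this
    record layout is an assumption made for the sketch.
    """
    return [adopted - source for source, adopted in adoptions]


def share_within(delays, limit):
    """Fraction of delays strictly shorter than `limit` (0.0 for an empty list)."""
    if not delays:
        return 0.0
    return sum(1 for d in delays if d < limit) / len(delays)


# Toy data for illustration only.
t0 = datetime(2018, 9, 16, 12, 0)
toy_adoptions = [
    (t0, t0 + timedelta(minutes=5)),
    (t0, t0 + timedelta(minutes=40)),
    (t0, t0 + timedelta(days=2)),
]
delays = adoption_delays(toy_adoptions)
longest = max(delays)
within_hour = share_within(delays, timedelta(hours=1))
```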
Web Scraping
● Collected the content of each URL
● Packages used
○ Requests
○ NLTK
○ BeautifulSoup
● Data cleaning
○ Stop-word removal using NLTK's English stop-word list
○ Punctuation removal
○ Lemmatization using gensim's lemmatize, keeping only the nouns
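The first two cleaning steps can be sketched with the standard library alone. This is a simplified stand-in, not the team's pipeline: the tiny `STOP_WORDS` set replaces NLTK's English stop-word list, and the noun-only lemmatization step is left out.

```python
import string

# Tiny stand-in stop-word set; the slides use NLTK's English stop-word list.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def clean_text(text):
    """Lower-case the text, strip ASCII punctuation, and drop stop words."""
    table = str.maketrans("", "", string.punctuation)
    tokens = text.lower().translate(table).split()
    return [tok for tok in tokens if tok not in STOP_WORDS]


tokens = clean_text("The spread of a hoax, in Twitter!")
```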
Sample Output
Document Clustering
● Group similar documents into the same cluster
● Used 3 different approaches
○ LSI (Latent Semantic Indexing)
○ LDA (Latent Dirichlet Allocation)
○ HDP (Hierarchical Dirichlet Process)
LDA
● It is a “generative probabilistic model”
● It builds
○ a topics-per-document model
○ a words-per-topic model
● Both are modeled as Dirichlet distributions (continuous multivariate probability distributions)
● Transforms bag-of-words counts into a topic space
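For reference, the generative process behind the two models above is usually written as follows. The symbols (K topics, hyper-parameters alpha and beta, topic assignment z, word w) follow the standard textbook notation and do not appear in the slides.

```latex
\begin{align*}
\phi_k   &\sim \mathrm{Dirichlet}(\beta)            && \text{words-per-topic distribution, } k = 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha)           && \text{topics-per-document distribution of document } d \\
z_{d,n}  &\sim \mathrm{Categorical}(\theta_d)       && \text{topic of the $n$-th word in document } d \\
w_{d,n}  &\sim \mathrm{Categorical}(\phi_{z_{d,n}}) && \text{observed word}
\end{align*}
```

Inference reverses this process: given the observed words, it estimates each document's topic mixture, which is the topic-space representation used for clustering.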
LDA Visualization
LSI
● Principle: words that are used in the same contexts tend to have similar meanings.
● Matrix decomposition (SVD) of the term-document matrix.
○ Identifies patterns in the relationships between terms and concepts
● Extracts the conceptual content of a body of text
● Establishes associations between terms that occur in similar contexts.
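The decomposition can be written compactly as below. The notation is the standard one for a rank-k truncated SVD and is an addition, not taken from the slides: X is the m-by-n term-document matrix, k the number of retained concepts, and d_j the column of X for document j.

```latex
\begin{align*}
X &\approx X_k = U_k \Sigma_k V_k^{\top},
  && U_k \in \mathbb{R}^{m \times k},\;
     \Sigma_k \in \mathbb{R}^{k \times k} \text{ diagonal},\;
     V_k \in \mathbb{R}^{n \times k} \\
\hat{d}_j &= \Sigma_k^{-1} U_k^{\top} d_j
  && \text{document } j \text{ mapped into the $k$-dimensional concept space}
\end{align*}
```

Documents whose concept-space vectors are close, for example by cosine similarity, are candidates for the same cluster.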
LSI - Example
source : wikipedia
Coherence score
● Used to find the optimal number of topics
● Computed as the sum of pairwise scores of the top n words w1, ..., wn used to describe the topic.
● Röder, Both, Hinneburg: “Exploring the Space of Topic Coherence Measures”, WSDM’15
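The sum of pairwise scores can be sketched as follows. The slides do not say which pairwise score was used, so this sketch assumes a UMass-style document co-occurrence score, log((D(wi, wj) + 1) / D(wj)), where D counts the documents containing the given words; the corpus is toy data.

```python
import math
from itertools import combinations


def pairwise_coherence(top_words, documents):
    """Sum of pairwise scores over the top words of one topic.

    Assumed score: log((D(wi, wj) + 1) / D(wj)), with D the number of
    documents that contain all the given words.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    total = 0.0
    # combinations() yields each unordered pair once, higher-ranked word first.
    for w_j, w_i in combinations(top_words, 2):
        d_j = doc_freq(w_j)
        if d_j == 0:
            continue  # skip words that never occur in the corpus
        total += math.log((doc_freq(w_i, w_j) + 1) / d_j)
    return total


# Toy corpus for illustration only.
docs = [
    ["vaccine", "hoax", "claim"],
    ["vaccine", "hoax"],
    ["election", "fraud"],
    ["election", "fraud", "claim"],
]
coherent = pairwise_coherence(["vaccine", "hoax"], docs)
mixed = pairwise_coherence(["vaccine", "fraud"], docs)
```

Words that co-occur often score higher than words that never share a document, so a topic whose top words belong together gets the higher coherence.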
[Figure: coherence score]
HDP
● Nonparametric Bayesian approach to clustering grouped data
● Mixed-membership model for unsupervised analysis
● Infers the number of topics from the data
● Wang, Paisley, Blei: “Online Variational Inference for the Hierarchical Dirichlet Process”, JMLR (2011).
Example
HDP Visualization
Model Comparison
Features: Propagation Dynamics
- Given a pair of feature vectors representing the propagation dynamics of a URL in Twitter
- Columns q1..q4: stages of the star lifetime (quartiles of adoption delay)
- Values: # retweets in a given stage qx

Star Lifetime   q1   q2   q3   q4
Document X       2    5    8   12
Document Y       4    6   15   24
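One way to turn adoption delays into such a four-value feature vector is sketched below. The slides do not say how the stage boundaries were obtained, so this sketch assumes the three cut points between q1..q4 are supplied by the caller (for example, quartiles of adoption delay computed beforehand); the delays and cut points are toy values.

```python
from bisect import bisect_left


def stage_counts(delays, boundaries):
    """Count adoptions falling into each stage of the star lifetime.

    `delays` are adoption delays in any consistent unit (hours here);
    `boundaries` are the sorted upper cut points separating the stages.
    A delay equal to a cut point is counted in the earlier stage.
    """
    counts = [0] * (len(boundaries) + 1)
    for delay in delays:
        counts[bisect_left(boundaries, delay)] += 1
    return counts


# Toy delays (hours) and assumed cut points at 1 h, 6 h, and 24 h.
toy_delays = [0.5, 0.2, 3, 4, 5, 10, 30, 48, 1]
features = stage_counts(toy_delays, [1, 6, 24])
```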
Target: Documents in same cluster
- Given a pair of documents, whether they appeared in the same topic

Document X    2    5    8   12
Document Y    4    6   15   24

[Diagram: Topic Modeling assigns Document X and Document Y to topics (Topic 01, Topic 02, Topic 03)]

Pair row (features of X, features of Y, target last):
2 5 8 12 4 6 15 24 0
Multiple Targets: Documents in same cluster
- Given a pair of documents, whether they appeared in the same topic

Document X    2    5    8   12
Document Y    4    6   15   24

[Diagram: LDA and LSI each assign Document X and Document Y to topics (Topic 01, Topic 02, Topic 03)]

Pair row with LDA target (features of X, features of Y, target last):
2 5 8 12 4 6 15 24 0
Pair row with LSI target:
2 5 8 12 4 6 15 24 1
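The pair rows above can be built as in the following sketch. The feature values and the resulting targets (0 under LDA, 1 under LSI) come from the slides; the function name and the concrete topic labels assigned to each document are hypothetical, chosen only so that the targets match.

```python
from itertools import combinations


def pair_rows(features, topic_of):
    """Build one row per document pair: both feature vectors plus a 0/1 target.

    The target is 1 when the topic model put both documents in the same topic.
    """
    rows = []
    for doc_a, doc_b in combinations(sorted(features), 2):
        target = 1 if topic_of[doc_a] == topic_of[doc_b] else 0
        rows.append(features[doc_a] + features[doc_b] + [target])
    return rows


features = {"X": [2, 5, 8, 12], "Y": [4, 6, 15, 24]}
# Hypothetical topic assignments that reproduce the targets on the slide.
lda_rows = pair_rows(features, {"X": "Topic 01", "Y": "Topic 03"})
lsi_rows = pair_rows(features, {"X": "Topic 02", "Y": "Topic 02"})
```

Each topic model yields its own target column, which is why a separate classifier can be trained and tested per topic model.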
Classification: Random Forest
● Train/test per topic model
○ LDA
○ LSI
○ HDP
● Two classes (highly imbalanced):
○ Randomly draw balanced samples
● Hyper-parameters
○ # Decision trees: 1000
○ Split criterion: Gini impurity
○ Max depth: 4
○ Bootstrapped samples
○ Max features = sqrt(# features)
Topic Modeling   Train     Test
LDA              411,558   137,186
LSI              215,768    71,923
HDP              120,243    40,081
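Drawing balanced samples from two imbalanced classes can be sketched as below. The slides do not say whether the larger class was undersampled or the smaller one oversampled; this sketch assumes random undersampling and assumes the 0/1 target is the last element of each row.

```python
import random


def balanced_sample(rows, seed=0):
    """Undersample the larger class so both classes contribute equally many rows.

    The target is assumed to be the last element of each row (0 or 1).
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row in rows:
        by_class[row[-1]].append(row)
    n = min(len(by_class[0]), len(by_class[1]))
    sample = rng.sample(by_class[0], n) + rng.sample(by_class[1], n)
    rng.shuffle(sample)
    return sample


# Toy rows: ten negatives and three positives.
toy_rows = [[i, 0] for i in range(10)] + [[i, 1] for i in range(3)]
sample = balanced_sample(toy_rows)
```

If scikit-learn were used (the slides do not name the library), the listed hyper-parameters would correspond to `RandomForestClassifier(n_estimators=1000, criterion="gini", max_depth=4, bootstrap=True, max_features="sqrt")`.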
Performance: Random Forest
Topic Modeling   Target (same cluster or not)   Precision   Recall   F1-score   Train     Test
LDA              No                             0.53        0.03     0.06       411,558   137,186
                 Yes                            0.57        0.97     0.72
LSI              No                             0.58        0.21     0.30       215,768    71,923
                 Yes                            0.63        0.90     0.75
HDP              No                             0.64        0.80     0.71       120,243    40,081
                 Yes                            0.63        0.43     0.52
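The F1-score column is the harmonic mean of precision and recall, which the sketch below recomputes for two rows of the table above. Because the published precision and recall are already rounded to two decimals, recomputed values can differ from the table in the last digit for some rows.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs taken from the performance table.
lda_yes = f1_score(0.57, 0.97)
hdp_no = f1_score(0.64, 0.80)
```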
Discussion
● How useful are such predictions?
○ Assume a fake news article has been posted at URL Z:
■ Twitter users share/retweet Z.
■ Later, administrators decide to take down the web page at Z.
■ Twitter users still share/retweet the original tweet that contains Z.
■ Use the URL diffusion network
● to predict the topic/category of the dead article.
● Topic modeling captures
○ the latent semantic structure of documents
● Propagation dynamics captures
○ the latent cascade structure of content that originated on a different platform
Thank You.
Spoilers.

More Related Content

What's hot

Text Mining Using R
Text Mining Using RText Mining Using R
Text Mining Using RKnoldus Inc.
 
Medical Heritage Library (MHL) on ArchiveSpark
Medical Heritage Library (MHL) on ArchiveSparkMedical Heritage Library (MHL) on ArchiveSpark
Medical Heritage Library (MHL) on ArchiveSparkHelge Holzmann
 
SF Python Meetup: TextRank in Python
SF Python Meetup: TextRank in PythonSF Python Meetup: TextRank in Python
SF Python Meetup: TextRank in PythonPaco Nathan
 
Linking data without common identifiers
Linking data without common identifiersLinking data without common identifiers
Linking data without common identifiersLars Marius Garshol
 
Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spacesMounia Lalmas-Roelleke
 
Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Bhaskar Mitra
 
PyGotham NY 2017: Natural Language Processing from Scratch
PyGotham NY 2017: Natural Language Processing from ScratchPyGotham NY 2017: Natural Language Processing from Scratch
PyGotham NY 2017: Natural Language Processing from ScratchNoemi Derzsy
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown BagDataTactics
 
4.4 text mining
4.4 text mining4.4 text mining
4.4 text miningKrish_ver2
 
handle data with DHT and load balnce over P2P network
handle data with DHT and load balnce over P2P networkhandle data with DHT and load balnce over P2P network
handle data with DHT and load balnce over P2P networkHema Priya
 
Security, Privacy and Trust - Web Technologies (1019888BNR)
Security, Privacy and Trust - Web Technologies (1019888BNR)Security, Privacy and Trust - Web Technologies (1019888BNR)
Security, Privacy and Trust - Web Technologies (1019888BNR)Beat Signer
 
Automatic Metadata Generation using Associative Networks
Automatic Metadata Generation using Associative NetworksAutomatic Metadata Generation using Associative Networks
Automatic Metadata Generation using Associative NetworksMarko Rodriguez
 
Using Page Size for Controlling Duplicate Query Results in Semantic Web
Using Page Size for Controlling Duplicate Query Results in Semantic WebUsing Page Size for Controlling Duplicate Query Results in Semantic Web
Using Page Size for Controlling Duplicate Query Results in Semantic WebIJwest
 
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast EfficientlyFull-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficientlyijsrd.com
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly CommunityMarko Rodriguez
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in ComputingMarko Rodriguez
 

What's hot (20)

Text Mining Using R
Text Mining Using RText Mining Using R
Text Mining Using R
 
Medical Heritage Library (MHL) on ArchiveSpark
Medical Heritage Library (MHL) on ArchiveSparkMedical Heritage Library (MHL) on ArchiveSpark
Medical Heritage Library (MHL) on ArchiveSpark
 
SF Python Meetup: TextRank in Python
SF Python Meetup: TextRank in PythonSF Python Meetup: TextRank in Python
SF Python Meetup: TextRank in Python
 
Linking data without common identifiers
Linking data without common identifiersLinking data without common identifiers
Linking data without common identifiers
 
Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spaces
 
Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)Dual Embedding Space Model (DESM)
Dual Embedding Space Model (DESM)
 
PyGotham NY 2017: Natural Language Processing from Scratch
PyGotham NY 2017: Natural Language Processing from ScratchPyGotham NY 2017: Natural Language Processing from Scratch
PyGotham NY 2017: Natural Language Processing from Scratch
 
Ch19
Ch19Ch19
Ch19
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown Bag
 
4.4 text mining
4.4 text mining4.4 text mining
4.4 text mining
 
search engine
search enginesearch engine
search engine
 
handle data with DHT and load balnce over P2P network
handle data with DHT and load balnce over P2P networkhandle data with DHT and load balnce over P2P network
handle data with DHT and load balnce over P2P network
 
Security, Privacy and Trust - Web Technologies (1019888BNR)
Security, Privacy and Trust - Web Technologies (1019888BNR)Security, Privacy and Trust - Web Technologies (1019888BNR)
Security, Privacy and Trust - Web Technologies (1019888BNR)
 
Automatic Metadata Generation using Associative Networks
Automatic Metadata Generation using Associative NetworksAutomatic Metadata Generation using Associative Networks
Automatic Metadata Generation using Associative Networks
 
Using Page Size for Controlling Duplicate Query Results in Semantic Web
Using Page Size for Controlling Duplicate Query Results in Semantic WebUsing Page Size for Controlling Duplicate Query Results in Semantic Web
Using Page Size for Controlling Duplicate Query Results in Semantic Web
 
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast EfficientlyFull-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficiently
 
Working with text data
Working with text dataWorking with text data
Working with text data
 
A Model of the Scholarly Community
A Model of the Scholarly CommunityA Model of the Scholarly Community
A Model of the Scholarly Community
 
Text categorization
Text categorizationText categorization
Text categorization
 
The Network Data Structure in Computing
The Network Data Structure in ComputingThe Network Data Structure in Computing
The Network Data Structure in Computing
 

Similar to Duplicate Detection on Hoaxy Dataset

The web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedThe web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedSören Auer
 
Hierarchical clustering in Python and beyond
Hierarchical clustering in Python and beyondHierarchical clustering in Python and beyond
Hierarchical clustering in Python and beyondFrank Kelly
 
Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021Gérard Dupont
 
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...Armin Haller
 
The nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesThe nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesTony Hammond
 
Generating domain specific sentiment lexicons using the Web Directory
Generating domain specific sentiment lexicons using the Web Directory Generating domain specific sentiment lexicons using the Web Directory
Generating domain specific sentiment lexicons using the Web Directory acijjournal
 
Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017
Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017
Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017Noemi Derzsy
 
Data Science Keys to Open Up OpenNASA Datasets
Data Science Keys to Open Up OpenNASA DatasetsData Science Keys to Open Up OpenNASA Datasets
Data Science Keys to Open Up OpenNASA DatasetsPyData
 
20140327 - Hashing Object Embedding
20140327 - Hashing Object Embedding20140327 - Hashing Object Embedding
20140327 - Hashing Object EmbeddingJacob Xu
 
bridging formal semantics and social semantics on the web
bridging formal semantics and social semantics on the webbridging formal semantics and social semantics on the web
bridging formal semantics and social semantics on the webFabien Gandon
 
Knowledge Graph Introduction
Knowledge Graph IntroductionKnowledge Graph Introduction
Knowledge Graph IntroductionSören Auer
 
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept AnalysisExtracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept AnalysisMathieu d'Aquin
 
IASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesIASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesDr.-Ing. Thomas Hartmann
 
Linked Open Data Utrecht University Library
Linked Open Data Utrecht University LibraryLinked Open Data Utrecht University Library
Linked Open Data Utrecht University LibraryRuben Schalk
 
Social Computing Research with Apache Spark
Social Computing Research with Apache SparkSocial Computing Research with Apache Spark
Social Computing Research with Apache SparkMatthew Rowe
 
Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021hala Skaf
 
TopicModels_BleiPaper_Summary.pptx
TopicModels_BleiPaper_Summary.pptxTopicModels_BleiPaper_Summary.pptx
TopicModels_BleiPaper_Summary.pptxKalpit Desai
 

Similar to Duplicate Detection on Hoaxy Dataset (20)

Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-Scaling the (evolving) web data –at low cost-
Scaling the (evolving) web data –at low cost-
 
The web of interlinked data and knowledge stripped
The web of interlinked data and knowledge strippedThe web of interlinked data and knowledge stripped
The web of interlinked data and knowledge stripped
 
Topic modelling
Topic modellingTopic modelling
Topic modelling
 
Hierarchical clustering in Python and beyond
Hierarchical clustering in Python and beyondHierarchical clustering in Python and beyond
Hierarchical clustering in Python and beyond
 
Tds — big science dec 2021
Tds — big science dec 2021Tds — big science dec 2021
Tds — big science dec 2021
 
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
What Are Links in Linked Open Data? A Characterization and Evaluation of Link...
 
The nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesThe nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologies
 
Generating domain specific sentiment lexicons using the Web Directory
Generating domain specific sentiment lexicons using the Web Directory Generating domain specific sentiment lexicons using the Web Directory
Generating domain specific sentiment lexicons using the Web Directory
 
Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017
Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017
Data Science Keys to Open Up OpenNASA Datasets - PyData New York 2017
 
Data Science Keys to Open Up OpenNASA Datasets
Data Science Keys to Open Up OpenNASA DatasetsData Science Keys to Open Up OpenNASA Datasets
Data Science Keys to Open Up OpenNASA Datasets
 
20140327 - Hashing Object Embedding
20140327 - Hashing Object Embedding20140327 - Hashing Object Embedding
20140327 - Hashing Object Embedding
 
bridging formal semantics and social semantics on the web
bridging formal semantics and social semantics on the webbridging formal semantics and social semantics on the web
bridging formal semantics and social semantics on the web
 
Knowledge Graph Introduction
Knowledge Graph IntroductionKnowledge Graph Introduction
Knowledge Graph Introduction
 
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept AnalysisExtracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
Extracting Relevant Questions to an RDF Dataset Using Formal Concept Analysis
 
IASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with TriplesIASSIST 2012 - DDI-RDF - Trouble with Triples
IASSIST 2012 - DDI-RDF - Trouble with Triples
 
Linked Open Data Utrecht University Library
Linked Open Data Utrecht University LibraryLinked Open Data Utrecht University Library
Linked Open Data Utrecht University Library
 
Social Computing Research with Apache Spark
Social Computing Research with Apache SparkSocial Computing Research with Apache Spark
Social Computing Research with Apache Spark
 
Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021
 
Publishing Linked Data using Schema.org
Publishing Linked Data using Schema.orgPublishing Linked Data using Schema.org
Publishing Linked Data using Schema.org
 
TopicModels_BleiPaper_Summary.pptx
TopicModels_BleiPaper_Summary.pptxTopicModels_BleiPaper_Summary.pptx
TopicModels_BleiPaper_Summary.pptx
 

More from Sameera Horawalavithana

Data-driven Studies on Social Networks: Privacy and Simulation
Data-driven Studies on Social Networks: Privacy and SimulationData-driven Studies on Social Networks: Privacy and Simulation
Data-driven Studies on Social Networks: Privacy and SimulationSameera Horawalavithana
 
Drivers of Polarized Discussions on Twitter during Venezuela Political Crisis
 Drivers of Polarized Discussions on Twitter during Venezuela Political Crisis Drivers of Polarized Discussions on Twitter during Venezuela Political Crisis
Drivers of Polarized Discussions on Twitter during Venezuela Political CrisisSameera Horawalavithana
 
Twitter Is the Megaphone of Cross-platform Messaging on the White Helmets
 Twitter Is the Megaphone of Cross-platform Messaging on the White Helmets Twitter Is the Megaphone of Cross-platform Messaging on the White Helmets
Twitter Is the Megaphone of Cross-platform Messaging on the White HelmetsSameera Horawalavithana
 
Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...
Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...
Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...Sameera Horawalavithana
 
Mentions of Security Vulnerabilities on Reddit, Twitter and GitHub
Mentions of Security Vulnerabilities on Reddit, Twitter and GitHubMentions of Security Vulnerabilities on Reddit, Twitter and GitHub
Mentions of Security Vulnerabilities on Reddit, Twitter and GitHubSameera Horawalavithana
 
[MLNS | NetSci] A Generative/ Discriminative Approach to De-construct Cascadi...
[MLNS | NetSci] A Generative/ Discriminative Approach to De-construct Cascadi...[MLNS | NetSci] A Generative/ Discriminative Approach to De-construct Cascadi...
[MLNS | NetSci] A Generative/ Discriminative Approach to De-construct Cascadi...Sameera Horawalavithana
 
[Compex Network 18] Diversity, Homophily, and the Risk of Node Re-identificat...
[Compex Network 18] Diversity, Homophily, and the Risk of Node Re-identificat...[Compex Network 18] Diversity, Homophily, and the Risk of Node Re-identificat...
[Compex Network 18] Diversity, Homophily, and the Risk of Node Re-identificat...Sameera Horawalavithana
 
[ARM 15 | ACM/IFIP/USENIX Middleware 2015] Research Paper Presentation
[ARM 15 | ACM/IFIP/USENIX Middleware 2015] Research Paper Presentation [ARM 15 | ACM/IFIP/USENIX Middleware 2015] Research Paper Presentation
[ARM 15 | ACM/IFIP/USENIX Middleware 2015] Research Paper Presentation Sameera Horawalavithana
 
Be Elastic: Leapset Innovation session 06-08-2015
Be Elastic: Leapset Innovation session 06-08-2015Be Elastic: Leapset Innovation session 06-08-2015
Be Elastic: Leapset Innovation session 06-08-2015Sameera Horawalavithana
 
[Undergraduate Thesis] Final Defense presentation on Cloud Publish/Subscribe ...
[Undergraduate Thesis] Final Defense presentation on Cloud Publish/Subscribe ...[Undergraduate Thesis] Final Defense presentation on Cloud Publish/Subscribe ...
[Undergraduate Thesis] Final Defense presentation on Cloud Publish/Subscribe ...Sameera Horawalavithana
 
[Undergraduate Thesis] Interim presentation on A Publish/Subscribe Model for ...
[Undergraduate Thesis] Interim presentation on A Publish/Subscribe Model for ...[Undergraduate Thesis] Interim presentation on A Publish/Subscribe Model for ...
[Undergraduate Thesis] Interim presentation on A Publish/Subscribe Model for ...Sameera Horawalavithana
 
Talk on Spotify: Large Scale, Low Latency, P2P Music-on-Demand Streaming
Talk on Spotify: Large Scale, Low Latency, P2P Music-on-Demand StreamingTalk on Spotify: Large Scale, Low Latency, P2P Music-on-Demand Streaming
Talk on Spotify: Large Scale, Low Latency, P2P Music-on-Demand StreamingSameera Horawalavithana
 

More from Sameera Horawalavithana (17)

Data-driven Studies on Social Networks: Privacy and Simulation
Data-driven Studies on Social Networks: Privacy and SimulationData-driven Studies on Social Networks: Privacy and Simulation
Data-driven Studies on Social Networks: Privacy and Simulation
 
Drivers of Polarized Discussions on Twitter during Venezuela Political Crisis
 Drivers of Polarized Discussions on Twitter during Venezuela Political Crisis Drivers of Polarized Discussions on Twitter during Venezuela Political Crisis
Drivers of Polarized Discussions on Twitter during Venezuela Political Crisis
 
Twitter Is the Megaphone of Cross-platform Messaging on the White Helmets
 Twitter Is the Megaphone of Cross-platform Messaging on the White Helmets Twitter Is the Megaphone of Cross-platform Messaging on the White Helmets
Twitter Is the Megaphone of Cross-platform Messaging on the White Helmets
 
Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...
Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...
Behind the Mask: Understanding the Structural Forces That Make Social Graphs ...
 
Mentions of Security Vulnerabilities on Reddit, Twitter and GitHub
Mentions of Security Vulnerabilities on Reddit, Twitter and GitHubMentions of Security Vulnerabilities on Reddit, Twitter and GitHub
Mentions of Security Vulnerabilities on Reddit, Twitter and GitHub
 
[MLNS | NetSci] A Generative/ Discriminative Approach to De-construct Cascadi...
[MLNS | NetSci] A Generative/ Discriminative Approach to De-construct Cascadi...[MLNS | NetSci] A Generative/ Discriminative Approach to De-construct Cascadi...
[MLNS | NetSci] A Generative/ Discriminative Approach to De-construct Cascadi...
 
[Compex Network 18] Diversity, Homophily, and the Risk of Node Re-identificat...
[Compex Network 18] Diversity, Homophily, and the Risk of Node Re-identificat...[Compex Network 18] Diversity, Homophily, and the Risk of Node Re-identificat...
[Compex Network 18] Diversity, Homophily, and the Risk of Node Re-identificat...
 
Dancing with Stream Processing
Dancing with Stream ProcessingDancing with Stream Processing
Dancing with Stream Processing
 
[ARM 15 | ACM/IFIP/USENIX Middleware 2015] Research Paper Presentation
[ARM 15 | ACM/IFIP/USENIX Middleware 2015] Research Paper Presentation [ARM 15 | ACM/IFIP/USENIX Middleware 2015] Research Paper Presentation
[ARM 15 | ACM/IFIP/USENIX Middleware 2015] Research Paper Presentation
 
Be Elastic: Leapset Innovation session 06-08-2015
Be Elastic: Leapset Innovation session 06-08-2015Be Elastic: Leapset Innovation session 06-08-2015
Be Elastic: Leapset Innovation session 06-08-2015
 
[Undergraduate Thesis] Final Defense presentation on Cloud Publish/Subscribe ...
[Undergraduate Thesis] Final Defense presentation on Cloud Publish/Subscribe ...[Undergraduate Thesis] Final Defense presentation on Cloud Publish/Subscribe ...
[Undergraduate Thesis] Final Defense presentation on Cloud Publish/Subscribe ...
 
[Undergraduate Thesis] Interim presentation on A Publish/Subscribe Model for ...
[Undergraduate Thesis] Interim presentation on A Publish/Subscribe Model for ...[Undergraduate Thesis] Interim presentation on A Publish/Subscribe Model for ...
[Undergraduate Thesis] Interim presentation on A Publish/Subscribe Model for ...
 
Locality sensitive hashing
Locality sensitive hashingLocality sensitive hashing
Locality sensitive hashing
 
Zipf distribution
Zipf distributionZipf distribution
Zipf distribution
 
Query personalization
Query personalizationQuery personalization
Query personalization
 
Dancing with publish/subscribe
Dancing with publish/subscribeDancing with publish/subscribe
Dancing with publish/subscribe
 
Talk on Spotify: Large Scale, Low Latency, P2P Music-on-Demand Streaming
Talk on Spotify: Large Scale, Low Latency, P2P Music-on-Demand StreamingTalk on Spotify: Large Scale, Low Latency, P2P Music-on-Demand Streaming
Talk on Spotify: Large Scale, Low Latency, P2P Music-on-Demand Streaming
 

Recently uploaded

Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Duplicate Detection on Hoaxy Dataset