www.insight-centre.org

An Ontology-based Technique for Online Profile Resolution
Keith Cortis, Simon Scerri, Ismael Rivera, Siegfried Handschuh

International Conference on Social Informatics, Kyoto, Japan
27th November 2013
Introduction (1)

• Instance Matching: deciding whether or not two instances (representations) refer to the same real-world entity, e.g. a person
• Research Challenge: discovery of multiple online profiles that refer to the same person identity on heterogeneous social networks
Introduction (2)

• Improved profile matching system extended with:
  – Named Entity Recognition
  – Linked Open Data
  – Semantic Matching
• Additional benefit: an ontology is used as a background schema
  – Advantage: a standard schema enables cross-network interoperability
Motivation

• Contact matcher applications:
  – Control the sharing of personal data
  – Detection of fully or partly anonymous contacts (> 83 million fake accounts)
  – New contact suggestions that are of direct interest to the user
Profile Resolution Technique

1. User Profile Data Extraction
2. Semantic Lifting (to NCO)
3. Named Entity Recognition (ANNIE IE system with a large KB gazetteer; e.g. name, surname, city, country)
4. Hybrid Matching Process
   a. Attribute Value Matching
   b. Semantic-based Matching Extension (e.g. city, country)
   c. Attribute Weighting Function
5. Online Profile Suggestions
6. Online Profile Merging
Semantic Lifting

• Lifts semi-structured and unstructured profile information from a remote schema
• Transforms this information into instances of the Contact Ontology (NCO)
• NCO covers identity-related online profile information
Attribute Value Matching

• Direct value comparison
• String matching: the best string matching metric is used for each attribute type
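
The idea of choosing a metric per attribute type can be sketched as follows. This is only an illustration: the slides do not name the metrics used, so `difflib.SequenceMatcher` from the Python standard library stands in for a string metric, and the attribute names and the `METRIC_BY_ATTRIBUTE` mapping are hypothetical.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> float:
    """Direct value comparison: 1.0 if the normalised values are equal, else 0.0."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def ratio_match(a: str, b: str) -> float:
    """String matching via difflib's ratio (a stand-in, not the authors' metric)."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


# Hypothetical mapping from attribute type to the metric assumed to suit it.
METRIC_BY_ATTRIBUTE = {
    "name": ratio_match,
    "city": ratio_match,
    "date_of_birth": exact_match,
}


def attribute_similarity(attribute: str, a: str, b: str) -> float:
    """Score two attribute values with the metric registered for that attribute."""
    metric = METRIC_BY_ATTRIBUTE.get(attribute, exact_match)
    return metric(a, b)


if __name__ == "__main__":
    print(attribute_similarity("name", "Justin Bieber", "J. Bieber"))
    print(attribute_similarity("date_of_birth", "286AL", "01/03/1994"))
```

Unknown attribute types fall back to direct comparison here, which is a design choice of this sketch rather than something stated in the slides.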
Semantic-based Matching

• Indirect semantic relations at the schema level
• Use case: location-related profile attributes
• The location sub-entities being semantically compared are city, region and country
• The semantic relations between the sub-entities in question are searched for in a bi-directional manner
• E.g. Galway (profile 1) vs. Ireland (profile 2):
  – Galway → Ireland: locatedWithin, country, isPartOf
  – Ireland → Galway: isLocationOf, containsLocation, capital, largestCity
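
A bi-directional relation check of this kind might look like the sketch below. The predicate names come from the slide; everything else is an assumption made for illustration: the triples are toy data held in memory rather than results of a Linked Open Data lookup, and the function names are hypothetical.

```python
# Predicate names as listed on the slide.
LOCATION_PREDICATES = {
    "locatedWithin", "country", "isPartOf",
    "isLocationOf", "containsLocation", "capital", "largestCity",
}

# Toy triples (subject, predicate, object) standing in for a knowledge base.
TOY_TRIPLES = {
    ("Galway", "country", "Ireland"),
    ("Ireland", "containsLocation", "Galway"),
    ("Kyoto", "country", "Japan"),
}


def semantic_relations(a, b, triples=TOY_TRIPLES):
    """Return location triples that link a and b, in either direction."""
    return sorted(
        (s, p, o)
        for (s, p, o) in triples
        if p in LOCATION_PREDICATES and {s, o} == {a, b}
    )


def semantically_related(a, b, triples=TOY_TRIPLES):
    """True if at least one location relation connects the two values."""
    return bool(semantic_relations(a, b, triples))


if __name__ == "__main__":
    print(semantic_relations("Galway", "Ireland"))
    print(semantically_related("Galway", "Japan"))
```

Because the lookup ignores which profile supplied which value, a city on one profile and a country on the other are treated the same way regardless of order.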
Attribute Weighting Function

• Approach 1: Direct Similarity Score
  – Name: "Justin Bieber" vs. "J. Bieber"; similarity value: 0.90
• Approach 2: Normalised Similarity Score, based on a threshold for each attribute type (attribute threshold for Name: 0.70)
  – Name: "Justin Bieber" vs. "J. Bieber"; metric similarity value: 0.90; similarity value: 1.0
  – Name: "Justin Bieber" vs. "Joffrey Baratheon"; metric similarity value: 0.4; similarity value: 0.0
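
The two scoring approaches can be expressed in a few lines. This is a sketch under one stated assumption: the slides do not say whether a metric value exactly equal to the threshold counts as a match, so `>=` is assumed here. The function names are hypothetical.

```python
NAME_THRESHOLD = 0.70  # attribute threshold for Name, as given on the slide


def direct_score(metric_value: float) -> float:
    """Approach 1: use the metric similarity value as it is."""
    return metric_value


def normalised_score(metric_value: float, threshold: float) -> float:
    """Approach 2: map the metric value to 1.0 or 0.0 using the attribute threshold.

    Assumption: a value equal to the threshold is treated as a match.
    """
    return 1.0 if metric_value >= threshold else 0.0


if __name__ == "__main__":
    # Metric values taken from the slide examples.
    print(direct_score(0.90))
    print(normalised_score(0.90, NAME_THRESHOLD))
    print(normalised_score(0.4, NAME_THRESHOLD))
```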
Online Profile Suggestions

Similarity threshold: 0.90

Name           Joffrey Baratheon   Joff Baratheon
City           King’s Landing      King’s Landing
Role           King                King
Date of Birth  286AL               286AL
Similarity score: 0.95

Name           Joffrey Baratheon   Joffrey Bieber
City           King’s Landing      London, Ontario
Role           King                Singer
Date of Birth  286AL               01/03/1994
Similarity score: 0.30
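
How per-attribute scores could be combined into one profile score and compared with the 0.90 threshold is sketched below. The slides do not give the weights or the aggregation formula, so a weighted average is assumed, and the weights and per-attribute scores are hypothetical values chosen for illustration. They are not meant to reproduce the 0.95 and 0.30 scores shown on the slide.

```python
SIMILARITY_THRESHOLD = 0.90  # from the slide

# Hypothetical attribute weights (the name is weighted more heavily here).
HYPOTHETICAL_WEIGHTS = {"name": 2.0, "city": 1.0, "role": 1.0, "date_of_birth": 1.0}


def profile_similarity(scores, weights):
    """Weighted average of per-attribute similarity scores (assumed aggregation)."""
    total_weight = sum(weights[attr] for attr in scores)
    if total_weight == 0:
        return 0.0
    return sum(scores[attr] * weights[attr] for attr in scores) / total_weight


def is_suggested(scores, weights, threshold=SIMILARITY_THRESHOLD):
    """Suggest the profile pair as a match when the score reaches the threshold."""
    return profile_similarity(scores, weights) >= threshold


if __name__ == "__main__":
    # Illustrative per-attribute scores for a similar and a dissimilar pair.
    similar_pair = {"name": 0.9, "city": 1.0, "role": 1.0, "date_of_birth": 1.0}
    different_pair = {"name": 0.5, "city": 0.0, "role": 0.0, "date_of_birth": 0.0}
    print(is_suggested(similar_pair, HYPOTHETICAL_WEIGHTS))
    print(is_suggested(different_pair, HYPOTHETICAL_WEIGHTS))
```

Only attributes present in `scores` contribute to the total weight, so a pair with few common attributes is not penalised for the missing ones in this sketch.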
Experiments & Evaluation

• Two-stage evaluation:
  1. Technique
     a) The best attribute similarity score approach
     b) Whether NER and the semantic-based matching extension improve the overall technique
     c) The computational performance of the hybrid technique against the syntactic-based one
     d) A similarity threshold that determines profile equivalence within a satisfactory degree of confidence
  2. Usability
     e) The level of precision of the profile matching
Technique Evaluation

• Two datasets:
  1. A controlled dataset of public profiles obtained from the Web (LinkedIn and Twitter)
     – 182 online profiles:
       112 refer to ambiguous real-world persons (common attributes)
       70 refer to 35 well-known sports journalists
     – Maximised false positives
  2. Private personal and contact-list profiles obtained from 5 consenting participants
Technique Evaluation – Experiment 1

• Which profile attribute similarity score approach fares best

[Figure: two plots, "Normalised Approach" and "Direct Approach", each showing Precision, Recall and F1-Measure (y-axis "Result", 0 to 1) against threshold values from 0.7 to 0.9]

• The Direct Approach outperforms the Normalised Approach
• 8631 online profile pair comparisons
Technique Evaluation – Experiment 2

• String-based technique vs. String + NER + Semantic-based technique

[Figure: two plots, "String Technique" and "Hybrid Technique", each showing Precision, Recall and F1-Measure (y-axis "Result", 0 to 1) against threshold values from 0.7 to 0.8]

• The new hybrid technique improves the results considerably over the string-only one
• The F-measure is more or less stable for thresholds of 0.75 and 0.8
Technique Evaluation – Experiment 3

• Computational performance of the hybrid technique vs. the syntactic-only one
• For this test we selected profile pairs:
  – Having a number of common attributes
  – With at least 1 attribute that is a candidate for semantic matching

[Figure: Time (ms, 0 to 40) against Number of Common Attributes (1 to 15) for the Syntactic and Hybrid techniques]

• On average the hybrid technique takes ≈15 ms more
Technique Evaluation – Experiment 4

• Find a deterministic similarity threshold with the highest degree of confidence

Threshold   0.80   0.82   0.84   0.86   0.88   0.90   0.92   0.94   0.96
Precision   0.290  0.317  0.550  0.694  0.806  0.876  0.940  0.947  0.988
Recall      0.805  0.784  0.654  0.600  0.584  0.573  0.508  0.486  0.454
F1-Measure  0.426  0.452  0.598  0.643  0.677  0.693  0.660  0.643  0.622
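
The F1-Measure row follows from the precision and recall rows via F1 = 2PR / (P + R). The sketch below recomputes it from the rounded values in the table and picks the threshold with the highest F1; since the inputs are already rounded, the third decimal can differ slightly from the slide for some thresholds. Variable names are illustrative.

```python
thresholds = [0.80, 0.82, 0.84, 0.86, 0.88, 0.90, 0.92, 0.94, 0.96]
precision = [0.290, 0.317, 0.550, 0.694, 0.806, 0.876, 0.940, 0.947, 0.988]
recall = [0.805, 0.784, 0.654, 0.600, 0.584, 0.573, 0.508, 0.486, 0.454]


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


f1_scores = [f1(p, r) for p, r in zip(precision, recall)]

# Index of the threshold with the highest F1 score.
best = max(range(len(thresholds)), key=f1_scores.__getitem__)

if __name__ == "__main__":
    for t, score in zip(thresholds, f1_scores):
        print(f"{t:.2f}  {score:.3f}")
    print("best threshold:", thresholds[best])
```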

 Optimal threshold is 0.9 -> F-measure of 0.693
Usability Evaluation (1)

• Quantitative & qualitative
• Performance of the profile matching technique
• The contact matcher was run against the two social networks on which the user is most active
• Social networks chosen: LinkedIn, Facebook and Twitter
• Number of participants: 16
• Person suggestion page
• Short survey about their user experience
Usability Evaluation (2)

• Usability evaluation results:
  – Distinct profiles: 8,415
  – Average profiles per social network per participant: 262
  – Comparisons: 1,041,279
  – Person matching suggestions: 1,195
  – Correct matches: 975
  – Incorrect matches: 220
  – Precision rate: 0.816
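
The precision rate is the share of suggestions that were correct. A short check using the counts reported above:

```python
correct_matches = 975
incorrect_matches = 220

# Every suggestion was judged either correct or incorrect.
suggestions = correct_matches + incorrect_matches

# Precision = correct suggestions / all suggestions.
precision_rate = correct_matches / suggestions

if __name__ == "__main__":
    print(suggestions, round(precision_rate, 3))
```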
Usability Evaluation (3)

• Statistics & results:
  Social network integration
  – 56.25%: LinkedIn and Facebook
  – 25%: LinkedIn and Twitter
  – 18.75%: Facebook and Twitter

  User satisfaction
  – 50%: Extremely
  – 43.8%: Quite a bit
  – 0%: Moderately
  – 6.3%: A little
  – 0%: Not at all
Usability Evaluation (4)

Application 1: Management & Sharing
Application 2: Enhanced Security
Application 3: Networking & Suggestions
Limitations

• A person’s gender is not provided by all social network APIs
  – Identify gender based on the first name or surname through NER
• The weights of some profile attributes, e.g. first name and surname, are too high
  – In some cases they impact the final result too strongly
  – More experiments will be conducted to fine-tune these weights
Future Work

• Consider identification of higher degrees of semantic relatedness
• Enrich the technique with other LOD cloud datasets
• Target additional social networks
Conclusion

• Profile matching algorithm with:
  – Semantic lifting
  – NER on semi-structured and unstructured profile information
  – Linked Open Data to improve the NER process
  – Semantic matching at the schema level to find any possible indirect semantic relations
  – Weighted profile attribute matching
• Quantitative & qualitative evaluation

Thank you for your attention
Related Work Comparison

• Existing profile matching approaches are based on:
  – The user’s friends
  – Specific inverse functional properties, e.g. email address
  – String matching of all profile attributes
  – Semantic relatedness between text, depending on remote knowledge bases, e.g. Wikipedia
• Evaluation of these approaches:
  – Technique evaluation on controlled datasets
  – No usability evaluation

Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 


ONTOLOGY-BASED PROFILE RESOLUTION

  • 1. An Ontology-based Technique for Online Profile Resolution. Keith Cortis, Simon Scerri, Ismael Rivera, Siegfried Handschuh. International Conference on Social Informatics, Kyoto, Japan, 27th November 2013. www.insight-centre.org
  • 2. Introduction (1)
      - Instance Matching: deciding whether two instances or representations refer to the same real-world entity, e.g., a person
      - Research Challenge: discovery of multiple online profiles that refer to the same person identity on heterogeneous social networks
  • 3. Introduction (2)
      - Improved profile matching system extended with: Named Entity Recognition, Linked Open Data, Semantic Matching
      - Additional Benefit: ontology used as a background schema
      - Advantage: a standard schema enables cross-network interoperability
  • 4. Motivation
      - Contact Matcher Applications:
      - Control sharing of personal data
      - Detection of fully or partly anonymous contacts (> 83 million fake accounts)
      - New contact suggestions that are of direct interest to the user
  • 5. Profile Resolution Technique [pipeline diagram]
      1. User Profile Data Extraction
      2. Semantic Lifting (NCO)
      3. Named Entity Recognition (ANNIE IE System, Large KB Gazetteer; entity labels: Name, Surname, City, Country)
      4. Hybrid Matching Process: (a) Attribute Value Matching, (b) Semantic-based Matching Extension, (c) Attribute Weighting Function
      5. Online Profile Suggestions
      6. Online Profile Merging
  • 6. Profile Resolution Technique [pipeline diagram repeated, steps 1 to 2 shown]
  • 7. Semantic Lifting
      - Lifting semi-structured and unstructured profile information from a remote schema
      - Transform information to instances of the Contact Ontology (NCO)
      - NCO: identity-related online profile information
  • 8. Profile Resolution Technique [pipeline diagram repeated, steps 1 to 4a shown]
  • 9. Attribute Value Matching
      - Direct Value Comparison
      - String Matching: the best string matching metric for each attribute type
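Slide 9 does not say which string metric was selected for which attribute type. The sketch below only illustrates the idea of dispatching to a per-attribute metric. The mapping `METRIC_BY_ATTRIBUTE`, the attribute names, and the use of `difflib.SequenceMatcher` as the approximate metric are all assumptions made for illustration, not the authors' choices.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> float:
    """Direct value comparison: 1.0 if the normalised values are equal, else 0.0."""
    return 1.0 if a.strip().lower() == b.strip().lower() else 0.0


def ratio_match(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1] (a stand-in metric, not the authors')."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


# Hypothetical assignment of one metric to each attribute type.
METRIC_BY_ATTRIBUTE = {
    "name": ratio_match,
    "city": ratio_match,
    "date_of_birth": exact_match,
}


def attribute_similarity(attribute: str, a: str, b: str) -> float:
    """Compare two values of the same attribute with that attribute's metric.

    Attributes without a configured metric fall back to direct comparison.
    """
    metric = METRIC_BY_ATTRIBUTE.get(attribute, exact_match)
    return metric(a, b)
```

Calling `attribute_similarity("name", "Joffrey Baratheon", "Joff Baratheon")` yields a value strictly between 0 and 1, while a date attribute is either an exact match or not.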
  • 10. Profile Resolution Technique [pipeline diagram repeated, steps 1 to 4b shown]
  • 11. Semantic-based Matching
      - Indirect semantic relations at a schema level
      - Use-case: location-related profile attributes
      - Location sub-entities being semantically compared are: city, region and country
      - Find the semantic relations between the sub-entities in question in a bi-directional manner
      - E.g. Galway (profile 1) vs. Ireland (profile 2)
      [Diagram: Galway locatedWithin Ireland; relation labels shown between the two entities: country, isPartOf, isLocationOf, containsLocation, capital, largestCity]
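The bi-directional lookup on slide 11 can be illustrated with a toy in-memory triple set. This is a sketch only: the two triples are made-up stand-ins for a Linked Open Data source, the function names are hypothetical, and the relation names are simply the labels that appear in the slide's diagram.

```python
from __future__ import annotations

# Toy triples standing in for a Linked Open Data source (illustrative data only).
TRIPLES = {
    ("Galway", "locatedWithin", "Ireland"),
    ("Ireland", "containsLocation", "Galway"),
}

# Relation labels that appear in the slide's diagram.
LOCATION_RELATIONS = {
    "locatedWithin", "country", "isPartOf",
    "isLocationOf", "containsLocation", "capital", "largestCity",
}


def relations_between(a: str, b: str) -> set[tuple[str, str, str]]:
    """Return location triples linking a and b, searched in both directions."""
    return {
        (s, p, o)
        for (s, p, o) in TRIPLES
        if p in LOCATION_RELATIONS and {s, o} == {a, b}
    }


def semantically_related(a: str, b: str) -> bool:
    """True if at least one location relation connects the two values."""
    return bool(relations_between(a, b))
```

With this data, comparing "Galway" from one profile with "Ireland" from another reports a relation regardless of argument order, which a plain string comparison would miss.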
  • 12. Profile Resolution Technique [pipeline diagram repeated, steps 1 to 4c shown]
  • 13. Attribute Weighting Function
      - Approach 1: Direct Similarity Score
          Name: Justin Bieber vs. J. Bieber; similarity value: 0.90
      - Approach 2: Normalised Similarity Score based on a threshold for each attribute type
          Attribute threshold for Name: 0.70
          Name: Justin Bieber vs. J. Bieber; metric similarity value: 0.90; similarity value: 1.0
          Name: Justin Bieber vs. Joffrey Baratheon; metric similarity value: 0.4; similarity value: 0.0
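The two scoring approaches on slide 13 can be written as two small functions. This is a minimal sketch: the function names are hypothetical, and treating a value exactly equal to the threshold as a match is an assumption, since the slide only shows values clearly above and below it.

```python
NAME_THRESHOLD = 0.70  # per-attribute threshold given on the slide for names


def direct_score(similarity: float) -> float:
    """Approach 1: use the metric's similarity value unchanged."""
    return similarity


def normalised_score(similarity: float, threshold: float) -> float:
    """Approach 2: snap the similarity to 1.0 or 0.0 around the threshold.

    Counting a value equal to the threshold as a match is an assumption.
    """
    return 1.0 if similarity >= threshold else 0.0
```

For the slide's examples, `normalised_score(0.90, NAME_THRESHOLD)` gives 1.0 and `normalised_score(0.4, NAME_THRESHOLD)` gives 0.0, whereas the direct approach keeps 0.90 and 0.4.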
  • 14. Profile Resolution Technique [pipeline diagram repeated, steps 1 to 5 shown]
  • 15. Online Profile Suggestions (similarity threshold: 0.90)
      Example 1          Profile A            Profile B
      Name               Joffrey Baratheon    Joff Baratheon
      City               King’s Landing       King’s Landing
      Role               King                 King
      Date of Birth      286AL                286AL
      Similarity Score   0.95
      Example 2          Profile A            Profile B
      Name               Joffrey Baratheon    Joffrey Bieber
      City               King’s Landing       London, Ontario
      Role               King                 Singer
      Date of Birth      286AL                01/03/1994
      Similarity Score   0.30
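One way to turn per-attribute scores into the single profile score that slide 15 compares against the 0.90 threshold is a weighted average. The sketch below assumes that form. The weights in `WEIGHTS` and the function names are hypothetical, so the sketch is not expected to reproduce the 0.95 and 0.30 scores shown on the slide.

```python
from __future__ import annotations

SIMILARITY_THRESHOLD = 0.90  # threshold shown on the slide

# Hypothetical weights; the slides do not list the actual values.
WEIGHTS = {"name": 0.4, "city": 0.2, "role": 0.2, "date_of_birth": 0.2}


def profile_similarity(attribute_scores: dict[str, float]) -> float:
    """Weighted average over the attributes that were actually compared."""
    common = [a for a in attribute_scores if a in WEIGHTS]
    total_weight = sum(WEIGHTS[a] for a in common)
    if total_weight == 0:
        return 0.0
    return sum(WEIGHTS[a] * attribute_scores[a] for a in common) / total_weight


def is_suggestion(attribute_scores: dict[str, float]) -> bool:
    """Suggest the pair as the same person when the score reaches the threshold."""
    return profile_similarity(attribute_scores) >= SIMILARITY_THRESHOLD
```

Dividing by the total weight of the compared attributes keeps the score in [0, 1] even when two profiles share only a subset of attributes.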
  • 17. Profile Resolution Technique [pipeline diagram repeated, steps 1 to 6 shown]
  • 18. Experiments & Evaluation
      - Two-staged evaluation:
      1. Technique
          a) Which attribute similarity score approach performs best
          b) Whether NER and the semantic-based matching extension improve the overall technique
          c) The computational performance of the hybrid technique against the syntactic-based one
          d) A similarity threshold that determines profile equivalence within a satisfactory degree of confidence
      2. Usability
          e) Level of precision for the profile matching
  • 19. Technique Evaluation
      - Two datasets:
      1. A controlled dataset of public profiles obtained from the Web (LinkedIn and Twitter)
          - 182 online profiles: 112 of ambiguous real-world persons (common attributes); 70 that refer to 35 well-known sports journalists
          - Maximised false positives
      2. Private personal and contact-list profiles obtained from 5 consenting participants
  • 20. Technique Evaluation – Experiment 1
      - Profile attribute similarity score that fares best
      [Charts: Normalised Approach and Direct Approach; precision, recall and F1-measure plotted against threshold values 0.7 to 0.9]
      - Direct Approach outperforms Normalised Approach
      - 8631 online profile pair comparisons
  • 21. Technique Evaluation – Experiment 2
      - String-based technique vs. String + NER + Semantic-based technique
      [Charts: String Technique and Hybrid Technique; precision, recall and F1-measure plotted against threshold values 0.7, 0.75 and 0.8]
      - New hybrid technique improves the results considerably over the string-only one
      - F-measure is more or less stable for thresholds of 0.75 and 0.8
  • 22. Technique Evaluation – Experiment 3
      - Computational performance of hybrid technique vs. syntactic-only one
      - For this test we selected profile pairs having a number of common attributes and at least 1 attribute candidate for semantic matching
      [Chart: time (ms) plotted against number of common attributes (1 to 15) for the Syntactic and Hybrid techniques]
      - On average the hybrid technique takes ≈15 ms more
  • 23. Technique Evaluation – Experiment 4
      - Find a deterministic similarity threshold with the highest degree of confidence
      Threshold   Precision   Recall   F1-Measure
      0.80        0.290       0.805    0.426
      0.82        0.317       0.784    0.452
      0.84        0.550       0.654    0.598
      0.86        0.694       0.600    0.643
      0.88        0.806       0.584    0.677
      0.90        0.876       0.573    0.693
      0.92        0.940       0.508    0.660
      0.94        0.947       0.486    0.643
      0.96        0.988       0.454    0.622
      - Optimal threshold is 0.9, with an F-measure of 0.693
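The threshold choice on slide 23 can be checked from the reported precision and recall values using the standard F1 formula, F1 = 2PR / (P + R). The sketch below recomputes F1 for each threshold and picks the maximum. The variable and function names are illustrative, and the inputs are the rounded figures from the slide, so recomputed values may differ from the reported ones in the third decimal place.

```python
# (threshold, precision, recall) as reported on the slide.
RESULTS = [
    (0.80, 0.290, 0.805),
    (0.82, 0.317, 0.784),
    (0.84, 0.550, 0.654),
    (0.86, 0.694, 0.600),
    (0.88, 0.806, 0.584),
    (0.90, 0.876, 0.573),
    (0.92, 0.940, 0.508),
    (0.94, 0.947, 0.486),
    (0.96, 0.988, 0.454),
]


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Pick the threshold whose recomputed F1 is highest.
best_threshold, best_f1 = max(
    ((t, f1(p, r)) for t, p, r in RESULTS),
    key=lambda pair: pair[1],
)
```

The maximum falls at the 0.90 threshold, matching the slide's conclusion: precision keeps rising beyond that point, but recall drops fast enough to pull F1 down.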
  • 24. Usability Evaluation (1)
      - Quantitative & Qualitative
      - Performance of profile matching technique
      - Contact matcher run against the two social networks on which the user is most active
      - Social networks chosen: [not captured in the transcript]
      - Number of participants: 16
      - Person suggestion page
      - Short survey about their user experience
  • 25. Usability Evaluation (2)
      - Usability Evaluation Results:
          Distinct profiles: 8,415
          Average profiles per social network per participant: 262
          Comparisons: 1,041,279
          Person matching suggestions: 1,195
          Correct matches: 975
          Incorrect matches: 220
          Precision rate: 0.816
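The precision rate on slide 25 follows directly from the match counts: correct matches divided by all suggestions made. A short check using the slide's figures (the variable names are illustrative):

```python
correct_matches = 975
incorrect_matches = 220
suggestions = correct_matches + incorrect_matches  # 1,195, as on the slide

# Precision: the share of suggestions that were confirmed as correct.
precision = correct_matches / suggestions
print(f"{precision:.3f}")  # prints 0.816
```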
  • 26. Usability Evaluation (3)
      - Statistics & Results:
      - Social Network Integration: LinkedIn and Facebook 56.25%; LinkedIn and Twitter 25%; Facebook and Twitter 18.75%
      - User Satisfaction: Extremely 50%; Quite a bit 43.8%; Moderately 0%; A little 6.3%; Not at all 0%
  • 27. Usability Evaluation (4)
      - Application 1: Management & Sharing
      - Application 2: Enhanced Security
      - Application 3: Networking & Suggestions
  • 28. Limitations
      - A person’s gender is not provided by all social network APIs
          Identify gender based on first name or surname through NER
      - Weights of some profile attributes, e.g., first name and surname, are too high
          In some cases they impact the final result too strongly
          More experiments will be conducted to fine-tune these weights
  • 29. Future Work
      - Consider identification of higher degrees of semantic relatedness
      - Enrich technique with other LOD cloud datasets
      - Target additional social networks
  • 30. Conclusion
      - Profile matching algorithm with:
          Semantic Lifting
          NER on semi-structured and unstructured profile information
          Linked Open Data to improve the NER process
          Semantic matching at the schema level to find any possible indirect semantic relations
          Weighted Profile Attribute Matching
      - Quantitative & Qualitative Evaluation
      Thank you for your attention
  • 31. Related Work Comparison
      - Existing Profile Matching Approaches based on:
          User’s friends
          Specific Inverse Functional Properties, e.g., email address
          String matching of all profile attributes
          Semantic relatedness between text, depending on remote Knowledge Bases, e.g., Wikipedia
      - Evaluation of these Approaches:
          Technique Evaluation on controlled datasets
          No Usability Evaluation