Collabrate com2012 rashed

Deutschen Akademischen Austauschdienstes
8th IEEE International Conference on Collaborative Computing:
Networking, Applications and Worksharing
October 14–17, 2012, Pittsburgh, Pennsylvania, United States

Robust Expert Ranking in Online
Communities - Fighting Sybil Attacks
CollaborateCom2012

Khaled A. N. Rashed, Cristina Balasoiu, Ralf Klamma
RWTH Aachen University
Advanced Community Information Systems (ACIS)
{rashed|balsoiu|klamma}@dbis.rwth-aachen.de

Lehrstuhl Informatik 5
(Information Systems)
Prof. Dr. M. Jarke
I5-DR-0312-1

Advanced Community Information Systems (ACIS)

[Figure: Responsive Open Community Information Systems, surrounded by Web Engineering, Web Analytics, Visualization and Simulation, Community Support, Community Analytics, and Requirements Engineering]

Agenda
• Introduction and motivation
• Related work
• Our approach
  – Expert ranking algorithm
  – Robustness of the expert ranking algorithm
• Evaluation
• Conclusions and outlook

Introduction

• Expert search and ranking refers to finding a group of authoritative users with special skills and knowledge for a specific category.
• The task is very important in online collaborative systems.
• Problems: openness and misbehaviour
  – No attention has been paid to the trust and reputation of experts
• Solution: leveraging trust

Motivation Examples

Manipulating the truth for war propaganda
• Published as: British soldiers abusing prisoners in Iraq
• Proved to be fake by Brigadier Geoff Sheldon, who said the vehicle featured in the photo had never been to Iraq

Tidal bores presented as Indian Ocean Tsunami
• Published as: 2004 Indian Ocean Tsunami
• Proved to show tidal bores at a four-day-long government-sponsored tourist festival in China

• Expert knowledge, analysis and witnesses are needed to identify the fake!

A Case Study: Collaborative Fake Multimedia Detection System

• Collaborative activities (rating, tagging and commenting)
  – Provide new means of search, retrieval and media authenticity evaluation
  – Explicit ratings and tags are used for evaluating the authenticity of multimedia items
  – Reliability: not all of the submitted ratings are reliable
  – No centralized control mechanism
  – Vulnerability to attacks
• Three types of users
  – Honest users
  – Experts
  – Malicious users

Research Questions and Goals

• Research questions
  – How to measure users' expertise in collaborative media sharing and evaluating systems, and how to rank them?
  – What is the implication of trust?
  – Robustness: how to ensure the robustness of the ranking algorithm?
• Goals
  – Improve multimedia evaluation
  – Reduce the impact of malicious users

Related Work

• Probabilistic models, e.g. [Tu et al. 2010]
• Voting models [Macdonald and Ounis 2006] [Macdonald et al. 2008]
• Link-based approaches: PageRank [Brin and Page 1998], HITS [Kleinberg 1999] and their variations; SPEAR algorithm [Noll et al. 2009]; ExpertRank [Jiao et al. 2009]
• TREC enterprise track: find the associations between candidates and documents, e.g. [Balog 2006, Balog 2007]
• Machine learning algorithms, e.g. [Bian and Liu 2008, Li et al. 2009]

Our Approach

• Assumptions
  – Expert users tend to have many authenticity ratings
  – Correctly evaluated media are rated by users of high expertise
  – Following expert users provides more benefits
• Expert definition
  – Rates a large number of media files authentically with respect to a topic and is highly trusted by his directly connected users
  – Should be trustworthy in evaluating multimedia

Expert Ranking Methods

• Domain knowledge driven method
  – Considers tags that users assign to media files
  – User profile: merges the tags the user submitted to the media files in the system
  – Similarity coefficient between the candidate profile and the tags assigned to a specific resource
  – Used to reorder users who voted on a media file according to the tag profile
• Domain knowledge independent method
  – Uses the connections between users and resources to decide on the expertise of the users
  – A modified version of the HITS algorithm
  – Mutual reinforcement of users' expertise and media
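The domain knowledge driven method above can be illustrated with a small sketch. The slides do not say which similarity coefficient is used, so the Jaccard coefficient over tag sets is an assumption here, and the names `build_profile`, `jaccard` and `rank_voters` are hypothetical helpers, not part of the authors' system.

```python
def build_profile(tags_by_media):
    """Merge all tags a user submitted (one tag list per media file) into one set."""
    profile = set()
    for tags in tags_by_media.values():
        profile.update(tag.lower() for tag in tags)
    return profile


def jaccard(a, b):
    """Jaccard coefficient of two sets (assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_voters(voters, resource_tags):
    """Order the users who voted on a resource by profile/resource tag similarity.

    voters: {user: {media_id: [tags]}}; resource_tags: tags of the resource.
    Ties are broken alphabetically by user name.
    """
    resource = {tag.lower() for tag in resource_tags}
    scored = [(jaccard(build_profile(tags), resource), user)
              for user, tags in voters.items()]
    return [user for score, user in sorted(scored, key=lambda p: (-p[0], p[1]))]


voters = {
    "alice": {"m1": ["war", "iraq"], "m2": ["photo"]},
    "bob": {"m3": ["cats"]},
}
print(rank_voters(voters, ["iraq", "photo", "fake"]))
```

Any other set or vector similarity (for example cosine similarity over tag counts) could be swapped in without changing the reordering step.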

MHITS: Expert Ranking Algorithm

• MHITS: expert ranking algorithm in online collaborative systems
  – Link-based approach, based on the HITS algorithm
  – HITS
    – Authorities: pages that are pointed to by good pages
    – Hubs: pages that point to good pages
    – Reinforcement between hubs and authorities
  – MHITS
    – Users act as hubs (correctly evaluated media rated by them)
    – Media files act as authorities
    – Mutual reinforcement between users and media files
    – Local trust values between users are assigned
    – Considers the ratings of the users

MHITS: Expert Ranking Algorithm

\[
a(m) = \sum_{u \in U(m)} h(u)\, r(u)
\]
\[
h(u) = \beta \sum_{m \in M(u)} a(m)\, r(u) + (1 - \beta)\, t(u)
\]

Symbol   Description
a(m)     Authority score
U(m)     Set of users pointing to media file m
h(u)     Hubness score
r(u)     Rating of user u for media file m
t(u)     Average trust of the directly connected users to user u
M(u)     Set of media files to which user u points
β        Coefficient that weights the influence of the two terms, in range [0, 1]

• One network for users and ratings
• One for users only (trust network)
• Trust: in range [0, 1]
• Ratings: 0.5 for a fake vote, 1 for an authentic vote
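The two update rules can be sketched as a simple fixed-point iteration. This is a minimal illustration, not the authors' Java implementation: the slides do not mention normalization or a stopping rule, so dividing by the maximum score after each step and running a fixed number of iterations are assumptions, as are the function names and the input layout. In the code, r(u) is written `r` and depends on the (user, media) pair, as in the symbol table.

```python
def _normalize(scores):
    """Scale scores so the largest equals 1 (assumed normalization, not from the slides)."""
    top = max(scores.values(), default=0.0)
    if top > 0:
        for key in scores:
            scores[key] /= top


def mhits(ratings, trust, beta=0.5, iterations=50):
    """Iterate the MHITS updates.

    ratings: {(user, media): r} with r = 0.5 (fake vote) or 1 (authentic vote)
    trust:   {user: t(u)}, average trust of u's direct connections, in [0, 1]
    Returns (hub, auth) score dictionaries.
    """
    users = sorted({u for u, _ in ratings})
    media = sorted({m for _, m in ratings})
    hub = {u: 1.0 for u in users}
    auth = {m: 1.0 for m in media}
    for _ in range(iterations):
        # a(m) = sum over users u in U(m) of h(u) * r(u)
        auth = {m: 0.0 for m in media}
        for (u, m), r in ratings.items():
            auth[m] += hub[u] * r
        _normalize(auth)
        # h(u) = beta * sum over media m in M(u) of a(m) * r(u) + (1 - beta) * t(u)
        link = {u: 0.0 for u in users}
        for (u, m), r in ratings.items():
            link[u] += auth[m] * r
        _normalize(link)  # keeps the link term on the same [0, 1] scale as trust
        hub = {u: beta * link[u] + (1 - beta) * trust.get(u, 0.0) for u in users}
    return hub, auth


ratings = {("u1", "m1"): 1.0, ("u1", "m2"): 1.0, ("u2", "m1"): 0.5}
trust = {"u1": 0.9, "u2": 0.1}
hub, auth = mhits(ratings, trust, beta=0.5)
print(sorted(hub, key=hub.get, reverse=True))
```

With β = 1 the trust term vanishes and the iteration reduces to a rating-weighted HITS; with β = 0 the hub score is just the average trust t(u).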

Robustness of the MHITS Algorithm

• Compromising techniques
  – Sybil attack [Douc02], reputation theft, whitewashing attack, etc.
  – Compromising the input and the output of the algorithm
• Sybil attack
  – Fundamental problem in online collaborative systems
  – A malicious user creates many fake accounts (Sybils) which all reference the user to boost his reputation (the attacker's goal is to be higher up in the rankings)
• Countermeasures against Sybil attack

                                  SybilGuard [YKGF06]   SybilLimit [YGKX08]   SumUp [TMLS09]
Protocol type                     Decentralized         Decentralized         Centralized
Accepted Sybils per attack edge   [values not preserved]

SumUp

• Centralized approach
  – Aims to aggregate votes in a Sybil-resilient manner
• Key idea: an adaptive vote flow technique that appropriately assigns and adjusts link capacities in the trust graph to collect the votes for an object
• New: we integrate SumUp with the MHITS Java implementation, using our own data structure based on Java sparse arrays

SumUp steps
(1) Assign the source node and number of votes per media file
(2) Levels assignment
(3) Pruning step
(4) Capacity assignment
(5) Max-flow computation: collect votes on each resource
(6) Leverage user history to penalize adversarial nodes
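Step (5) is the part that bounds Sybil influence, and it can be illustrated in isolation. The sketch below is a simplified stand-in, not SumUp itself: link capacities are supplied by the caller instead of being derived by level assignment, pruning and capacity assignment, and the history-based penalty is omitted. The helper names and the `"__votes__"` super-source are hypothetical; the max-flow routine is a plain Edmonds-Karp implementation.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp max-flow on a {u: {v: capacity}} graph; returns the flow value."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, neighbours in capacity.items():
        for v, c in neighbours.items():
            residual[u][v] += c
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def collect_votes(trust_capacity, voters, collector):
    """Count votes that can reach the collector; each voter contributes one vote."""
    graph = {u: dict(neighbours) for u, neighbours in trust_capacity.items()}
    graph["__votes__"] = {v: 1 for v in voters}  # hypothetical super-source
    return max_flow(graph, "__votes__", collector)


# Two honest users (h1, h2) and five Sybils behind a single attack edge a -> h1.
trust_capacity = {
    "h1": {"c": 2}, "h2": {"c": 1}, "a": {"h1": 1},
    "s1": {"a": 1}, "s2": {"a": 1}, "s3": {"a": 1}, "s4": {"a": 1}, "s5": {"a": 1},
}
voters = ["h1", "h2", "s1", "s2", "s3", "s4", "s5"]
print(collect_votes(trust_capacity, voters, "c"))
```

In this toy graph the five Sybil votes share one attack edge of capacity 1, so at most one of them can be collected regardless of how many Sybil accounts are created.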

Integration of SumUp with MHITS

[Figure: integration of SumUp with MHITS; diagram not preserved]

Evaluation

• Experimental setup
  – Barabási–Albert model for generating the network
  – 300 users
  – 20 media files (10 known to be fake and 10 known to be authentic)
  – 800 ratings
  – 3000 trust edges
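A synthetic setup of this size could be generated roughly as follows. This is a sketch under assumptions: the slides give 300 users and 3000 trust edges but not the attachment parameter, so m = 10 is a guess that yields 2900 edges (close to, but not exactly, the reported figure), and drawing ratings uniformly at random is a placeholder for the actual ratings distribution, which is not recoverable from the slides. The function names are hypothetical.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow an undirected graph by preferential attachment.

    Returns a set of edges (u, v) with u < v. Starts from m isolated nodes;
    every later node attaches to m distinct existing nodes.
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = set()
    targets = list(range(m))  # the first new node links to the m seed nodes
    repeated = []             # each node appears once per incident edge
    for new in range(m, n):
        for t in targets:
            edges.add((t, new))  # t < new always holds
        repeated.extend(targets)
        repeated.extend([new] * m)
        # Pick m distinct targets with probability proportional to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = sorted(chosen)
    return edges


def sample_ratings(n_users, n_media, n_ratings, seed=None):
    """Draw distinct (user, media) pairs and rate each 0.5 (fake) or 1.0 (authentic)."""
    rng = random.Random(seed)
    pairs = [(u, m) for u in range(n_users) for m in range(n_media)]
    return {pair: rng.choice([0.5, 1.0]) for pair in rng.sample(pairs, n_ratings)}


trust_edges = barabasi_albert(300, 10, seed=1)
ratings = sample_ratings(300, 20, 800, seed=1)
print(len(trust_edges), len(ratings))
```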

Ratings Distribution

[Figure: ratings distribution; plot not preserved]

Evaluation

• Evaluation metrics:
  – Precision@K
\[
\mathrm{Precision@K} = \frac{|\mathrm{TopK}' \cap \mathrm{TopK}|}{K}
\]
  – Spearman's rank correlation coefficient
\[
\rho_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
\]
    ρs: Spearman's coefficient of rank correlation, -1 ≤ ρs ≤ 1
    di: the difference between the rank of xi and the rank of yi
    n: the number of data points in the sample (total number of observations)
• ρs = +1: perfect positive correlation; ρs = -1: perfect negative correlation; in both cases a high degree of correlation between x and y
• ρs = 0: no correlation, i.e. a lack of association between the ranks of the two variables
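Both metrics are short enough to write out directly. The sketch below assumes rankings are given as ordered lists of the same items without ties; the function names are illustrative.

```python
def precision_at_k(ranking, reference, k):
    """Fraction of the top-k items of `ranking` that are also in the top-k of `reference`."""
    return len(set(ranking[:k]) & set(reference[:k])) / k


def spearman_rho(ranking_x, ranking_y):
    """Spearman's rank correlation for two orderings of the same items (no ties)."""
    n = len(ranking_x)
    if n < 2 or set(ranking_x) != set(ranking_y):
        raise ValueError("rankings must contain the same items, at least two of them")
    rank_y = {item: position for position, item in enumerate(ranking_y)}
    # d_i is the difference between an item's positions in the two rankings.
    sum_d2 = sum((position - rank_y[item]) ** 2
                 for position, item in enumerate(ranking_x))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


print(precision_at_k(["a", "b", "c", "d"], ["b", "a", "d", "c"], 2))
print(spearman_rho(["a", "b", "c"], ["c", "b", "a"]))
```

Because only rank differences enter the formula, it does not matter whether positions are counted from 0 or from 1.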

Experimental Results I

[Figure: ranking results without Sybils; plot not preserved]

• No Sybils
• Results are compared with the ranking of the users according to the number of fair ratings each of them had in the system

           HITS   MHITS
Spearman   0.87   0.93
(n = 15)


Experimental Results II

[Figure: ranking results under a Sybil attack; plot not preserved]

• 10% Sybils
• 4 attack edges

           HITS   MHITS   MHITS & SumUp
Spearman   0.52   0.68    0.93
(n = 20)


Experimental Results III

Precision@K

[Figure: two Precision@K plots, not preserved. Left panel: 10% Sybils (one group) and 8 attack edges. Right panel: 20% Sybils (one group) and 24 attack edges.]

Further evaluation

• 3% → 17%: the number of Sybil votes was increased with respect to the total number of fair votes
  – Expertise ranking does not change
• 9 → 14 → 24: the number of attack edges was increased, keeping the number of Sybil votes at 17% of the number of fair votes and a constant number of Sybils (50)
  – Precision does not change
• 17% → 50% → 100%: the number of Sybil votes was increased, keeping constant the number of attack edges (24) and the number of Sybils

      20%                     50%                     100%
K     MHITS   MHITS & SumUp   MHITS   MHITS & SumUp   MHITS   MHITS & SumUp
12    0.91    0.91            0.27    0.33            0.08    0.08
15    0.93    0.93            0.33    0.40            0.06    0.06

Conclusions and Future Work

• Conclusions
  – Proposed an expertise ranking algorithm for collaborative systems (fake multimedia detection systems)
  – Leveraged trust and showed the trust implications
  – Combined expert ranking with Sybil-resistant algorithms
• Future work
  – Applying the algorithm to real data and to different data sets
  – Temporal analysis (time series analysis)
  – Integrating the domain knowledge driven method
