8th IEEE International Conference on Collaborative Computing:
Networking, Applications and Worksharing (CollaborateCom2012)
October 14–17, 2012, Pittsburgh, Pennsylvania, United States

Robust Expert Ranking in Online Communities - Fighting Sybil Attacks

Khaled A. N. Rashed, Cristina Balasoiu, Ralf Klamma
RWTH Aachen University
Advanced Community Information Systems (ACIS)
{rashed|balsoiu|klamma}@dbis.rwth-aachen.de

Lehrstuhl Informatik 5 (Information Systems), Prof. Dr. M. Jarke
Deutschen Akademischen Austauschdienstes

Advanced Community Information Systems (ACIS)
[Figure: Responsive Open Community Information Systems at the centre, surrounded by Web Engineering, Web Analytics, Community Visualization and Simulation, Community Support, Community Analytics, and Requirements Engineering]

Agenda
• Introduction and motivation
• Related work
• Our approach
  – Expert ranking algorithm
  – Robustness of the expert ranking algorithm
• Evaluation
• Conclusions and outlook

Introduction
• Expert search and ranking refers to finding a group of authoritative users with special skills and knowledge in a specific category.
• The task is very important in online collaborative systems.
• Problems: openness and misbehaviour
  – No attention has been paid to the trust and reputation of experts
• Solution: leveraging trust

Motivation Examples

Manipulating the truth for war propaganda
• Published as: British soldiers abusing prisoners in Iraq
• Proved to be fake by Brigadier Geoff Sheldon, who said the vehicle featured in the photo had never been to Iraq

Tidal bores presented as Indian Ocean Tsunami
• Published as: 2004 Indian Ocean Tsunami
• Proved to show tidal bores during a four-day-long government-sponsored tourist festival in China

Expert knowledge, analysis and witnesses are needed to identify the fake!

A Case Study: Collaborative Fake Multimedia Detection System
• Collaborative activities (rating, tagging and commenting)
  – Provide new means of search, retrieval and media authenticity evaluation
  – Explicit ratings and tags are used for evaluating the authenticity of multimedia items
  – Reliability: not all of the submitted ratings are reliable
  – No centralized control mechanism
  – Vulnerability to attacks
• Three types of users
  – Honest users
  – Experts
  – Malicious users

Research Questions and Goals
• Research questions
  – How to measure users' expertise in collaborative media sharing and evaluation systems, and how to rank the users?
  – What are the implications of trust?
  – Robustness: how to ensure the robustness of the ranking algorithm?
• Goals
  – Improve multimedia evaluation
  – Reduce the impact of malicious users

Related Work
• Probabilistic models, e.g. [Tu et al. 2010]
• Voting models [Macdonald and Ounis 2006], [Macdonald et al. 2008]
• Link-based approaches: PageRank [Brin and Page 1998], HITS [Kleinberg 1999] and their variations; SPEAR algorithm [Noll et al. 2009]; ExpertRank [Jiao et al. 2009]
• TREC enterprise track: find the associations between candidates and documents, e.g. [Balog 2006, Balog 2007]
• Machine learning algorithms, e.g. [Bian and Liu 2008, Li et al. 2009]

Our Approach
• Assumptions
  – Expert users tend to have many authenticity ratings
  – Correctly evaluated media are rated by users of high expertise
  – Following expert users provides more benefits
• Expert definition
  – Rates a large number of media files in an authentic way with respect to a topic and is highly trusted by directly connected users
  – Should be trustworthy in evaluating multimedia

Expert Ranking Methods
• Domain knowledge driven method
  – Considers the tags that users assign to media files
  – User profile: the merged set of tags the user has submitted to media files in the system
  – Similarity coefficient between the candidate's profile and the tags assigned to a specific resource
  – Used to reorder the users who voted on a media file according to the tag profile
• Domain knowledge independent method
  – Uses the connections between users and resources to decide on the expertise of the users
  – A modified version of the HITS algorithm
  – Mutual reinforcement of users' expertise and media

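The domain knowledge driven method can be illustrated with a minimal sketch. The slides do not name the similarity coefficient, so Jaccard similarity on tag sets is assumed here; the function names `jaccard` and `reorder_voters` are hypothetical and not taken from the authors' implementation.

```python
def jaccard(profile_tags, resource_tags):
    """Jaccard similarity between two tag collections (assumed coefficient)."""
    a, b = set(profile_tags), set(resource_tags)
    if not a and not b:
        return 0.0  # two empty tag sets carry no evidence of similarity
    return len(a & b) / len(a | b)


def reorder_voters(voter_profiles, resource_tags):
    """Order the voters of one media file by how well their tag profile
    matches the tags assigned to that file, best match first.

    voter_profiles: {user: iterable of tags the user submitted system-wide}
    """
    return sorted(
        voter_profiles,
        key=lambda user: jaccard(voter_profiles[user], resource_tags),
        reverse=True,
    )
```

A voter whose merged tags overlap strongly with the tags of the media file is moved ahead of a voter whose tags concern unrelated topics.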
MHITS: Expert Ranking Algorithm
• MHITS: expert ranking algorithm for online collaborative systems
  – Link-based approach, based on the HITS algorithm
  – HITS
      – Authorities: pages that are pointed to by good pages
      – Hubs: pages that point to good pages
      – Reinforcement between hubs and authorities
  – MHITS
      – Users act as hubs (correctly evaluated media rated by them)
      – Media files act as authorities
      – Mutual reinforcement between users and media files
      – Local trust values between users are assigned
      – Considers the ratings of the users

MHITS: Expert Ranking Algorithm

\[ a(m) = \sum_{u \in U(m)} h(u)\, r(u) \]
\[ h(u) = \beta \sum_{m \in M(u)} a(m)\, r(u) + (1 - \beta)\, t(u) \]

Symbol   Description
a(m)     Authority score
U(m)     Set of users pointing to media file m
h(u)     Hubness score
r(u)     Rating of user u for media file m
t(u)     Average trust of the users directly connected to user u
M(u)     Set of media files to which user u points
β        Coefficient that weights the influence of the two terms, in range [0, 1]

• Two networks: one for users and ratings, one for users only (trust network)
• Trust in range [0, 1]
• Ratings: 0.5 for a fake vote, 1 for an authentic vote

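The two update rules can be iterated as in the following minimal sketch. It is not the authors' Java implementation: the function name `mhits`, the dictionary-based inputs, the default β = 0.5, the fixed iteration count and the L2 normalisation of the authority scores (borrowed from plain HITS) are all assumptions made for illustration.

```python
import math


def mhits(ratings, trust, beta=0.5, iterations=50):
    """Iterate the MHITS hub and authority updates.

    ratings: {user: {media: r}} with r = 0.5 (fake vote) or 1 (authentic vote)
    trust:   {user: t(u)}, average trust of u's direct neighbours, in [0, 1]
    Returns (h, a): hubness per user and authority per media file.
    """
    media = {m for rated in ratings.values() for m in rated}
    h = {u: 1.0 for u in ratings}
    a = {m: 1.0 for m in media}
    for _ in range(iterations):
        # a(m) = sum over u in U(m) of h(u) * r(u)
        a = {m: 0.0 for m in media}
        for u, rated in ratings.items():
            for m, r in rated.items():
                a[m] += h[u] * r
        # Assumption: normalise authorities so that scores stay bounded.
        norm = math.sqrt(sum(v * v for v in a.values())) or 1.0
        a = {m: v / norm for m, v in a.items()}
        # h(u) = beta * sum over m in M(u) of a(m) * r(u) + (1 - beta) * t(u)
        h = {
            u: beta * sum(a[m] * r for m, r in rated.items())
            + (1 - beta) * trust.get(u, 0.0)
            for u, rated in ratings.items()
        }
    return h, a
```

Sorting users by `h` in descending order gives the expert ranking; with β = 1 the trust term vanishes and the update reduces to a rating-weighted HITS.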
Robustness of the MHITS Algorithm
• Compromising techniques
  – Sybil attack [Douc02], reputation theft, whitewashing attack, etc.
  – Compromising the input and the output of the algorithm
• Sybil attack
  – Fundamental problem in online collaborative systems
  – A malicious user creates many fake accounts (Sybils) that all reference the user to boost the user's reputation (the attacker's goal is to rank higher)
• Countermeasures against Sybil attack

                                    SybilGuard [YKGF06]   SybilLimit [YGKX08]   SumUp [TMLS09]
  Protocol type                     Decentralized         Decentralized         Centralized
  Accepted Sybils per attack edge

SumUp
• Centralized approach
  – Aims to aggregate votes in a Sybil-resilient manner
• Key idea: an adaptive vote flow technique that appropriately assigns and adjusts link capacities in the trust graph to collect the votes for an object
• New: we integrate SumUp with the MHITS Java implementation, using our own data structure based on Java sparse arrays

SumUp Steps
  (1) Assign the source node and the number of votes per media file
  (2) Level assignment
  (3) Pruning step
  (4) Capacity assignment
  (5) Max-flow computation: collect votes on each resource
  (6) Leverage user history to penalize adversarial nodes

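Step (5) can be illustrated with a deliberately simplified sketch. It is not SumUp itself: level assignment, pruning and adaptive capacity assignment are skipped, capacities are supplied by hand, and the names `max_flow`, `collect_votes` and `_SRC` are hypothetical. It only shows why a max-flow computation bounds the votes that can arrive through an attack edge.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp max-flow on a {u: {v: capacity}} map."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, neighbours in capacity.items():
        for v, c in neighbours.items():
            residual[u][v] += c
            residual[v][u] += 0  # ensure the reverse edge exists
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push flow through it.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck


def collect_votes(links, voters, collector):
    """Count the votes that can reach the trusted collector.

    links: iterable of (from_node, to_node, capacity), oriented in the
    direction in which votes travel towards the collector.
    """
    capacity = defaultdict(dict)
    for u, v, c in links:
        capacity[u][v] = c
    for voter in voters:
        capacity["_SRC"][voter] = 1  # every voter casts at most one vote
    return max_flow(capacity, "_SRC", collector)
```

For example, if two honest voters reach the collector directly and three Sybils reach it only through a single attack edge of capacity 1, five votes are cast but only three are collected.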
[Figure: Integration of SumUp with MHITS]

Evaluation
• Experimental setup
  – Barabási–Albert model for generating the network
  – 300 users
  – 20 media files (10 known to be fake and 10 known to be authentic)
  – 800 ratings
  – 3000 trust edges

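A network of this kind can be generated with a preferential-attachment sketch such as the one below. The slides give neither the attachment parameter nor the generator used, so `m = 10` is an assumption, chosen only because 300 nodes then yield 2900 edges, close to the reported 3000; the function name `barabasi_albert` is hypothetical.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow an n-node graph in which each new node attaches to m existing
    nodes, chosen with probability proportional to their current degree.
    Returns a list of (new_node, existing_node) edges.
    """
    rng = random.Random(seed)
    edges = []
    targets = list(range(m))  # the first new node links to the m seed nodes
    degree_pool = []          # each node appears once per incident edge
    for new in range(m, n):
        edges.extend((new, t) for t in targets)
        degree_pool.extend(targets)
        degree_pool.extend([new] * m)
        # Draw m distinct targets for the next node, weighted by degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(degree_pool))
        targets = list(chosen)
    return edges


# Assumed parameters: 300 users as on the slide, m = 10 by choice.
trust_edges = barabasi_albert(300, 10, seed=42)
```

The resulting degree distribution is heavy-tailed: a few early nodes collect many trust links, which mimics the hubs seen in real online communities.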
[Figure: Ratings Distribution]

Evaluation
• Evaluation metrics
  – Precision@K
    \[ \mathrm{Precision@}K = \frac{|\mathrm{Top}K' \cap \mathrm{Top}K|}{K} \]
  – Spearman's rank correlation coefficient
    \[ \rho_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \]
    ρ_s: Spearman's coefficient of rank correlation, −1 ≤ ρ_s ≤ 1
    d_i: the difference between the rank of x_i and the rank of y_i
    n: the number of data points in the sample (total number of observations)
    ρ_s = +1: perfect positive correlation; ρ_s = −1: perfect negative correlation
    ρ_s = 0: no correlation (no monotonic association between the two variables)

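Both metrics are short enough to state directly in code. The sketch below follows the two formulas above; the function names are hypothetical, and the Spearman formula is the tie-free form, so it assumes that each list holds distinct ranks.

```python
def precision_at_k(predicted, reference, k):
    """Share of the top-k predicted users that also appear in the
    top-k of the reference ranking."""
    return len(set(predicted[:k]) & set(reference[:k])) / k


def spearman(rank_x, rank_y):
    """Spearman's rho from two equally long lists of ranks without ties.
    Requires at least two observations."""
    n = len(rank_x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

Identical rankings give `spearman(...) == 1.0`, and a fully reversed ranking gives `-1.0`.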
Experimental Results I
• No Sybils
• Results are compared with the ranking of the users according to the number of fair ratings each of them had in the system

                      HITS   MHITS
  Spearman (n = 15)   0.87   0.93

Experimental Results II
• 10% Sybils, 4 attack edges

                      HITS   MHITS   MHITS & SumUp
  Spearman (n = 20)   0.52   0.68    0.93

Experimental Results III
[Figure: Precision@K in two settings: 10% Sybils (one group) with 8 attack edges, and 20% Sybils (one group) with 24 attack edges]

Further evaluation
• 3% to 17%: the number of Sybil votes was increased with respect to the total number of fair votes
  – Expertise ranking does not change
• 9 to 14 and 24: the number of attack edges was increased, keeping the number of Sybil votes at 17% of the number of fair votes and the number of Sybils constant (50)
  – Precision does not change
• 17% to 50% and then to 100%: the number of Sybil votes was increased, keeping the number of attack edges (24) and the number of Sybils constant

        20%                      50%                      100%
  K     MHITS   MHITS & SumUp    MHITS   MHITS & SumUp    MHITS   MHITS & SumUp
  12    0.91    0.91             0.27    0.33             0.08    0.08
  15    0.93    0.93             0.33    0.40             0.06    0.06

Conclusions and Future Work
• Conclusions
  – Proposed an expertise ranking algorithm for collaborative systems (fake multimedia detection systems)
  – Leveraged trust and showed the trust implications
  – Combined expert ranking with Sybil-resistant algorithms
• Future work
  – Applying the algorithm to real data and to different data sets
  – Temporal analysis (time series analysis)
  – Integrating the domain knowledge driven method

More Related Content

Similar to Collabrate com2012 rashed

Technical Challenges for Realizing Learning Analytics
Technical Challenges for Realizing Learning AnalyticsTechnical Challenges for Realizing Learning Analytics
Technical Challenges for Realizing Learning AnalyticsRalf Klamma
 
An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...
An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...
An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...Michael Derntl
 
Community Learning Analytics - Challenges and Opportunities - ICWL 2013 Invit...
Community Learning Analytics - Challenges and Opportunities - ICWL 2013 Invit...Community Learning Analytics - Challenges and Opportunities - ICWL 2013 Invit...
Community Learning Analytics - Challenges and Opportunities - ICWL 2013 Invit...Ralf Klamma
 
Scaling Community Information Systems
Scaling Community Information SystemsScaling Community Information Systems
Scaling Community Information SystemsRalf Klamma
 
Enhancing Academic Event Participation with Context-aware and Social Recommen...
Enhancing Academic Event Participation with Context-aware and Social Recommen...Enhancing Academic Event Participation with Context-aware and Social Recommen...
Enhancing Academic Event Participation with Context-aware and Social Recommen...Dejan Kovachev
 
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012Bhaskar Ghosh
 
Supporting Workplace Learning in Small Enterprises by Personal Learning Envir...
Supporting Workplace Learning in Small Enterprises by Personal Learning Envir...Supporting Workplace Learning in Small Enterprises by Personal Learning Envir...
Supporting Workplace Learning in Small Enterprises by Personal Learning Envir...Milos Kravcik
 
Requirements Bazaar: Experiences, Added Value & Acceptance of Requirements Ne...
Requirements Bazaar: Experiences, Added Value & Acceptance of Requirements Ne...Requirements Bazaar: Experiences, Added Value & Acceptance of Requirements Ne...
Requirements Bazaar: Experiences, Added Value & Acceptance of Requirements Ne...Dominik Renzel
 
Simons orcid forum canberra 2018-PIDs in research
Simons orcid forum canberra 2018-PIDs in researchSimons orcid forum canberra 2018-PIDs in research
Simons orcid forum canberra 2018-PIDs in researchARDC
 
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...IJITE
 
Blueprint for Software Engineering in Technology Enhanced Learning Projects
Blueprint for Software Engineering in Technology Enhanced Learning ProjectsBlueprint for Software Engineering in Technology Enhanced Learning Projects
Blueprint for Software Engineering in Technology Enhanced Learning ProjectsRalf Klamma
 
EU Project Layers: Informal Learning at the Workplace with Video Clips
EU Project Layers: Informal Learning at the Workplace with Video ClipsEU Project Layers: Informal Learning at the Workplace with Video Clips
EU Project Layers: Informal Learning at the Workplace with Video ClipsMilos Kravcik
 
Supporting Professional Communities in the Next Web
Supporting Professional Communities in the Next Web Supporting Professional Communities in the Next Web
Supporting Professional Communities in the Next Web Ralf Klamma
 
Building Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated DiscoveryBuilding Data Ecosystems for Accelerated Discovery
Building Data Ecosystems for Accelerated Discoveryadamkraut
 
Using Personal Learning Environments to Support Workplace Learning in Small C...
Using Personal Learning Environments to Support Workplace Learning in Small C...Using Personal Learning Environments to Support Workplace Learning in Small C...
Using Personal Learning Environments to Support Workplace Learning in Small C...Milos Kravcik
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsMarcel Kurovski
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systemsinovex GmbH
 
Identification of Learning Goals in Forum-based Communities
Identification of Learning Goals in Forum-based CommunitiesIdentification of Learning Goals in Forum-based Communities
Identification of Learning Goals in Forum-based CommunitiesMilos Kravcik
 
Enterprise 2.0 Exchange Symposium
Enterprise 2.0 Exchange SymposiumEnterprise 2.0 Exchange Symposium
Enterprise 2.0 Exchange SymposiumKai Riemer
 
Reflection Support for Communities on the Web
Reflection Support for Communities on the WebReflection Support for Communities on the Web
Reflection Support for Communities on the WebRalf Klamma
 

Similar to Collabrate com2012 rashed (20)

Technical Challenges for Realizing Learning Analytics
Technical Challenges for Realizing Learning AnalyticsTechnical Challenges for Realizing Learning Analytics
Technical Challenges for Realizing Learning Analytics
 
An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...
An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...
An Embeddable Dashboard for Widget-Based Visual Analytics on Scientific Commu...
 
Community Learning Analytics - Challenges and Opportunities - ICWL 2013 Invit...
Community Learning Analytics - Challenges and Opportunities - ICWL 2013 Invit...Community Learning Analytics - Challenges and Opportunities - ICWL 2013 Invit...
Community Learning Analytics - Challenges and Opportunities - ICWL 2013 Invit...
 
Scaling Community Information Systems
Scaling Community Information SystemsScaling Community Information Systems
Scaling Community Information Systems
 
Enhancing Academic Event Participation with Context-aware and Social Recommen...
Enhancing Academic Event Participation with Context-aware and Social Recommen...Enhancing Academic Event Participation with Context-aware and Social Recommen...
Enhancing Academic Event Participation with Context-aware and Social Recommen...
 
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
Bg linkedin bigdata_martinschultz_symposium_yale_oct2012
 
Supporting Workplace Learning in Small Enterprises by Personal Learning Envir...
Supporting Workplace Learning in Small Enterprises by Personal Learning Envir...Supporting Workplace Learning in Small Enterprises by Personal Learning Envir...
Supporting Workplace Learning in Small Enterprises by Personal Learning Envir...
 
Requirements Bazaar: Experiences, Added Value & Acceptance of Requirements Ne...
Requirements Bazaar: Experiences, Added Value & Acceptance of Requirements Ne...Requirements Bazaar: Experiences, Added Value & Acceptance of Requirements Ne...
Requirements Bazaar: Experiences, Added Value & Acceptance of Requirements Ne...
 
Simons orcid forum canberra 2018-PIDs in research
Simons orcid forum canberra 2018-PIDs in researchSimons orcid forum canberra 2018-PIDs in research
Simons orcid forum canberra 2018-PIDs in research
 
ANALYSIS OF DEVELOPMENT COOPERATION WITH SHARED AUTHORING ENVIRONMENT IN ACAD...

Collabrate com2012 rashed

  • 1. Robust Expert Ranking in Online Communities - Fighting Sybil Attacks
      Khaled A. N. Rashed, Cristina Balasoiu, Ralf Klamma
      RWTH Aachen University, Advanced Community Information Systems (ACIS)
      {rashed|balsoiu|klamma}@dbis.rwth-aachen.de
      8th IEEE International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom2012), October 14–17, 2012, Pittsburgh, Pennsylvania, United States
      Lehrstuhl Informatik 5 (Information Systems), Prof. Dr. M. Jarke
      Deutschen Akademischen Austauschdienstes
      I5-DR-0312-1
  • 2. Advanced Community Information Systems (ACIS)
      [Diagram labels: Responsive Open Community Information Systems; Web Engineering; Web Analytics; Community Visualization and Simulation; Community Support; Community Analytics; Requirements Engineering]
  • 3. Agenda
      • Introduction and motivation
      • Related work
      • Our approach
          – Expert ranking algorithm
          – Robustness of the expert ranking algorithm
      • Evaluation
      • Conclusions and outlook
  • 4. Introduction
      • Expert search and ranking refers to finding a group of authoritative users with special skills and knowledge for a specific category.
      • The task is very important in online collaborative systems.
      • Problems: openness and misbehaviour
          – No attention has been paid to the trust and reputation of experts
      • Solution: leveraging trust
  • 5. Motivation Examples
      • Manipulating the truth for war propaganda
          – Published as: British soldiers abusing prisoners in Iraq
          – Proved to be fake by Brigadier Geoff Sheldon, who said the vehicle featured in the photo had never been to Iraq
      • Tidal bores presented as Indian Ocean Tsunami
          – Published as: 2004 Indian Ocean Tsunami
          – Proved to be tidal bores, a four-day-long government-sponsored tourist festival in China
      • Expert knowledge, analysis and witnesses are needed to identify the fake!
  • 6. A Case Study: Collaborative Fake Multimedia Detection System
      • Collaborative activities (rating, tagging and commenting)
          – Provide new means of search, retrieval and media authenticity evaluation
          – Explicit ratings and tags are used for evaluating the authenticity of multimedia items
          – Reliability: not all of the submitted ratings are reliable
          – No centralized control mechanism
          – Vulnerability to attacks
      • Three types of users
          – Honest users
          – Experts
          – Malicious users
  • 7. Research Questions and Goals
      • Research questions
          – How to measure users' expertise in collaborative media sharing and evaluating systems, and how to rank the users?
          – What are the implications of trust?
          – Robustness: how to ensure the robustness of the ranking algorithm?
      • Goals
          – Improve multimedia evaluation
          – Reduce the impact of malicious users
  • 8. Related Work
      • Probabilistic models, e.g. [Tu et al. 2010]
      • Voting models [Macdonald and Ounis 2006], [Macdonald et al. 2008]
      • Link-based approaches
          – PageRank [Brin and Page 1998], HITS [Kleinberg 1999] and their variations
          – SPEAR algorithm [Noll et al. 2009]
          – ExpertRank [Jiao et al. 2009]
      • TREC enterprise track: find the associations between candidates and documents, e.g. [Balog 2006, Balog 2007]
      • Machine learning algorithms, e.g. [Bian and Liu 2008, Li et al. 2009]
  • 9. Our Approach
      • Assumptions
          – Expert users tend to have many authenticity ratings
          – Correctly evaluated media are rated by users of high expertise
          – Following expert users provides more benefits
      • Expert definition
          – Rates a large number of media files in an authentic way with respect to a topic and is highly trusted by the users directly connected to him
          – Should be trustworthy in evaluating multimedia
  • 10. Expert Ranking Methods
      • Domain knowledge driven method
          – Considers the tags that users assign to media files
          – User profile: merges the tags the user submitted to the media files in the system
          – Similarity coefficient between the candidate profile and the tags assigned to a specific resource
          – Used to reorder the users who voted on a media file according to the tag profile
      • Domain knowledge independent method
          – Uses the connections between users and resources to decide on the expertise of the users
          – A modified version of the HITS algorithm
          – Mutual reinforcement of users' expertise and media
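
The domain knowledge driven method on slide 10 can be illustrated with a minimal sketch. The slide does not name the similarity coefficient, so cosine similarity over tag counts is used here purely as an assumption; the function names `build_profile`, `cosine_similarity` and `rank_voters` are hypothetical and not taken from the authors' implementation.

```python
from collections import Counter
from math import sqrt


def build_profile(tag_assignments):
    """Merge all tag lists a user submitted into one bag-of-tags profile."""
    profile = Counter()
    for tags in tag_assignments:
        profile.update(tags)
    return profile


def cosine_similarity(profile, resource_tags):
    """Cosine similarity between a user profile and the tags of one resource.

    Assumption: the slides only say "similarity coefficient"; cosine is one
    possible choice, not necessarily the one used in the paper.
    """
    resource = Counter(resource_tags)
    dot = sum(profile[tag] * count for tag, count in resource.items())
    norm = (sqrt(sum(v * v for v in profile.values()))
            * sqrt(sum(v * v for v in resource.values())))
    return dot / norm if norm else 0.0


def rank_voters(voter_tags, resource_tags):
    """Reorder the users who voted on a resource by profile similarity."""
    scores = {user: cosine_similarity(build_profile(tags), resource_tags)
              for user, tags in voter_tags.items()}
    # Highest similarity first; ties are broken alphabetically.
    return sorted(scores, key=lambda user: (-scores[user], user))


voters = {
    "alice": [["tsunami", "wave"], ["tsunami"]],
    "bob": [["cat"]],
}
ranking = rank_voters(voters, ["tsunami", "flood"])
```

In this toy input, the user whose merged tags overlap with the resource tags is ranked ahead of the user with unrelated tags.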
  • 11. MHITS: Expert Ranking Algorithm
      • MHITS: expert ranking algorithm in online collaborative systems
          – Link-based approach, based on the HITS algorithm
      • HITS
          – Authorities: pages that are pointed to by good pages
          – Hubs: pages that point to good pages
          – Reinforcement between hubs and authorities
      • MHITS
          – Users act as hubs (correctly evaluated media rated by them)
          – Media files act as authorities
          – Mutual reinforcement between users and media files
          – Local trust values between users are assigned
          – Considers the ratings of the users
  • 12. MHITS: Expert Ranking Algorithm
      a(m) = \sum_{u \in U(m)} h(u) \, r(u)
      h(u) = \beta \sum_{m \in M(u)} a(m) \, r(u) + (1 - \beta) \, t(u)

      Symbol   Description
      a(m)     Authority score
      U(m)     Set of users pointing to media file m
      h(u)     Hubness score
      r(u)     Rating of user u for media file m
      t(u)     Average trust of the direct connected users to user u
      M(u)     Set of media files to which user u points
      \beta    Coefficient that weights the influence of the two terms, in range [0, 1]

      • Two networks: one for users and ratings, one for users only (trust network)
      • Trust in range [0, 1]
      • Ratings: 0.5 for a fake vote, 1 for an authentic vote
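
The mutual reinforcement on slide 12 can be sketched as an iteration. This is a minimal sketch, assuming the update rules read a(m) = sum over u in U(m) of h(u)·r(u) and h(u) = β · (sum over m in M(u) of a(m)·r(u)) + (1 − β)·t(u). The L2 normalisation after each step, the fixed iteration count and the name `mhits` are assumptions borrowed from standard HITS practice; the slides do not state them.

```python
from math import sqrt


def mhits(ratings, trust, beta=0.5, iterations=50):
    """Sketch of the MHITS iteration.

    ratings: {user: {media: r}} with r = 1.0 (authentic vote) or 0.5 (fake vote)
    trust:   {user: t(u)}, average trust from directly connected users, in [0, 1]
    beta:    weight between the link term and the trust term, in [0, 1]
    """
    users = list(ratings)
    media = sorted({m for votes in ratings.values() for m in votes})
    hub = {u: 1.0 for u in users}
    auth = {m: 1.0 for m in media}

    for _ in range(iterations):
        # Authority update: a(m) = sum over raters of h(u) * r(u, m).
        auth = {m: 0.0 for m in media}
        for u, votes in ratings.items():
            for m, r in votes.items():
                auth[m] += hub[u] * r
        norm = sqrt(sum(a * a for a in auth.values())) or 1.0
        auth = {m: a / norm for m, a in auth.items()}  # assumed normalisation

        # Hub update: link term blended with the local trust value.
        link = {u: sum(auth[m] * r for m, r in votes.items())
                for u, votes in ratings.items()}
        norm = sqrt(sum(x * x for x in link.values())) or 1.0
        hub = {u: beta * link[u] / norm + (1 - beta) * trust.get(u, 0.0)
               for u in users}

    return hub, auth


ratings = {
    "expert": {"m1": 1.0, "m2": 1.0, "m3": 1.0},
    "casual": {"m1": 1.0},
}
trust = {"expert": 0.9, "casual": 0.2}
hub, auth = mhits(ratings, trust)
```

Normalising the link term before blending keeps it on a scale comparable to the trust values in [0, 1]; with beta = 0 the hub score reduces to the trust value alone.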
  • 13. Robustness of the MHITS Algorithm
      • Compromising techniques
          – Sybil attack [Douc02], reputation theft, whitewashing attack, etc.
          – Compromising the input and the output of the algorithm
      • Sybil attack
          – Fundamental problem in online collaborative systems
          – A malicious user creates many fake accounts (Sybils) which all reference the user to boost his reputation (the attacker's goal is to be higher up in the rankings)
      • Countermeasures against Sybil attack

      Countermeasure        Protocol type    Accepted Sybils per attack edge
      SybilGuard [YKGF06]   Decentralized    (value not preserved in the transcript)
      SybilLimit [YGKX08]   Decentralized    (value not preserved in the transcript)
      SumUp [TMLS09]        Centralized      (value not preserved in the transcript)
  • 14. SumUp
      • Centralized approach
          – Aims to aggregate votes in a Sybil-resilient manner
      • Key idea: an adaptive vote flow technique that appropriately assigns and adjusts link capacities in the trust graph to collect the votes for an object
      • New: we integrate SumUp with the MHITS Java implementation
          – Uses its own data structure based on Java sparse arrays
      • SumUp steps
          (1) Assign the source node and the number of votes per media file
          (2) Levels assignment
          (3) Pruning step
          (4) Capacity assignment
          (5) Max-flow computation: collect votes on each resource
          (6) Leverage user history to penalize adversarial nodes
  • 15. Integration of SumUp with MHITS [diagram]
  • 16. Evaluation
      • Experimental setup
          – Barabási-Albert model for generating the network
          – 300 users
          – 20 media files (10 known to be fake and 10 known to be authentic)
          – 800 ratings
          – 3000 trust edges
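
For readers unfamiliar with the Barabási-Albert model named in the setup, the following is a minimal sketch of preferential attachment. It is not the authors' generator: the function name `barabasi_albert`, the complete seed graph, the attachment parameter `m = 10` and the random seed are assumptions chosen only so that 300 nodes yield an edge count in the vicinity of the 3000 trust edges on the slide.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Generate an undirected preferential-attachment graph.

    Starts from a complete graph on m + 1 nodes (an assumed seed graph), then
    attaches each new node to m distinct existing nodes, chosen with
    probability proportional to their current degree.
    """
    rng = random.Random(seed)
    edges = {(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)}
    # Every node appears once per incident edge, so a uniform draw from this
    # list is a degree-proportional draw.
    endpoints = [node for edge in sorted(edges) for node in edge]

    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        for old in sorted(chosen):
            edges.add((old, new))
            endpoints.extend((old, new))
    return edges


# 300 users as on the slide; m = 10 is a hypothetical parameter choice.
trust_edges = barabasi_albert(300, 10, seed=1)
```

With these parameters the graph has (m + 1)·m/2 seed edges plus m edges per added node, which is close to but not exactly the 3000 trust edges reported; how the authors reached that number is not stated.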
  • 17. Ratings Distribution [plot]
  • 18. Evaluation
      • Evaluation metrics
          – Precision@K
              Precision@K = \frac{|TopK' \cap TopK|}{K}
          – Spearman's rank correlation coefficient
              \rho_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n (n^2 - 1)}
              \rho_s: Spearman's coefficient of rank correlation, -1 \le \rho_s \le 1
              d_i: the difference between the rank of x_i and the rank of y_i
              n: the number of data points in the sample (total number of observations)
      • Scale: +1 perfect positive correlation, 0 no correlation, -1 perfect negative correlation
      • \rho_s = -1 or 1: high degree of correlation between x and y
      • \rho_s = 0: a lack of monotonic association between the two variables
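
Both metrics on slide 18 are short enough to write out. The sketch below assumes that Precision@K is the overlap between the top K of the evaluated ranking and the top K of the reference ranking, divided by K, and that both rankings contain the same items without ties; the function names are hypothetical.

```python
def precision_at_k(ranking, reference, k):
    """Fraction of the top-k items of `ranking` that are also in the top-k of `reference`."""
    return len(set(ranking[:k]) & set(reference[:k])) / k


def spearman_rho(ranking_x, ranking_y):
    """Spearman's rank correlation: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)).

    Assumes both lists rank the same items, with no ties and n >= 2.
    """
    n = len(ranking_x)
    position_y = {item: rank for rank, item in enumerate(ranking_y)}
    # d_i is the difference between an item's rank in x and its rank in y.
    sum_d2 = sum((rank - position_y[item]) ** 2
                 for rank, item in enumerate(ranking_x))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


reference = ["a", "b", "c", "d"]   # e.g. users ordered by number of fair votes
evaluated = ["b", "a", "d", "c"]   # e.g. users ordered by an expert ranking
p_at_2 = precision_at_k(evaluated, reference, 2)
rho_same = spearman_rho(reference, reference)
rho_reversed = spearman_rho(reference, reference[::-1])
```

Precision@K ignores the order inside the top K, which is why the slides pair it with a rank correlation.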
  • 19. Experimental Results I
      • No Sybils
      • Results are compared with the ranking of the users according to the number of fair ratings each of them had in the system

      Spearman (n=15)   HITS    MHITS
                        0.87    0.93
  • 20. Experimental Results II
      • 10% Sybils
      • 4 attack edges

      Spearman (n=20)   HITS    MHITS   MHITS & SumUp
                        0.52    0.68    0.93
  • 21. Experimental Results III
      [Precision@K plots; panels: 10% Sybils (one group) and 8 attack edges; 20% Sybils (one group) and 24 attack edges]
  • 22. Further evaluation
      • 3% to 17%: the number of Sybil votes was increased with respect to the total number of fair votes
          – expertise ranking does not change
      • 9 to 14 and 24: the number of attack edges was increased, keeping the number of Sybil votes at 17% of the number of fair votes and the number of Sybils constant (50)
          – precision does not change
      • 17% to 50% and then to 100%: the number of Sybil votes was increased, keeping the number of attack edges (24) and the number of Sybils constant

      K    MHITS 20%   MHITS & SumUp 20%   MHITS 50%   MHITS & SumUp 50%   MHITS 100%   MHITS & SumUp 100%
      12   0.91        0.91                0.27        0.33                0.08         0.08
      15   0.93        0.93                0.33        0.40                0.06         0.06
  • 23. Conclusions and Future Work
      • Conclusions
          – Proposed an expertise ranking algorithm for collaborative systems (fake multimedia detection systems)
          – Leveraged trust and showed the trust implications
          – Combined expert ranking with Sybil-resistant algorithms
      • Future work
          – Applying the algorithm to real data and to different data sets
          – Temporal analysis (time series analysis)
          – Integrating the domain knowledge driven method

Editor's Notes

  1. Fake multimedia and misbehaviour
  2. e.g. Press Agencies
  3. We discuss the notions of experts and expertise in the context of collaborative fake multimedia detection systems. Here we try to define the expert and we assume that …. Improve media evaluation (by increasing the impact of experts).
  4. SybilGuard and SybilLimit are decentralized; SumUp is centralized. SybilGuard is based on the “social network” among user identities, where an edge between two identities indicates a human-established trust relationship. Malicious users can create many identities but few trust relationships. Thus, there is a disproportionately small “cut” in the graph between the Sybil nodes and the honest nodes. SybilGuard exploits this property to bound the number of identities a malicious user can create. SybilLimit leverages the same insight as SybilGuard but is an improved version that reduces the number of Sybil nodes accepted by an honest node from O(√n log n) to O(log n) for n honest nodes. When all nodes vote, SumUp leads to much lower attack capacity than SybilLimit despite the same asymptotic bound per attack edge. First, SumUp’s bound of 1 + log n in Theorem 5.1 is a loose upper bound of the actual average capacity. Second, since links pointing to lower-level nodes are not eligible for ticket distribution, many incoming links of an adversarial node have zero tickets and thus are assigned a capacity of one.
  5. P@K computes, for a given ranking of users, the fraction of relevant results among the top K results. The higher the precision, the better the performance. We use this metric to compare the results of the expert ranking algorithms that we developed with the ranking of experts obtained by counting the number of fair votes. Spearman’s rank correlation coefficient is a non-parametric measure of statistical dependence between two ranked lists. It is based on the rank order of the scores and not on the score data: it is the correlation coefficient between the ranked variables, where d is the difference in rank between paired items in the two series (lists).
  6. For this step of the evaluation, I assume that all users in the network are behaving in a fair way and are rating a random number of media files. So the only way a user can rate a media file wrongly is when the user has no competence in the specific topic. What differs between the two methods is that, besides the reinforcement between users voting fairly and authentic media files, the ranking in the case of MHITS also considers the local trust values the user has in the social network. Since average precision ignores the exact rank of a user, we use Spearman's rank correlation coefficient to get a better view of the efficiency. In Table 6.2, the correlation coefficients for n = 15 are presented. One can notice that the result of the MHITS algorithm is more highly correlated with the ranking by number of fairly rated media files, as the value gets closer to 1.
  7. From the results, we can see that our proposed integration of SumUp into the MHITS algorithm outperforms HITS and MHITS without SumUp, which confirms the effectiveness of our approach. As can be seen, MHITS in combination with SumUp performs better for K = 10, and then for K = 20 its precision decreases even more rapidly than that of MHITS alone. We think that this happens because some Sybil users already enter the ranking for K = 20 due to their high local trust values, and therefore the precision decreases.
  8. It can be noticed that by increasing the number of Sybils, the attack edges or even the votes (up to 50% of the number of fair votes), the ranking of the users does not change dramatically. It can also be seen that Modified HITS with SumUp performs only slightly better than Modified HITS alone. The reason is that the steps SumUp performs additionally when run together with HITS (pruning of the trust network, assignment of capacity in the network and elimination of the links that carry a high negative history) do not affect the Sybils: the capacity assignment does not reach them, so votes from Sybils do not reach the source node. In this case, the edges connecting Sybils to fair nodes do not accumulate negative history and therefore are not eliminated. On the resulting network, Modified HITS is run again. The Sybils are kept and, due to the high local trust values that they receive from the other Sybil nodes in the group, they get into the top ranks of experts.
  9. Combination of expert ranking and Sybil-resistant algorithms to ensure robustness