HARVESTING INTELLIGENCE
FROM USER INTERACTIONS
      Rajendra Akerkar
Outline

 What is collective intelligence?

 Basic technical concepts behind collective intelligence

 Many forms of user interaction

 Example of how user interaction is converted into
  intelligence
Web users are undergoing a transformation…

 Users are expressing themselves. This expression may
  be in the form of:
    sharing their opinions on a product or a service through
     reviews or comments; through sharing and tagging
     content; through participation in an online community; or
     by contributing new content.


    This increased user interaction and participation gives rise
     to data that can be converted into intelligence in your
     application. The use of intelligence to personalize a site for
     a user, to aid him in searching and making decisions, and
     to make the application more sticky are cherished goals
     that web applications try to fulfill.
Wisdom of the Crowd

 “Under the right circumstances, groups are extraordinarily
  intelligent, and are often smarter than the smartest people
  in them.”

 “If the process is sound, the more people you involve in
  solving a problem, the better the result will be.”

 A crowd’s collective intelligence will produce better results
  than those of a small group of experts if four basic
  conditions are met.
“Wise crowds” are valuable when…
 The individuals have diverse opinions;

 The individuals aren’t afraid to express their
  opinions;

 There’s diversity in the crowd; and

 There’s a way to aggregate all the information and
  use it in the decision-making process.
Collective intelligence

 To effectively use the information provided by others to
  improve one’s application.

 When a group of individuals collaborate or compete with
  each other, intelligence or behavior that otherwise
  didn’t exist suddenly emerges.
 A user may be influenced by other users either directly
  or through intelligence derived from the applications by
  mining the data.
Collective intelligence of users is

 The intelligence that’s extracted out from the collective
  set of interactions and contributions made by users.

 The use of this intelligence to act as a filter for what’s
  valuable in your application for a user
   —This filter takes into account a user’s preferences and
    interactions to provide relevant information to the user.

 There are a huge number of ways this information can
  be processed and interpreted.
To apply collective intelligence in your
application, you need to
1. allow users to interact with your site and with each
   other, learning about each user through their
   interactions and contributions.

2. aggregate what you learn about your users and their
   contributions using some useful models.

3. leverage those models to recommend relevant content
   to a user.
Three components to harnessing intelligence:
1 – Allow users to interact, 2 – Learn about your users in
   aggregate, 3 – Personalise content using user
   interaction data and aggregate data.
Sources of information

 Content-based—
    based on information about the item itself, usually
     keywords or phrases occurring in the item.

 Collaborative-based—
    based on the interactions of users.
Algorithms for applying Intelligence

 Correlate users with content and with each other
    This needs a common language to compute relevance between
     items, between users, and between users and items.

 Content-based relevance is anchored in the content
  itself (information retrieval systems)

 Collaborative-based relevance leverages the user
  interaction data to detect meaningful relationships

 Unstructured text: to understand how metadata can be
  developed from unstructured text
Abstracting types of content
 Applications consist of users and items

 Items?

 Social networking: a user is also a type of item

Metadata: professionally developed keywords, user-generated tags,
           keywords extracted by an algorithm after analyzing the text,
           ratings, popularity ranking, etc.
           Profile-based and user-action-based data

Metadata as a set of attributes that help qualify an item.
Sources for generating metadata about an item




 Users and items each have an associated vector of
  metadata attributes.

 The similarity or relevance between two users, two
  items, or a user and an item is measured by looking at the
  similarity between the two vectors.
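A minimal sketch of comparing two metadata vectors, assuming each vector is stored as a mapping from attribute name to weight. The attribute names and weights below are hypothetical, and cosine similarity is used here as one common choice of vector similarity.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine of the angle between two sparse attribute vectors."""
    # Only attributes present in both vectors contribute to the dot product.
    dot = sum(weight * b[key] for key, weight in a.items() if key in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical metadata vectors: attribute -> weight.
user = {"jazz": 0.8, "piano": 0.6}
item_a = {"jazz": 0.6, "piano": 0.8}
item_b = {"hiking": 1.0}

user_to_a = cosine_similarity(user, item_a)  # shared attributes, high score
user_to_b = cosine_similarity(user, item_b)  # nothing shared, score of zero
```

The same function works for user-to-user, item-to-item, and user-to-item comparisons, because all three are represented as vectors over the same attribute space.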
Representing user information
Users provide a rich set of information
Generating intelligence
 Content-based analysis and collaborative filtering
    to build a representation for the content
      • Terms or phrases
      • Terms are converted into their basic form by a process known as
        stemming. Terms with their associated weights (term vectors), then
        represent the metadata associated with the text. Similarity between
        two content items is measured by measuring the similarity associated
        with their term vectors.

    to use the information provided by the interactions of
     users to predict items of interest for a user
      • to match a user’s metadata to that of other similar users and
        recommend items liked by them (Language independent methods)
      • E.g. users rate items, so the CF approach finds patterns in the way items
        have been rated by the user and other users to find additional items of
        interest for a user
      • Amazon, Netflix, and Google
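The term-vector idea above can be sketched as follows. This is an illustration only: `crude_stem` is a hypothetical stand-in that strips two common suffixes, not a real stemming algorithm, and relative term frequency is assumed as the weighting scheme. No stop-word removal is done.

```python
import re
from collections import Counter


def crude_stem(term: str) -> str:
    """Strip a couple of common English suffixes (stand-in for a real stemmer)."""
    for suffix in ("ing", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def term_vector(text: str) -> dict[str, float]:
    """Map text to {stemmed term: relative frequency}."""
    terms = [crude_stem(t) for t in re.findall(r"[a-z]+", text.lower())]
    counts = Counter(terms)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


# "Users"/"user", "rate"/"rates" and "photos"/"photo" collapse to one term each.
vec = term_vector("Users rate photos; a user rates a photo")
```

Two content items can then be compared by measuring the similarity of their term vectors.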
Collaborative filtering

 Memory-based and model-based
    Memory-based: a similarity measure is used to find similar users and then
     make a prediction using a weighted average of the ratings
     of the similar users
    Model-based: build a model for prediction using a variety of
     approaches: linear algebra, probabilistic methods, neural
     networks, clustering, latent classes, and so on

 A collaborative filtering algorithm usually works by
  searching a large group of people and finding a smaller set
  with tastes similar to yours.
    Collecting Preferences
    Recommending Items
    Matching Products
    Item-Based Filtering
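A minimal sketch of the memory-based prediction step, assuming user-to-user similarities have already been computed. The users, ratings, and similarity values are hypothetical.

```python
def predict_rating(target, item, ratings, similarity):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == target or item not in other_ratings:
            continue
        sim = similarity[(target, other)]
        if sim <= 0:
            continue  # ignore dissimilar users in this simple sketch
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    # None signals that no similar user has rated the item.
    return weighted_sum / weight_total if weight_total else None


# Hypothetical data: ann has not rated p2 yet.
ratings = {
    "ann": {"p1": 4},
    "bob": {"p1": 4, "p2": 5},
    "cy": {"p1": 1, "p2": 2},
}
similarity = {("ann", "bob"): 0.9, ("ann", "cy"): 0.1}

prediction = predict_rating("ann", "p2", ratings, similarity)
```

Because bob is far more similar to ann than cy is, the prediction lands close to bob's rating.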
Harnessing Collective Intelligence to transform
from content-centric to user-centric applications
 Prior to the user-centric revolution, many applications put
  little emphasis on the user. These applications, known as
  content-centric applications, focused on the best way to
  present the content and were generally static from user
  to user and from day to day.

 User-centric applications leverage Collective Intelligence
  to fundamentally change how the user interacts with the
  Web application.

 User-centric applications make the user the center of the
  web experience and dynamically reshuffle the content
  based on what’s known about the user and what the user
  explicitly asks for.
User-centric applications are composed of the
following four components
   Core competency: The main reason
    why a user comes to the application.

   Community: Connecting users with
    other users of interest, social
    networking, finding other users who may
    provide answers to a user’s questions.

   Leveraging user-generated content:
    Incorporating generated content and
    interactions of users to provide additional
    content to users.

   Building a marketplace: Monetizing
    the application by product and/or service
    placements and showing relevant
    advertisements.
Classifying User-Generated Information
Concept of a dataset

    Densely populated dataset
      • It has more rows than columns
      • The dataset is richly populated


 Clustering and building a predictive model
    E.g. similar users according to age and/or sex might be a
     good predictor of the number of minutes a user will spend
     on the site
      • age is a good predictor
      • the number of minutes spent decreases as
        age increases
      • a simple linear model
      • minutes spent = 50 – age of user
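As a sketch of how such a linear model could be obtained, the snippet below fits a least-squares line to a few (age, minutes) pairs. The data points are hypothetical and were chosen to lie exactly on the line from the slide.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that follow minutes = 50 - age.
ages = [20, 30, 40]
minutes = [30, 20, 10]

slope, intercept = fit_line(ages, minutes)


def predicted_minutes(age):
    """Predict minutes spent on the site from the fitted line."""
    return slope * age + intercept
```

With a densely populated dataset like this one, a small number of attributes is enough to fit and evaluate a simple model.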
Concept of a dataset

                               • Set of users who viewed any of the videos on
                                 our site within the timeframe


 High-dimensional, sparsely populated
    a generalization of the term vector representation
    This representation is useful to find similar users and is
     known as the User-Item matrix


    users are represented as rows
    the videos are represented as columns
    Properties: more rows than columns, richly populated
 Users are represented as
     columns
    the videos as rows

 Users who have viewed this
  video have also viewed
  these other videos

 Properties
    number of columns is large,
    sparsely populated with
     nonzero entries in a few
     columns
    multidimensional vector
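A minimal sketch of the sparse User-Item representation and its inverted form, assuming only the nonzero entries (actual views) are stored. The view log is hypothetical.

```python
from collections import defaultdict

# Hypothetical view log: (user, video) pairs.
views = [("john", "v1"), ("john", "v2"), ("jane", "v2"), ("doe", "v3")]

# Users as rows: each user maps to the set of videos they viewed.
user_item = defaultdict(set)
for user, video in views:
    user_item[user].add(video)

# Inverted form, videos as rows: each video maps to the users who viewed it.
item_user = defaultdict(set)
for user, videos in user_item.items():
    for video in videos:
        item_user[video].add(user)


def also_viewed(video):
    """Videos viewed by the users who viewed `video`."""
    related = set()
    for user in item_user.get(video, set()):
        related |= user_item[user]
    related.discard(video)
    return related
```

Storing sets instead of a full matrix keeps memory proportional to the number of views, which matters when the number of columns is large and most entries are zero.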
Forms of user interaction

 need to quantify the quality of the interaction

 Rating and voting interaction
    explicit in the user’s intent
    way of getting feedback on how well the user liked the
     item
    is quantifiable and can be used directly
    Voting is similar to rating. However, a vote can have only
     two values: 1 for a positive vote and -1 for a negative
     vote


       interactions such as using clicks are noisy—the intent of
       the user isn’t known perfectly and is implicit
Persistence of
ratings
 Entities:
    User & Items

 user_item_rating is a mapping table that has a
  composite key, consisting of the user ID and
  content ID
    The cardinality between the entities shows that
      • Each user may rate 0 or more items.
      • Each rating is associated with only one user.
      • An item may contain 0 or more ratings.
      • Each rating is associated with only one item.
    What are the top 10 rated items?
    E.g. digg.com allows users to contribute and vote
     on interesting articles
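A minimal sketch of the mapping table and the top-10 query, using an in-memory SQLite database. The column names (`user_id`, `item_id`, `rating`) and the sample rows are assumptions; the slide only specifies a composite key of user ID and content ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE user_item_rating (
        user_id INTEGER NOT NULL,
        item_id INTEGER NOT NULL,
        rating  INTEGER NOT NULL,
        PRIMARY KEY (user_id, item_id)  -- one rating per user per item
    );
    """
)

# Hypothetical ratings: (user_id, item_id, rating).
conn.executemany(
    "INSERT INTO user_item_rating VALUES (?, ?, ?)",
    [(1, 10, 5), (2, 10, 4), (1, 11, 2), (3, 12, 3)],
)

# Top 10 items by average rating; item_id breaks ties deterministically.
top_items = conn.execute(
    """
    SELECT item_id, AVG(rating) AS avg_rating
    FROM user_item_rating
    GROUP BY item_id
    ORDER BY avg_rating DESC, item_id
    LIMIT 10
    """
).fetchall()
```

The composite primary key enforces the cardinality described above: a user can rate many items, but only once per item.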
Forwarding a link

 Similar to voting,
  forwarding the content
  to others can be
  considered a positive
  vote for the item by the
  user
Bookmarking and saving

 By bookmarking
  URLs, a user is
  explicitly
  expressing interest
  in the material
  associated with the
  bookmark. URLs
  that are commonly
  bookmarked
  bubble up higher in
  the site.
Purchasing items

 When users purchase items, they are
  casting an explicit vote
  of confidence in the
  item
    Amazon (Item-to-Item
     recommendation engine)
       Users that buy similar
       items can be correlated
       Items that have been
       bought by other users can
       be recommended to a user
       …
Click-stream
 E.g. news.google.com personalisation

 When a list of items is
  presented to a user …
    an item that is clicked counts as a positive vote
    Looking at whether an item
     was visited and the time
     spent on it provides useful
     information.
    You can also gather useful
     statistics from this data:
       • What is the average time a
         user spends on a particular
         item?
       • For a user, what is the
         average time spent on any
         given article?
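The two statistics above can be sketched as follows, assuming each visit is logged as a (user, item, seconds spent) record. The records are hypothetical.

```python
from collections import defaultdict

# Hypothetical click-stream records: (user, item, seconds spent).
visits = [
    ("john", "a1", 30.0),
    ("jane", "a1", 90.0),
    ("john", "a2", 60.0),
]


def average_by(records, key_index):
    """Average time spent, grouped by the field at `key_index`."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [total seconds, count]
    for record in records:
        bucket = totals[record[key_index]]
        bucket[0] += record[2]
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


avg_time_per_item = average_by(visits, 1)  # average time users spend on an item
avg_time_per_user = average_by(visits, 0)  # average time a user spends per article
```

Comparing a single visit against these averages gives a rough way to turn noisy, implicit clicks into a graded signal.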
Reviews

 Opinions and tastes are often expressed through
  reviews and recommendations. These have the
  greatest impact on other users when
    They’re unbiased
    The reviews are from similar users
    They’re from a person of influence

 Just like voting for articles at Digg, other users
  can endorse a reviewer or vote on his reviews
Converting user interaction into intelligence
 User interaction gets converted into a dataset for learning.




     three users who’ve rated photographs

 a number of ways to transform raw ratings from users into
  intelligence
     aggregate all the ratings about the item and provide the average
        •   to create a Top 10 Rated Items list
        • constantly promoting the popular content
 Given the set of data, we answer two questions in our example:
    What are the set of related items for a given item?
    For a user, who are the other users that are similar to the user?

 Three approaches:
       • cosine-based similarity,
       • correlation-based similarity, and
       • adjusted-cosine-based similarity.
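The slides go on to work through the cosine-based approach. As a sketch of the correlation-based alternative, the snippet below computes the Pearson correlation between two rating vectors, which centers each vector on its own mean before comparing. The rating vectors are hypothetical. Adjusted cosine is similar in spirit but subtracts each user's average rating instead.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length rating vectors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    # A constant vector has no variance, so correlation is undefined; return 0.
    return numerator / denominator if denominator else 0.0


# Hypothetical ratings of the same three items.
same_trend = pearson([1, 2, 3], [2, 4, 6])      # ratings rise together
opposite_trend = pearson([1, 2, 3], [3, 2, 1])  # ratings move in opposite directions
```

Unlike plain cosine similarity, correlation is not affected by one rater being consistently more generous than another.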
Cosine-based similarity computation
   takes the dot product of two normalized vectors

   to learn about the photos, we transpose the matrix
     a row corresponds to a photo while
      the columns (users) correspond to
      dimensions that describe the photo



   normalize the values for each of the rows
         by dividing each of the cell entries by the square root of the sum of
          the squares of entries in a particular row
The similarity between Photo 1 and Photo 2 is computed as
(0.8018 * 0.7428) + (0.5345 * 0.3714) + (0.2673 * 0.557) = 0.943
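The computation above can be reproduced with a short sketch. The raw rating rows (3, 2, 1) and (4, 2, 3) are an assumption: they are not shown in this transcript, but they normalize to the values used in the formula.

```python
import math


def normalize(row):
    """Divide each entry by the square root of the sum of squares of the row."""
    length = math.sqrt(sum(v * v for v in row))
    return [v / length for v in row]


# Assumed raw ratings per photo; one entry per user.
photo1 = normalize([3, 2, 1])  # approx. 0.8018, 0.5345, 0.2673
photo2 = normalize([4, 2, 3])  # approx. 0.7428, 0.3714, 0.5571

# Cosine similarity is the dot product of the normalized rows.
similarity = sum(a * b for a, b in zip(photo1, photo2))
```

Rounded to three decimals, `similarity` matches the 0.943 on the slide.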
 Item-to-item similarity table

 What are the set of related items for a given item?
    According to the table, Photo1 and Photo2 are very similar.

 To determine similar users,
    associated with each user is a vector, where the rating associated
     with each item corresponds to a dimension in the vector
    analysis process is similar to calculating the item-to-item similarity
     table

 User-to-user similarity table
Intelligence from other forms of user
interactions
 How do other forms of user interaction get transformed into
  metadata?

 Approaches
    content-based and
    collaboration-based
Content-based approach

 metadata is associated with each item

 This term vector could be created by analyzing the content of
  the item or using tagging information by users

 The term vector consists of keywords or tags with a relative
  weight associated with each term.
     As the user saves content, visits content, or writes
      recommendations, she inherits the metadata associated with each
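A minimal sketch of a user inheriting item metadata, assuming each item already has a term vector and that weights are simply summed across the items the user interacts with. The items, terms, and weights are hypothetical, and other combination rules (for example, weighting saves more than visits) are equally plausible.

```python
from collections import defaultdict

# Hypothetical item term vectors: term -> relative weight.
item_terms = {
    "article1": {"python": 0.8, "web": 0.6},
    "article2": {"web": 1.0},
}


def build_profile(interacted_items, item_terms):
    """Accumulate the term weights of every item the user interacted with."""
    profile = defaultdict(float)
    for item in interacted_items:
        for term, weight in item_terms[item].items():
            profile[term] += weight
    return dict(profile)


# The user visited both articles, so "web" is reinforced twice.
profile = build_profile(["article1", "article2"], item_terms)
```

The resulting profile is itself a term vector, so it can be compared against any item's term vector to rank content for the user.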
Collaboration-based approach

 analysis of data collected by bookmarking, saving an item,
  recommending an item
    a sparsely populated dataset



 What are other items that have been bookmarked by other
  users who bookmarked the same articles as a specific
  user?
    When the user is John, the answer is Article 3 — Doe has
     bookmarked Article 1 and also Article 3.

 What are the related items based on the bookmarking
  patterns of the users?
Collaboration-based approach

   Here it is useful to invert the dataset:

     The users correspond to the dimensions of the vector for an article.
     Similarities between two items are measured by computing the dot
      product between them
     The normalized matrix is



     The item-to-item similarity matrix based on this data is
     LEARNING: if someone bookmarks Article 1,
      you should recommend Article 3 to the user,
      and if the user bookmarks Article 2, you should
      also recommend Article 3
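The bookmarking analysis can be sketched as follows. The bookmark data is an assumption: the slides state that Doe bookmarked Article 1 and Article 3 and imply that John bookmarked Article 1, while Jane's bookmarks are hypothetical and chosen to be consistent with the learning above.

```python
import math

# Assumed bookmarking data: user -> set of bookmarked articles.
bookmarks = {
    "John": {"Article 1"},
    "Jane": {"Article 2", "Article 3"},
    "Doe": {"Article 1", "Article 3"},
}

users = sorted(bookmarks)
articles = sorted({a for marked in bookmarks.values() for a in marked})


def item_vector(article):
    """Normalized vector for an article; one dimension per user."""
    raw = [1.0 if article in bookmarks[u] else 0.0 for u in users]
    length = math.sqrt(sum(v * v for v in raw))
    return [v / length for v in raw]


vectors = {a: item_vector(a) for a in articles}


def similarity(a, b):
    """Dot product of the two normalized article vectors."""
    return sum(x * y for x, y in zip(vectors[a], vectors[b]))


def recommend(article):
    """Return the other article most similar to `article`."""
    others = [a for a in articles if a != article]
    return max(others, key=lambda a: similarity(article, a))
```

With this data, `recommend("Article 1")` and `recommend("Article 2")` both return Article 3.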
 A similar analysis can be performed by using
  information from the items the user
    saves,
    purchases, and
    recommends.

 You can further refine your analysis by
  associating
    data only from users that are similar to a user based on
     user-profile information.
Summary

 Metadata associated with users and items can be used
  to derive intelligence in the form of building
  recommendation engines and predictive models for
  personalization, and for enhancing search.
References

   S. Alag, Collective Intelligence in Action, Manning, 2009

   H. Marmanis, D. Babenko, Algorithms of the Intelligent Web,
    Manning, 2009

   T. Segaran, Programming Collective Intelligence: Building Smart Web
    2.0 Applications, O’Reilly

   J. Wang, A. P. de Vries, M. J. T. Reinders, Unifying User-based
    and Item-based Collaborative Filtering Approaches by Similarity
    Fusion, 2006.
    http://ict.ewi.tudelft.nl/pub/jun/sigir06_similarityfuson.pdf
Thank you!
      Rajendra Akerkar
Vestlandsforsking, Sogndal, NORWAY

     E-mail: rak@vestforsk.no

  URL: www.tmrfindia.org/ra.html

Contenu connexe

Tendances

Riding The Semantic Wave
Riding The Semantic WaveRiding The Semantic Wave
Riding The Semantic Wave
Kaniska Mandal
 
FIND MY VENUE: Content & Review Based Location Recommendation System
FIND MY VENUE: Content & Review Based Location Recommendation SystemFIND MY VENUE: Content & Review Based Location Recommendation System
FIND MY VENUE: Content & Review Based Location Recommendation System
IJTET Journal
 
Crawling Big Data in a New Frontier for Socioeconomic Research: Testing with ...
Crawling Big Data in a New Frontier for Socioeconomic Research: Testing with ...Crawling Big Data in a New Frontier for Socioeconomic Research: Testing with ...
Crawling Big Data in a New Frontier for Socioeconomic Research: Testing with ...
BO TRUE ACTIVITIES SL
 
Social Group Recommendation based on Big Data
Social Group Recommendation based on Big DataSocial Group Recommendation based on Big Data
Social Group Recommendation based on Big Data
ijtsrd
 
The application of data mining to recommender systems
The application of data mining to recommender systems The application of data mining to recommender systems
The application of data mining to recommender systems
sunsine123
 

Tendances (20)

Leveraging social media for training object detectors
Leveraging social media for training object detectorsLeveraging social media for training object detectors
Leveraging social media for training object detectors
 
Human Vs Digital Search
Human Vs Digital SearchHuman Vs Digital Search
Human Vs Digital Search
 
Ac02411221125
Ac02411221125Ac02411221125
Ac02411221125
 
FACILITATING VIDEO SOCIAL MEDIA SEARCH USING SOCIAL-DRIVEN TAGS COMPUTING
FACILITATING VIDEO SOCIAL MEDIA SEARCH USING SOCIAL-DRIVEN TAGS COMPUTINGFACILITATING VIDEO SOCIAL MEDIA SEARCH USING SOCIAL-DRIVEN TAGS COMPUTING
FACILITATING VIDEO SOCIAL MEDIA SEARCH USING SOCIAL-DRIVEN TAGS COMPUTING
 
Following the user’s interests in mobile context aware recommender systems
Following the user’s interests in mobile context aware recommender systemsFollowing the user’s interests in mobile context aware recommender systems
Following the user’s interests in mobile context aware recommender systems
 
Riding The Semantic Wave
Riding The Semantic WaveRiding The Semantic Wave
Riding The Semantic Wave
 
Taming digital traces for informal learning dhaval
Taming digital traces for informal learning  dhavalTaming digital traces for informal learning  dhaval
Taming digital traces for informal learning dhaval
 
FIND MY VENUE: Content & Review Based Location Recommendation System
FIND MY VENUE: Content & Review Based Location Recommendation SystemFIND MY VENUE: Content & Review Based Location Recommendation System
FIND MY VENUE: Content & Review Based Location Recommendation System
 
Embracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital LibrariesEmbracing Social Software And Semantic Web In Digital Libraries
Embracing Social Software And Semantic Web In Digital Libraries
 
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.comHABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
HABIB FIGA GUYE {BULE HORA UNIVERSITY}(habibifiga@gmail.com
 
Projection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional DatasetsProjection Multi Scale Hashing Keyword Search in Multidimensional Datasets
Projection Multi Scale Hashing Keyword Search in Multidimensional Datasets
 
ASK - LOST 2.0: A Web - based Tool for Social Tagging of Digital Educational ...
ASK - LOST 2.0: A Web - based Tool for Social Tagging of Digital Educational ...ASK - LOST 2.0: A Web - based Tool for Social Tagging of Digital Educational ...
ASK - LOST 2.0: A Web - based Tool for Social Tagging of Digital Educational ...
 
Crawling Big Data in a New Frontier for Socioeconomic Research: Testing with ...
Crawling Big Data in a New Frontier for Socioeconomic Research: Testing with ...Crawling Big Data in a New Frontier for Socioeconomic Research: Testing with ...
Crawling Big Data in a New Frontier for Socioeconomic Research: Testing with ...
 
Semantics-aware Content-based Recommender Systems
Semantics-aware Content-based Recommender SystemsSemantics-aware Content-based Recommender Systems
Semantics-aware Content-based Recommender Systems
 
Social Group Recommendation based on Big Data
Social Group Recommendation based on Big DataSocial Group Recommendation based on Big Data
Social Group Recommendation based on Big Data
 
The application of data mining to recommender systems
The application of data mining to recommender systems The application of data mining to recommender systems
The application of data mining to recommender systems
 
Eavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging SiteEavesdropping on the Twitter Microblogging Site
Eavesdropping on the Twitter Microblogging Site
 
CS6010 Social Network Analysis Unit IV
CS6010 Social Network Analysis Unit IVCS6010 Social Network Analysis Unit IV
CS6010 Social Network Analysis Unit IV
 
IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social M...
IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social M...IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social M...
IRJET - Socirank Identifying and Ranking Prevalent News Topics using Social M...
 
Introductionto agents
Introductionto agentsIntroductionto agents
Introductionto agents
 

En vedette

GraphConnect Europe 2016 - Creating the Best Teams Ever with Collaborative Fi...
GraphConnect Europe 2016 - Creating the Best Teams Ever with Collaborative Fi...GraphConnect Europe 2016 - Creating the Best Teams Ever with Collaborative Fi...
GraphConnect Europe 2016 - Creating the Best Teams Ever with Collaborative Fi...
Neo4j
 
Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011
Ernesto Mislej
 
Lecture 7: How to STUDY the Social Web? (2014)
Lecture 7: How to STUDY the Social Web? (2014)Lecture 7: How to STUDY the Social Web? (2014)
Lecture 7: How to STUDY the Social Web? (2014)
Lora Aroyo
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
mozgkarakaya
 

En vedette (10)

Recommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative FilteringRecommender Systems: Advances in Collaborative Filtering
Recommender Systems: Advances in Collaborative Filtering
 
Active Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a SurveyActive Learning in Collaborative Filtering Recommender Systems : a Survey
Active Learning in Collaborative Filtering Recommender Systems : a Survey
 
GraphConnect Europe 2016 - Creating the Best Teams Ever with Collaborative Fi...
GraphConnect Europe 2016 - Creating the Best Teams Ever with Collaborative Fi...GraphConnect Europe 2016 - Creating the Best Teams Ever with Collaborative Fi...
GraphConnect Europe 2016 - Creating the Best Teams Ever with Collaborative Fi...
 
Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011Recommender Systems! @ASAI 2011
Recommender Systems! @ASAI 2011
 
Cold-Start Management with Cross-Domain Collaborative Filtering and Tags
Cold-Start Management with Cross-Domain Collaborative Filtering and TagsCold-Start Management with Cross-Domain Collaborative Filtering and Tags
Cold-Start Management with Cross-Domain Collaborative Filtering and Tags
 
Lecture 7: How to STUDY the Social Web? (2014)
Lecture 7: How to STUDY the Social Web? (2014)Lecture 7: How to STUDY the Social Web? (2014)
Lecture 7: How to STUDY the Social Web? (2014)
 
Lecture 5: Personalization on the Social Web (2014)
Lecture 5: Personalization on the Social Web (2014)Lecture 5: Personalization on the Social Web (2014)
Lecture 5: Personalization on the Social Web (2014)
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
 
Collaborative filtering
Collaborative filteringCollaborative filtering
Collaborative filtering
 
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)Recommender Systems (Machine Learning Summer School 2014 @ CMU)
Recommender Systems (Machine Learning Summer School 2014 @ CMU)
 

Similaire à Harvesting Intelligence from User Interactions

Digital Trails Dave King 1 5 10 Part 2 D3
Digital Trails   Dave King   1 5 10   Part 2   D3Digital Trails   Dave King   1 5 10   Part 2   D3
Digital Trails Dave King 1 5 10 Part 2 D3
Dave King
 
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Editor IJAIEM
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
vivatechijri
 

Similaire à Harvesting Intelligence from User Interactions (20)

Digital Trails Dave King 1 5 10 Part 2 D3
Digital Trails   Dave King   1 5 10   Part 2   D3Digital Trails   Dave King   1 5 10   Part 2   D3
Digital Trails Dave King 1 5 10 Part 2 D3
 
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
Web 2.0 Collective Intelligence - How to use collective intelligence techniqu...
 
The Web and the Collective Intelligence - How to use Collective Intelligence ...
The Web and the Collective Intelligence - How to use Collective Intelligence ...The Web and the Collective Intelligence - How to use Collective Intelligence ...
The Web and the Collective Intelligence - How to use Collective Intelligence ...
 
Social search
Social searchSocial search
Social search
 
M045067275
M045067275M045067275
M045067275
 
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
Unification Algorithm in Hefty Iterative Multi-tier Classifiers for Gigantic ...
 
WORD
WORDWORD
WORD
 
Network Awareness Tool - Learning Analytics in the workplace: 
Detecting and ...
Network Awareness Tool - Learning Analytics in the workplace: 
Detecting and ...Network Awareness Tool - Learning Analytics in the workplace: 
Detecting and ...
Network Awareness Tool - Learning Analytics in the workplace: 
Detecting and ...
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Social and organizational perspective in HCI
Social and organizational perspective in HCISocial and organizational perspective in HCI
Social and organizational perspective in HCI
 
A Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media DataA Review: Text Classification on Social Media Data
A Review: Text Classification on Social Media Data
 
O017148084
O017148084O017148084
O017148084
 
Al26234241
Al26234241Al26234241
Al26234241
 
Analysis on Recommended System for Web Information Retrieval Using HMM
Analysis on Recommended System for Web Information Retrieval Using HMMAnalysis on Recommended System for Web Information Retrieval Using HMM
Analysis on Recommended System for Web Information Retrieval Using HMM
 
Recommendation System Using Social Networking
Recommendation System Using Social Networking Recommendation System Using Social Networking
Recommendation System Using Social Networking
 
IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...
IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...
IRJET- Analysis on Existing Methodologies of User Service Rating Prediction S...
 
CHAPTER -12 it.pptx
CHAPTER -12 it.pptxCHAPTER -12 it.pptx
CHAPTER -12 it.pptx
 
Review and analysis of machine learning and soft computing approaches for use...
Review and analysis of machine learning and soft computing approaches for use...Review and analysis of machine learning and soft computing approaches for use...
Review and analysis of machine learning and soft computing approaches for use...
 
Session, focus and engagement
Session, focus and engagementSession, focus and engagement
Session, focus and engagement
 
IJSRED-V2I2P09
IJSRED-V2I2P09IJSRED-V2I2P09
IJSRED-V2I2P09
 

Harvesting Intelligence from User Interactions

  • 1. HARVESTING INTELLIGENCE FROM USER INTERACTIONS Rajendra Akerkar
  • 2. Outline
      - What is collective intelligence?
      - Basic technical concepts behind collective intelligence
      - Many forms of user interaction
      - Example of how user interaction is converted into intelligence
  • 3. Web users are undergoing a transformation…
      - Users are expressing themselves. This expression may take the form of:
          - sharing their opinions on a product or a service through reviews or comments;
          - sharing and tagging content;
          - participating in an online community; or
          - contributing new content.
      - This increased user interaction and participation gives rise to data that can be converted into intelligence in your application. Using that intelligence to personalize a site for a user, to aid them in searching and making decisions, and to make the application more sticky are cherished goals that web applications try to fulfill.
  • 4. Wisdom of the Crowd
      - “Under the right circumstances, groups are extraordinarily intelligent, and are often smarter than the smartest people in them.”
      - “If the process is sound, the more people you involve in solving a problem, the better the result will be.”
      - A crowd’s collective intelligence will produce better results than those of a small group of experts if four basic conditions are met.
  • 5. “Wise crowds” are valuable when…
      - the individuals in them have diverse opinions;
      - the individuals aren’t afraid to express their opinions;
      - there’s diversity in the crowd; and
      - there’s a way to aggregate all the information and use it in the decision-making process.
  • 6. Collective intelligence
      - To effectively use the information provided by others to improve one’s application.
      - When a group of individuals collaborate or compete with each other, intelligence or behavior that otherwise didn’t exist suddenly emerges.
  • 7. A user may be influenced by other users either directly or through intelligence derived from the applications by mining the data.
  • 8. Collective intelligence of users is
      - The intelligence that’s extracted from the collective set of interactions and contributions made by users.
      - The use of this intelligence to act as a filter for what’s valuable in your application for a user. This filter takes into account a user’s preferences and interactions to provide relevant information to the user.
      - There are a huge number of ways this information can be processed and interpreted.
  • 9. To apply collective intelligence in your application, you need to:
      1. allow users to interact with your site and with each other, learning about each user through their interactions and contributions;
      2. aggregate what you learn about your users and their contributions using some useful models;
      3. leverage those models to recommend relevant content to a user.
  • 10. Three components to harnessing intelligence:
      1. Allow users to interact.
      2. Learn about your users in aggregate.
      3. Personalise content using user interaction data and aggregate data.
  • 11. Sources of information
      - Content-based: based on information about the item itself, usually keywords or phrases occurring in the item.
      - Collaborative-based: based on the interactions of users.
  • 12. Algorithms for applying intelligence
      - They correlate users with content and with each other.
      - They need a common language to compute relevance between items, between users, and between users and items.
      - Content-based relevance is anchored in the content itself (information retrieval systems).
      - Collaborative-based relevance leverages the user interaction data to detect meaningful relationships.
      - Unstructured text: to understand how metadata can be developed from unstructured text.
  • 13. Abstracting types of content
      - Applications: users and items
      - Items? In social networking, a user is also a type of item.
      - Metadata: professionally developed keywords, user-generated tags, keywords extracted by an algorithm after analyzing the text, ratings, popularity ranking, etc.
      - Profile-based and user-action-based data
      - Metadata as a set of attributes that help qualify an item.
  • 14. Sources for generating metadata about an item
      - Users and items each have an associated vector of metadata attributes.
      - The similarity or relevance between two users, two items, or a user and an item is measured by looking at the similarity between the two vectors.
  • 16. Users provide a rich set of information
  • 17. Generating intelligence
      - Two approaches: content-based analysis and collaborative filtering (CF)
      - Content-based analysis: to build a representation for the content
          - Terms or phrases
          - Terms are converted into their basic form by a process known as stemming. Terms with their associated weights (term vectors) then represent the metadata associated with the text. Similarity between two content items is measured by the similarity of their term vectors.
      - Collaborative filtering: to use the information provided by the interactions of users to predict items of interest for a user
          - Match a user’s metadata to that of other similar users and recommend items liked by them (language-independent methods).
          - E.g. users rate items, so a CF approach finds patterns in the way items have been rated by the user and other users to find additional items of interest for a user.
          - Examples: Amazon, Netflix, and Google
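The content-based half of this slide can be sketched as follows. This is a minimal illustration, not the deck's implementation: `crude_stem` is a hypothetical stand-in for a real stemmer (such as Porter's), the weights are plain normalized term frequencies, and the three sample texts are invented.

```python
import math
import re
from collections import Counter

def crude_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few common suffixes."""
    for suffix in ("ing", "ed", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def term_vector(text):
    """Map each stemmed term to a weight (here: unit-normalized term frequency)."""
    terms = [crude_stem(t) for t in re.findall(r"[a-z]+", text.lower())]
    counts = Counter(terms)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {term: c / norm for term, c in counts.items()}

def similarity(v1, v2):
    """Dot product of two unit-length term vectors, i.e. their cosine similarity."""
    return sum(weight * v2.get(term, 0.0) for term, weight in v1.items())

# Invented sample texts, used only to exercise the functions.
doc_a = term_vector("Users rate photos and tag photos")
doc_b = term_vector("A user rated a photo")
doc_c = term_vector("Linear algebra and clustering")

print(similarity(doc_a, doc_b))  # shares the stems "user", "rat", "photo"
print(similarity(doc_a, doc_c))  # shares only the stopword "and"
```

A production system would typically also drop stopwords and use a weighting such as TF-IDF; both are left out here to keep the sketch short.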
  • 18. Collaborative filtering
      - Two families: memory-based and model-based
          - Memory-based: a similarity measure is used to find similar users, and a prediction is then made using a weighted average of the ratings of the similar users.
          - Model-based: a model for prediction is built using a variety of approaches: linear algebra, probabilistic methods, neural networks, clustering, latent classes, and so on.
      - A collaborative filtering algorithm usually works by searching a large group of people and finding a smaller set with tastes similar to yours.
      - Topics: Collecting Preferences, Recommending Items, Matching Products, Item-Based Filtering
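A memory-based prediction of the kind described on this slide might look like the sketch below. The users, items, and ratings are hypothetical, and cosine similarity over co-rated items is only one possible choice of similarity measure.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"item1": 5, "item2": 3, "item3": 4},
    "bob":   {"item1": 4, "item2": 2, "item3": 5, "item4": 4},
    "carol": {"item1": 1, "item2": 5, "item4": 2},
}

def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in common))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None

# alice has not rated item4; bob (more similar to her) rated it 4, carol rated it 2.
prediction = predict("alice", "item4")
print(prediction)
```

Because bob's ratings are closer to alice's than carol's are, the prediction lands nearer to bob's rating than to carol's.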
  • 19. Harnessing Collective Intelligence to transform from content-centric to user-centric applications
      - Prior to the user-centric revolution, many applications put little emphasis on the user. These applications, known as content-centric applications, focused on the best way to present the content and were generally static from user to user and from day to day.
      - User-centric applications leverage Collective Intelligence to fundamentally change how the user interacts with the Web application.
      - User-centric applications make the user the center of the web experience and dynamically reshuffle the content based on what’s known about the user and what the user explicitly asks for.
  • 20. User-centric applications are composed of the following four components
      - Core competency: The main reason why a user comes to the application.
      - Community: Connecting users with other users of interest, social networking, finding other users who may provide answers to a user’s questions.
      - Leveraging user-generated content: Incorporating generated content and interactions of users to provide additional content to users.
      - Building a marketplace: Monetizing the application by product and/or service placements and showing relevant advertisements.
  • 21.
  • 22. Classifying User – Generated Information
  • 23. Concept of a dataset
      - Densely populated dataset
          - It has more rows than columns.
          - The dataset is richly populated.
      - Clustering and building a predictive model
      - E.g. grouping similar users according to age and/or sex might give a good predictor of the number of minutes a user will spend on the site.
          - Age is a good predictor.
          - The number of minutes spent decreases as age increases.
          - A simple linear model:
          - minutes spent = 50 - age of user
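To make the linear model on this slide concrete, the sketch below encodes the slide's formula and then recovers its coefficients with an ordinary least-squares fit. The ages are invented sample points generated from the formula itself, so the fit is exact; real usage data would be noisy.

```python
def predicted_minutes(age):
    """The slide's illustrative model: minutes spent = 50 - age of user."""
    return 50 - age

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical sample: ages of five users and the minutes the model assigns them.
ages = [20, 25, 30, 35, 40]
minutes = [predicted_minutes(age) for age in ages]

intercept, slope = fit_line(ages, minutes)
print(intercept, slope)
```

The negative slope is what the slide means by older users spending less time on the site.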
  • 24. Concept of a dataset
      - The set of users who viewed any of the videos on our site within the timeframe
      - High-dimensional, sparsely populated
      - A generalization of the term vector representation
      - This representation is useful for finding similar users and is known as the User-Item matrix.
          - Users are represented as rows.
          - The videos are represented as columns.
      - Properties: more rows than columns, richly populated
  • 25. The inverted dataset
      - Users are represented as columns, the videos as rows.
      - Answers: users who have viewed this video have also viewed these other videos.
      - Properties
          - The number of columns is large.
          - Sparsely populated, with nonzero entries in a few columns
          - Multidimensional vector
  • 26. Forms of user interaction
      - We need to quantify the quality of the interaction.
      - Rating and voting interaction
          - is explicit in the user’s intent;
          - is a way of getting feedback on how well the user liked the item;
          - is quantifiable and can be used directly.
      - Voting is similar to rating. However, a vote can have only two values: 1 for a positive vote and -1 for a negative vote.
      - Interactions such as clicks are noisy: the intent of the user isn’t known perfectly and is implicit.
  • 27. Persistence of ratings
      - Entities: User and Items
      - user_item_rating is a mapping table that has a composite key, consisting of the user ID and content ID.
      - The cardinality between the entities shows that:
          - Each user may rate 0 or more items.
          - Each rating is associated with only one user.
          - An item may contain 0 or more ratings.
          - Each rating is associated with only one item.
      - What are the top 10 rated items?
      - Example: digg.com allows users to contribute and vote on interesting articles.
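The schema described on this slide could be sketched with Python's built-in `sqlite3` module as below. Only the `user_item_rating` table name and its composite key come from the slide; the `users` and `items` table names, all column names, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE user_item_rating (
        user_id INTEGER NOT NULL REFERENCES users(user_id),
        item_id INTEGER NOT NULL REFERENCES items(item_id),
        rating  INTEGER NOT NULL,
        PRIMARY KEY (user_id, item_id)  -- composite key: one rating per user per item
    );
""")

# Hypothetical sample data; item 3 deliberately has no ratings.
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "u1"), (2, "u2"), (3, "u3")])
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
conn.executemany(
    "INSERT INTO user_item_rating VALUES (?, ?, ?)",
    [(1, 1, 3), (1, 2, 4), (2, 1, 2), (2, 2, 2), (3, 1, 1), (3, 2, 3)],
)

# "What are the top 10 rated items?" answered by average rating.
top_rated = conn.execute("""
    SELECT i.title, AVG(r.rating) AS avg_rating, COUNT(*) AS num_ratings
    FROM user_item_rating AS r
    JOIN items AS i ON i.item_id = r.item_id
    GROUP BY i.item_id
    ORDER BY avg_rating DESC
    LIMIT 10
""").fetchall()
print(top_rated)
```

Ranking purely by average favors items with very few ratings; a real Top 10 list would usually also require a minimum `num_ratings`.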
  • 28. Forwarding a link
      - Similar to voting, forwarding the content to others can be considered a positive vote for the item by the user.
  • 29. Bookmarking and saving
      - By bookmarking URLs, a user is explicitly expressing interest in the material associated with the bookmark. URLs that are commonly bookmarked bubble up higher in the site.
  • 30. Purchasing items
      - When users purchase items, they are casting an explicit vote of confidence in the item.
      - Example: Amazon (Item-to-Item recommendation engine)
      - Users that buy similar items can be correlated.
      - Items that have been bought by other users can be recommended to a user …
  • 31. Click-stream
      - Example: news.google.com personalisation
      - When a list of items is presented to a user …, an item clicked counts as a positive vote.
      - Looking at whether an item was visited and the time spent on it provides useful information.
      - You can also gather useful statistics from this data:
          - What is the average time a user spends on a particular item?
          - For a user, what is the average time spent on any given article?
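The two statistics listed on this slide can be computed from a log of click events, as in this minimal sketch. The event format `(user, item, seconds)` and every value in `events` are hypothetical.

```python
from collections import defaultdict

# Hypothetical click-stream events: (user, item, seconds spent on the item).
events = [
    ("john", "article1", 30),
    ("john", "article2", 90),
    ("jane", "article1", 60),
    ("doe", "article1", 120),
]

def average_seconds_by(events, key_index):
    """Average time spent, grouped by user (key_index=0) or by item (key_index=1)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [total seconds, number of visits]
    for event in events:
        entry = totals[event[key_index]]
        entry[0] += event[2]
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

avg_time_per_item = average_seconds_by(events, 1)
avg_time_per_user = average_seconds_by(events, 0)
print(avg_time_per_item)
print(avg_time_per_user)
```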
  • 32. Reviews
      - Opinions and tastes are often expressed through reviews and recommendations. These have the greatest impact on other users when:
          - they’re unbiased;
          - the reviews are from similar users;
          - they’re from a person of influence.
      - Just like voting for articles at Digg, other users can endorse a reviewer or vote on their reviews.
  • 33. Converting user interaction into intelligence
      - User interaction gets converted into a dataset for learning.
      - Example: three users who’ve rated photographs
      - There are a number of ways to transform raw ratings from users into intelligence.
      - The simplest: aggregate all the ratings about the item and provide the average
          - to create a Top 10 Rated Items list;
          - this keeps promoting the popular content.
  • 34. Given the set of data, we answer two questions in our example:
      - What are the set of related items for a given item?
      - For a user, who are the other users that are similar to the user?
      - Three approaches:
          - cosine-based similarity,
          - correlation-based similarity, and
          - adjusted-cosine-based similarity.
  • 35. Cosine-based similarity computation
      - Takes the dot product of two normalized vectors.
      - To learn about the photos, we transpose the matrix: a row corresponds to a photo, while the columns (users) correspond to dimensions that describe the photo.
      - Normalize the values for each of the rows by dividing each of the cell entries by the square root of the sum of the squares of the entries in that row.
      - The similarity between Photo 1 and Photo 2 is computed as (0.8018 * 0.7428) + (0.5345 * 0.3714) + (0.2673 * 0.557) = 0.943
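The computation on this slide can be reproduced in a few lines. The raw ratings below are an assumption: they are the smallest whole-number rows whose normalized values match the figures quoted on the slide (0.8018, 0.5345, 0.2673 and 0.7428, 0.3714, 0.557); the original ratings table is not part of this transcript.

```python
import math

# Assumed raw ratings per photo, one entry per user (inferred from the
# normalized values on the slide, not taken from the original table).
photo1 = [3, 2, 1]
photo2 = [4, 2, 3]

def normalize(row):
    """Divide each entry by the square root of the sum of squares of the row."""
    length = math.sqrt(sum(x * x for x in row))
    return [x / length for x in row]

def cosine_similarity(a, b):
    """Dot product of the two normalized rows."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

sim = cosine_similarity(photo1, photo2)
print([round(x, 4) for x in normalize(photo1)])  # [0.8018, 0.5345, 0.2673]
print(round(sim, 3))                             # 0.943
```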
  • 36. Item-to-item similarity table
      - What are the set of related items for a given item?
      - According to the table, Photo1 and Photo2 are very similar.
      - To determine similar users:
          - Associated with each user is a vector, where the rating associated with each item corresponds to a dimension in the vector.
          - The analysis process is similar to calculating the item-to-item similarity table.
      - User-to-user similarity table
  • 37. Intelligence from other forms of user interactions
      - How do other forms of user interaction get transformed into metadata?
      - Approaches:
          - content-based, and
          - collaboration-based
  • 38. Content-based approach
      - Metadata is associated with each item.
      - This term vector could be created by analyzing the content of the item or using tagging information by users.
      - The term vector consists of keywords or tags with a relative weight associated with each term.
      - As the user saves content, visits content, or writes recommendations, she inherits the metadata associated with each item.
  • 39. Collaboration-based approach
      - Analysis of data collected by bookmarking, saving an item, or recommending an item
      - A sparsely populated dataset
      - What are other items that have been bookmarked by other users who bookmarked the same articles as a specific user?
          - When the user is John, the answer is Article 3: Doe has bookmarked Article 1 and also Article 3.
      - What are the related items based on the bookmarking patterns of the users?
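The first question on this slide amounts to a set operation, sketched below. The bookmark table itself is not in this transcript, so the data is an assumption: John's and Doe's bookmarks follow from the slide's statement, while Jane and her bookmarks are hypothetical.

```python
# Assumed bookmark data. John and Doe follow the slide's example;
# Jane's row is hypothetical.
bookmarks = {
    "John": {"Article 1"},
    "Jane": {"Article 2", "Article 3"},
    "Doe": {"Article 1", "Article 3"},
}

def co_bookmarked(user):
    """Items bookmarked by users who share at least one bookmark with `user`."""
    mine = bookmarks[user]
    suggestions = set()
    for other, theirs in bookmarks.items():
        if other != user and mine & theirs:  # the two users overlap
            suggestions |= theirs - mine     # add what `user` has not bookmarked
    return suggestions

print(co_bookmarked("John"))  # {'Article 3'}
```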
  • 40. Collaboration-based approach
      - Here it is useful to invert the dataset: the users correspond to the dimensions of the vector for an article.
      - Similarities between two items are measured by computing the dot product between them.
      - The matrix is normalized, and the item-to-item similarity matrix is then computed from it (the two tables are not reproduced in this transcript).
      - LEARNING: if someone bookmarks Article 1, you should recommend Article 3 to the user, and if the user bookmarks Article 2, you should also recommend Article 3.
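A sketch of the inverted dataset and the resulting item-to-item similarities follows. The 0/1 bookmark vectors are assumptions chosen to be consistent with the slide's conclusion; they are not the deck's original table.

```python
import math

# Assumed inverted dataset: one row per article, one dimension per user
# (John, Jane, Doe); 1 means the user bookmarked the article.
article_vectors = {
    "Article 1": [1, 0, 1],
    "Article 2": [0, 1, 0],
    "Article 3": [0, 1, 1],
}

def normalize(row):
    """Scale a row to unit length."""
    length = math.sqrt(sum(x * x for x in row))
    return [x / length for x in row]

normalized = {name: normalize(row) for name, row in article_vectors.items()}

# Item-to-item similarity: dot product of the normalized rows.
similarity = {
    (a, b): sum(x * y for x, y in zip(normalized[a], normalized[b]))
    for a in normalized
    for b in normalized
    if a != b
}

def most_similar(article):
    """The other article with the highest similarity to `article`."""
    others = (b for b in normalized if b != article)
    return max(others, key=lambda b: similarity[(article, b)])

print(most_similar("Article 1"))  # Article 3
print(most_similar("Article 2"))  # Article 3
```

With this data, Article 3 is the closest neighbor of both Article 1 and Article 2, which matches the recommendation rule stated on the slide.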
  • 41. A similar analysis can be performed by using information from the items the user
      - saves,
      - purchases, and
      - recommends.
    You can further refine your analysis by associating data only from users that are similar to a user based on user-profile information.
  • 42. Summary
      - Metadata associated with users and items can be used to derive intelligence in the form of building recommendation engines and predictive models for personalization, and for enhancing search.
  • 43. References
      - S. Alag, Collective Intelligence in Action, Manning, 2009
      - H. Marmanis, D. Babenko, Algorithms of the Intelligent Web, Manning, 2009
      - T. Segaran, Programming Collective Intelligence: Building Smart Web 2.0 Applications, O’Reilly
      - Wang, Jun, Arjen P. de Vries, and Marcel J.T. Reinders. Unifying User-based and Item-based Collaborative Filtering Approaches by Similarity Fusion. 2006. http://ict.ewi.tudelft.nl/pub/jun/sigir06_similarityfuson.pdf
  • 44. Thank you! Rajendra Akerkar, Vestlandsforsking, Sogndal, NORWAY. E-mail: rak@vestforsk.no URL: www.tmrfindia.org/ra.html