DIGITAL
Institute for Information and Communication Technologies

Pragmatic metadata matters:
How data about the usage of data affects semantic user models
Claudia Wagner, Markus Strohmaier, Yulan He

Sunday, October 23, 2011
Example
                           Semantic Metadata

                   [Figure: RDF graph linking a sioc:Post, a sioc:UserAccount and a foaf:Person via rdf:type, sioc:content, sioc:name, sioc:has_creator and sioc:account_of]

Example
                           Pragmatic Metadata




Aim
                   Can pragmatic metadata support the generation of semantic
                   metadata and, if so, how?

                   [Figure: the RDF graph from the semantic metadata example (sioc:Post, sioc:UserAccount, foaf:Person), extended with two properties whose values are unknown (marked "?"): sioc:topic and foaf:interest]

Experimental Setup
                   § Methodology
                      § Topic Modeling Algorithms to learn topics (probability
                         distributions of words) and annotate users and posts
                         with topics
                      § Incorporated different types of pragmatic metadata
                         into the Topic Models
                      § Compared different models via their predictive
                         performance

                   § Dataset
                      § Boards.ie
                         § Forums, Posts and Users
                         § Users' authoring and replying behavior
                      § Training dataset: first and last week of February 2006
                      § Test dataset: 3 future posts of each user
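
A split of this kind could be sketched as follows. This is a minimal sketch, not the authors' code: the `Post` record, the `cutoff` date, treating everything up to the cutoff as training data, and taking each user's three earliest posts after the cutoff as test data are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from typing import NamedTuple


class Post(NamedTuple):
    # Hypothetical record layout; the real Boards.ie schema may differ.
    user: str
    created: datetime
    text: str


def split_posts(posts, cutoff, n_future=3):
    """Return (training posts, {user: up to n_future future posts})."""
    train = [p for p in posts if p.created <= cutoff]
    future = defaultdict(list)
    # Walk through posts in time order so the earliest future posts win.
    for p in sorted(posts, key=lambda p: p.created):
        if p.created > cutoff and len(future[p.user]) < n_future:
            future[p.user].append(p)
    return train, dict(future)


posts = [
    Post("u1", datetime(2006, 2, 3), "new imac arrived"),
    Post("u1", datetime(2006, 3, 1), "mac or pc"),
    Post("u2", datetime(2006, 2, 27), "match tonight"),
    Post("u1", datetime(2006, 3, 5), "pc build"),
]
train, test = split_posts(posts, cutoff=datetime(2006, 2, 28))
```

Users without any post after the cutoff (here `u2`) simply do not appear in the test set.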


Evaluation

                   § Compare different models by testing their predictive
                      performance on held-out posts.

                   [Formula: sum, over all words in a user's future post, of the log-likelihood of the word given the learned model]

                   § Assumption: a better user topic model is less perplexed
                      by future posts authored by a user and needs
                      fewer training samples.
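
The measure described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' evaluation code: `theta_u` (a user's topic distribution), `phi` (one word-probability dictionary per topic) and the `floor` for unseen words are hypothetical names, and perplexity is computed with the usual definition (exponential of the negative average per-word log-likelihood), which the slide does not spell out.

```python
import math


def word_prob(word, theta_u, phi):
    """p(word | user) = sum over topics of p(topic | user) * p(word | topic)."""
    return sum(theta_u[t] * phi[t].get(word, 0.0) for t in range(len(theta_u)))


def log_likelihood(post, theta_u, phi, floor=1e-12):
    """Sum of log p(word | user) over all words of one held-out post."""
    return sum(math.log(max(word_prob(w, theta_u, phi), floor)) for w in post)


def perplexity(posts, theta_u, phi):
    """exp of the negative average per-word log-likelihood; lower is better."""
    n_words = sum(len(post) for post in posts)
    total = sum(log_likelihood(post, theta_u, phi) for post in posts)
    return math.exp(-total / n_words)


# Toy topics: one about computers, one about football.
phi = [
    {"mac": 0.5, "pc": 0.5},
    {"goal": 0.5, "match": 0.5},
]
theta_good = [0.9, 0.1]  # user model that expects computer talk
theta_bad = [0.1, 0.9]   # user model that expects football talk
future_posts = [["mac", "pc", "mac"]]
```

On the toy data, the model that matches the user's future post (`theta_good`) yields the lower perplexity, which is the sense in which a better user model is "less perplexed".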


Methodology
                   § How to learn topics and annotate users with topics?

                   § Latent Dirichlet Allocation (LDA) (Blei et al., 2003)

                   [Figure: LDA maps text to topics T1, T2, T3; each topic is a probability distribution over words, e.g. T1: mac: 0.3, iMac: 0.13, PC: 0.03, computer: 0.04, ...]
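
To make the idea of learning topics concrete, here is a toy LDA trainer. It is only a sketch: it uses collapsed Gibbs sampling, which is a swapped-in inference method (Blei et al. describe variational inference, and the slides do not say which one was used), and the hyperparameters `alpha` and `beta`, the iteration count and the toy corpus are assumptions.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA.

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (theta, phi): per-document topic distributions and
    per-topic word distributions.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            t = rng.randrange(num_topics)
            z_d.append(t)
            doc_topic[d][t] += 1
            topic_word[t][w] += 1
            topic_total[t] += 1
        assign.append(z_d)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][t] -= 1
                topic_word[t][w] -= 1
                topic_total[t] -= 1
                # Conditional p(z = k | all other assignments), unnormalised.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                t = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = t
                doc_topic[d][t] += 1
                topic_word[t][w] += 1
                topic_total[t] += 1
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n + beta) / (topic_total[t] + vocab_size * beta)
         for n in topic_word[t]]
        for t in range(num_topics)
    ]
    return theta, phi


vocab = ["mac", "imac", "pc", "computer", "goal", "match"]
docs = [[0, 1, 2, 3, 0], [0, 3, 2, 1], [4, 5, 4, 5], [5, 4, 4]]
theta, phi = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab))
```

Each row of `phi` is a topic in the sense of the slide (a probability distribution over words), and each row of `theta` annotates one document with topics.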




Methodology
                   § How to incorporate metadata into topic models?

                   § Dirichlet Multinomial Regression (DMR) Topic Models
                      (Mimno et al., 2008)

                   § Observe feature vector x per document
                      § Draw a "fresh" alpha for each document, which depends
                         on the observed features x and the feature distribution per
                         topic λ_t

                         $\alpha_{dt} = \exp(x_d^{\top} \lambda_t)$
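
The document-specific prior can be computed directly from this definition. The sketch below is illustrative only: the feature layout (an intercept plus one indicator per author) and the weight values are hypothetical, and in DMR the weights λ_t are learned rather than set by hand.

```python
import math


def dmr_alpha(x_d, lambdas):
    """alpha_dt = exp(x_d . lambda_t) for every topic t."""
    return [
        math.exp(sum(x * lam for x, lam in zip(x_d, lambda_t)))
        for lambda_t in lambdas
    ]


# Hypothetical features: [intercept, authored_by_u1, authored_by_u2]
x_post = [1.0, 1.0, 0.0]
# Hypothetical per-topic weights (learned in a real DMR model).
lambdas = [
    [0.0, 1.0, -1.0],  # topic 0: favoured when u1 is the author
    [0.0, -1.0, 1.0],  # topic 1: favoured when u2 is the author
]
alpha = dmr_alpha(x_post, lambdas)
```

With all weights at zero every topic gets alpha = 1, so the metadata only changes the prior where a feature carries a non-zero weight.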

Methodology

    ID       Alg           Doc    Metadata
    M1       LDA           Post   -
    M2       LDA           User   -
    M3       DMR           Post   author
    M4       DMR           User   author
    M5       DMR           Post   reply-user
    M6       DMR           User   reply-user
    M7       DMR           Post   related-user
    M8       DMR           User   related-user

    [Figure: example with User 1 and User 2, "authored" links to Posts 1 to 6 (Past) and a "replies to" link; Post 7 lies in the Future]
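
The "Doc" column distinguishes two ways of forming training documents. The sketch below assumes that the User scheme pools all posts authored by a user into one document, which the slides suggest but do not state; the function name and the `(author, tokens)` input format are hypothetical.

```python
from collections import defaultdict


def build_documents(posts, scheme):
    """Form training documents from (author, tokens) pairs.

    scheme "post": every post is its own document.
    scheme "user": all posts of one author are pooled into one document.
    """
    if scheme == "post":
        return [list(tokens) for _, tokens in posts]
    if scheme == "user":
        pooled = defaultdict(list)
        for author, tokens in posts:
            pooled[author].extend(tokens)
        # Sort authors so the document order is deterministic.
        return [pooled[author] for author in sorted(pooled)]
    raise ValueError(f"unknown scheme: {scheme!r}")


posts = [
    ("u1", ["mac", "imac"]),
    ("u2", ["goal"]),
    ("u1", ["pc"]),
]
post_docs = build_documents(posts, "post")
user_docs = build_documents(posts, "user")
```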


                   § Different user activities performed on content

                   [Figure annotations: Post training scheme (M3, M5 and M7); Baseline LDA (M1 and M2); models which take user replies into account (M6 and M8)]


Results
                   § The topics of users who reply to a user are also likely for
                      this user
                   § Therefore, if two users get replies from the same users,
                      then they are more likely to talk about the same topics

                   § Topic models which incorporate pragmatic metadata per
                      user can indeed improve the models' predictive performance
                   § Topic models which incorporate pragmatic metadata per
                      post often over-fit the data
                      § Model assumptions are too strict!

                   § Idea: Incorporate behavioral user similarities
                   § Intuition: users who are similar are more likely to talk
                      about the same topics
                   § How to measure behavioral similarity?
                      § forum usage
                      § communication behavior

Methodology
     ID      Alg      Doc     Metadata
     M9      DMR      Post    top 10 forums
     M10     DMR      User    top 10 forums
     M11     DMR      Post    top 10 communication partners
     M12     DMR      User    top 10 communication partners

     [Figure: User 1 and User 2 with "authored" links to Posts 1 to 6 (Past) and Post 7 in the Future; each user is described by ten features (User 1: f1 to f10; User 2: f15, f20, f31, f12, f5, f6, f17, f18, f19, f10)]
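
One way to turn "top 10 forums" or "top 10 communication partners" into a feature vector for DMR is sketched below. The binary indicator encoding, the function name and the toy data are assumptions; the slides only name the metadata, not how it was encoded.

```python
from collections import Counter


def top_k_features(items, vocabulary, k=10):
    """Binary vector over `vocabulary`: 1.0 for the user's k most frequent items."""
    top = {item for item, _ in Counter(items).most_common(k)}
    return [1.0 if v in top else 0.0 for v in vocabulary]


# Forums in which a hypothetical user posted, one entry per post.
forums = ["f1", "f2", "f3"]
user1_posts_in = ["f1", "f1", "f2", "f1", "f3", "f2"]
x_user1 = top_k_features(user1_posts_in, forums, k=2)
```

The same function works for communication partners if `items` lists the users someone replied to. Two users who share many top forums or partners end up with similar feature vectors, and therefore with similar topic priors.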


                   [Figure annotations: Post training scheme (M3, M9 and M11); Baseline LDA (M1 and M2); User training scheme (M4, M10 and M12); model M12 incorporates user similarities based on their communication behavior]


Results
                   § Topic models seem to benefit from taking behavioral
                      user similarities into account

                   § Users who behave similarly (regarding their forum usage
                      and communication behavior) are likely to talk about the
                      same topics

                   § Common communication partners seem to be more
                      predictive of common topics than common forums


Conclusions
                   § Pragmatic metadata may help to learn better semantic
                      user models

                   § But models using pragmatic metadata observed on a post
                      level often over-fit the data

                   § Pragmatic metadata on a user level seems to improve
                      the predictive performance of topic models
                      § If posts of two users are "used" in a similar way, then
                         the users are more likely to talk about the same topics
                      § If two users behave similarly (tend to post to the same forums
                         or tend to talk to the same users), they are more likely to
                         talk about the same topics
                      § Common communication partners seem to be more
                         predictive of common topics than common forums


Limitations and Future Work
                   § Perplexity and semantic interpretability of topics do not
                      necessarily correlate (Chang et al., 2009)
                   § Separate evaluation of semantic coherence of topics


                   § Analyze different types of behavior- and usage-related
                      metadata and explore to what extent they may reveal
                      information about the semantics of data
                      § behavior on social streams such as Twitter
                      § tagging behavior
                      § navigation behavior


References
                   §   Blei, D.M., Ng, A. and Jordan, M. Latent Dirichlet Allocation. JMLR 3 (2003), pp. 993-1022

                   §   Chang, J., Boyd-Graber, J., Gerrish, S., Wang, C. and Blei, D. Reading Tea
                        Leaves: How Humans Interpret Topic Models. In Neural Information Processing
                        Systems, NIPS (2009)

                   §   Mimno, D.M. and McCallum, A. Topic Models Conditioned on Arbitrary Features
                        with Dirichlet-multinomial Regression. In Proceedings of UAI (2008), pp. 411-418


Sunday, October 23, 2011

Contenu connexe

Tendances

Information Retrieval with Deep Learning
Information Retrieval with Deep LearningInformation Retrieval with Deep Learning
Information Retrieval with Deep LearningAdam Gibson
 
Deep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersDeep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersRoelof Pieters
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsRoelof Pieters
 
Patch SVDD: Patch-level SVDD for Anomaly Detection and Segmentation
Patch SVDD: Patch-level SVDD for Anomaly Detection and SegmentationPatch SVDD: Patch-level SVDD for Anomaly Detection and Segmentation
Patch SVDD: Patch-level SVDD for Anomaly Detection and Segmentationtaeseon ryu
 
DynaLearn: Problem-based learning supported by semantic techniques
DynaLearn: Problem-based learning supported by semantic techniquesDynaLearn: Problem-based learning supported by semantic techniques
DynaLearn: Problem-based learning supported by semantic techniquesOscar Corcho
 
KR Workshop 1 - Ontologies
KR Workshop 1 - OntologiesKR Workshop 1 - Ontologies
KR Workshop 1 - OntologiesMichele Pasin
 
Bertini - Automatic Metadata Extraction in VidiVideo & im3i @EUscreen Mykonos
Bertini - Automatic Metadata Extraction in VidiVideo & im3i @EUscreen MykonosBertini - Automatic Metadata Extraction in VidiVideo & im3i @EUscreen Mykonos
Bertini - Automatic Metadata Extraction in VidiVideo & im3i @EUscreen MykonosEUscreen
 
Pal gov.tutorial4.session1 2.whatisontology
Pal gov.tutorial4.session1 2.whatisontologyPal gov.tutorial4.session1 2.whatisontology
Pal gov.tutorial4.session1 2.whatisontologyMustafa Jarrar
 
Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...
Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...
Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...Christoph Lange
 

Tendances (15)

Lecture1 - Machine Learning
Lecture1 - Machine LearningLecture1 - Machine Learning
Lecture1 - Machine Learning
 
Information Retrieval with Deep Learning
Information Retrieval with Deep LearningInformation Retrieval with Deep Learning
Information Retrieval with Deep Learning
 
Jmora.di.oeg.3x1e
Jmora.di.oeg.3x1eJmora.di.oeg.3x1e
Jmora.di.oeg.3x1e
 
Deep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ersDeep Learning, an interactive introduction for NLP-ers
Deep Learning, an interactive introduction for NLP-ers
 
Lecture19
Lecture19Lecture19
Lecture19
 
Lecture4 - Machine Learning
Lecture4 - Machine LearningLecture4 - Machine Learning
Lecture4 - Machine Learning
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed models
 
Lecture20
Lecture20Lecture20
Lecture20
 
Patch SVDD: Patch-level SVDD for Anomaly Detection and Segmentation
Patch SVDD: Patch-level SVDD for Anomaly Detection and SegmentationPatch SVDD: Patch-level SVDD for Anomaly Detection and Segmentation
Patch SVDD: Patch-level SVDD for Anomaly Detection and Segmentation
 
DynaLearn: Problem-based learning supported by semantic techniques
DynaLearn: Problem-based learning supported by semantic techniquesDynaLearn: Problem-based learning supported by semantic techniques
DynaLearn: Problem-based learning supported by semantic techniques
 
KR Workshop 1 - Ontologies
KR Workshop 1 - OntologiesKR Workshop 1 - Ontologies
KR Workshop 1 - Ontologies
 
Bertini - Automatic Metadata Extraction in VidiVideo & im3i @EUscreen Mykonos
Bertini - Automatic Metadata Extraction in VidiVideo & im3i @EUscreen MykonosBertini - Automatic Metadata Extraction in VidiVideo & im3i @EUscreen Mykonos
Bertini - Automatic Metadata Extraction in VidiVideo & im3i @EUscreen Mykonos
 
2012 12 12_adam_v_final
2012 12 12_adam_v_final2012 12 12_adam_v_final
2012 12 12_adam_v_final
 
Pal gov.tutorial4.session1 2.whatisontology
Pal gov.tutorial4.session1 2.whatisontologyPal gov.tutorial4.session1 2.whatisontology
Pal gov.tutorial4.session1 2.whatisontology
 
Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...
Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...
Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...
 

Similaire à SDOW (ISWC2011)

A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...
A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categor...Hiroshi Ono
 
Introduction to the Semantic Web
Introduction to the Semantic WebIntroduction to the Semantic Web
Introduction to the Semantic WebMarin Dimitrov
 
Semantic IoT Semantic Inter-Operability Practices - Part 1
Semantic IoT Semantic Inter-Operability Practices - Part 1Semantic IoT Semantic Inter-Operability Practices - Part 1
Semantic IoT Semantic Inter-Operability Practices - Part 1iotest
 
Algorithm
AlgorithmAlgorithm
Algorithmseobear
 
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft..."Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...
"Source Code Abstracts Classification Using CNN", Vadim Markovtsev, Lead Soft...Dataconomy Media
 
Towards Ontology Development Based on Relational Database
Towards Ontology Development Based on Relational DatabaseTowards Ontology Development Based on Relational Database
Towards Ontology Development Based on Relational Databaseijbuiiir1
 
Basics of Generative AI: Models, Tokenization, Embeddings, Text Similarity, V...
Basics of Generative AI: Models, Tokenization, Embeddings, Text Similarity, V...Basics of Generative AI: Models, Tokenization, Embeddings, Text Similarity, V...
Basics of Generative AI: Models, Tokenization, Embeddings, Text Similarity, V...Robert McDermott
 
Basics of Generative AI: Models, Tokenization, Embeddings, Text Similarity, V...
Basics of Generative AI: Models, Tokenization, Embeddings, Text Similarity, V...Basics of Generative AI: Models, Tokenization, Embeddings, Text Similarity, V...
Basics of Generative AI: Models, Tokenization, Embeddings, Text Similarity, V...Robert McDermott
 
NIPS2010 reading: Semi-supervised learning with adversarially missing label i...
NIPS2010 reading: Semi-supervised learning with adversarially missing label i...NIPS2010 reading: Semi-supervised learning with adversarially missing label i...
NIPS2010 reading: Semi-supervised learning with adversarially missing label i...Akisato Kimura
 
The secret life of rules in Software Engineering
The secret life of rules in Software EngineeringThe secret life of rules in Software Engineering
The secret life of rules in Software EngineeringJordi Cabot
 
@lis agent communication, ontologies, protocols, semantic web 2003
@lis   agent communication, ontologies, protocols, semantic web 2003@lis   agent communication, ontologies, protocols, semantic web 2003
@lis agent communication, ontologies, protocols, semantic web 2003Luigi Ceccaroni
 
Building a Semantic search Engine in a library
Building a Semantic search Engine in a libraryBuilding a Semantic search Engine in a library
Building a Semantic search Engine in a librarySEECS NUST
 
Ontology mapping for the semantic web
Ontology mapping for the semantic webOntology mapping for the semantic web
Ontology mapping for the semantic webWorawith Sangkatip
 
Newsletter Infographics (8).pdf
Newsletter Infographics (8).pdfNewsletter Infographics (8).pdf
Newsletter Infographics (8).pdfFiza987241
 
Iot ontologies state of art$$$
Iot ontologies state of art$$$Iot ontologies state of art$$$
Iot ontologies state of art$$$Sof Ouni
 
A FRIENDLY APPROACH TO PARTICLE FILTERS IN COMPUTER VISION
A FRIENDLY APPROACH TO PARTICLE FILTERS IN COMPUTER VISIONA FRIENDLY APPROACH TO PARTICLE FILTERS IN COMPUTER VISION
A FRIENDLY APPROACH TO PARTICLE FILTERS IN COMPUTER VISIONMarcos Nieto
 
The Very Model of a Modern Metamodeler
The Very Model of a Modern MetamodelerThe Very Model of a Modern Metamodeler
The Very Model of a Modern MetamodelerEd Seidewitz
 

SDOW (ISWC2011)

  • 1. Pragmatic metadata matters: How data about the usage of data affects semantic user models. Claudia Wagner, Markus Strohmaier, Yulan He. DIGITAL, Institute for Information and Communication Technologies. Sunday, October 23, 2011
  • 2. Example: Semantic Metadata [Figure: RDF graph; labels shown: sioc:Post, sioc:UserAccount, foaf:Person, sioc:content, sioc:name, sioc:has_creator, sioc:account_of, rdf:type]
  • 3. Example: Pragmatic Metadata [figure]
  • 8. Aim: Can pragmatic metadata support the generation of semantic metadata and, if so, how? [Figure: the RDF graph from the semantic metadata example (sioc:Post, sioc:UserAccount, foaf:Person), with the properties sioc:topic and foaf:interest marked "?" as the unknowns]
  • 9. Experimental Setup
      § Methodology
          § Topic modeling algorithms to learn topics (probability distributions over words) and annotate users and posts with topics
          § Incorporated different types of pragmatic metadata into the topic models
          § Compared different models via their predictive performance
      § Dataset
          § Boards.ie
          § Forums, posts and users
          § Users' authoring and replying behavior
          § Training dataset: first and last week of February 2006
          § Test dataset: 3 future posts of each user
  • 10. Evaluation
      § Compare different models by testing their predictive performance on held-out posts.
      § [Formula; its annotations read: log likelihood of a word of a user's future post given the model we learned; sum over all words in a user's future post]
      § Assumption: a better user topic model is less perplexed by future posts authored by a user and needs fewer training samples.
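The evaluation idea on this slide can be sketched in a few lines. The formula itself did not survive in the transcript, so the snippet below assumes the standard topic-model predictive probability, p(w) = Σ_t θ_t · φ_t(w), and the usual definition of perplexity as the exponentiated negative mean log likelihood. The names `theta`, `phi`, `floor`, `held_out_log_likelihood` and `perplexity` are hypothetical and chosen only for illustration.

```python
import math

def held_out_log_likelihood(words, theta, phi, floor=1e-12):
    """Sum of log p(w) over all words of a held-out post.

    theta: list of topic proportions for the user (assumed already learned).
    phi:   list of dicts, one per topic, mapping word -> probability.
    floor: hypothetical smoothing constant so unseen words do not yield log(0).
    """
    total = 0.0
    for w in words:
        # Mixture probability of the word under the user's topic proportions.
        p = sum(theta[t] * phi[t].get(w, 0.0) for t in range(len(theta)))
        total += math.log(max(p, floor))
    return total

def perplexity(words, theta, phi):
    """exp(-mean log likelihood); lower means the model is less 'perplexed'."""
    return math.exp(-held_out_log_likelihood(words, theta, phi) / len(words))

# Toy example with two topics that both spread their mass over two words.
theta = [0.5, 0.5]
phi = [{"mac": 0.5, "pc": 0.5}, {"mac": 0.5, "pc": 0.5}]
future_post = ["mac", "pc"]
print(perplexity(future_post, theta, phi))
```

Under this reading, a model that assigns higher probability to the words a user actually writes later gets a lower perplexity, which is the sense in which the slide calls one user model better than another.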
  • 11. Methodology: LDA
      § How to learn topics and annotate users with topics?
      § Latent Dirichlet Allocation (LDA) (Blei et al., 2003)
      § [Figure: a text annotated with topics T1, T2, T3; example topic T1: mac 0.3, iMac 0.13, PC 0.03, computer 0.04, ...]
  • 12. Methodology: DMR
      § How to incorporate metadata into topic models?
      § Dirichlet Multinomial Regression (DMR) topic models (Mimno et al., 2008)
      § Observe a feature vector x per document
      § Draw a "fresh" alpha for each document, which depends on the observed features x and the feature distribution per topic λ_t: $\alpha_{dt} = \exp(\mathbf{x}_d^{\top} \boldsymbol{\lambda}_t)$
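As a minimal sketch of the document-specific prior described on this slide, the snippet below computes α_dt = exp(x_d · λ_t) for every topic t. It covers only this one step, not the training of λ or the topic sampler. The function names `dmr_alpha` and `prior_topic_mean` and the toy feature layout (a bias feature plus one author indicator) are assumptions made for illustration.

```python
import math

def dmr_alpha(x, lambdas):
    """Per-document Dirichlet parameters: alpha_t = exp(dot(x, lambda_t)).

    x:       observed feature vector of one document (e.g. metadata indicators).
    lambdas: one weight vector per topic, each the same length as x.
    """
    return [math.exp(sum(xi * li for xi, li in zip(x, lam))) for lam in lambdas]

def prior_topic_mean(alpha):
    """Expected topic proportions under Dirichlet(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]

# Toy example (hypothetical numbers): feature 0 is a bias term,
# feature 1 indicates that the document was written by a particular author.
x_doc = [1.0, 1.0]
lambdas = [
    [0.0, 1.5],   # topic 0 is favoured when the author indicator is on
    [0.0, -0.5],  # topic 1 is slightly disfavoured for that author
]
alpha = dmr_alpha(x_doc, lambdas)
print(alpha, prior_topic_mean(alpha))
```

The point of the construction is that two documents with the same metadata receive the same prior over topics, which is how authorship or reply information can influence the topics a model assigns.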
  • 13. Methodology
      ID   Alg   Doc    Metadata
      M1   LDA   Post   -
      M2   LDA   User   -
      M3   DMR   Post   author
      M4   DMR   User   author
      M5   DMR   Post   reply-user
      M6   DMR   User   reply-user
      M7   DMR   Post   related-user
      M8   DMR   User   related-user
      [Figure: past/future diagram; labels shown: User 1, User 2, Post 1 to Post 7, "authored", "replies to"]
  • 14. Different user activities performed on content [Plot; annotations: post training scheme (M3, M5 and M7); baseline LDA (M1 and M2); models which take user replies into account (M6 and M8)]
  • 15. Results [Figure: the model table M1 to M8 (ID, Alg, Doc, Metadata) and the past/future diagram of users and posts, as on slide 13]
  • 21. Results
      § The topics of users who reply to a user are also likely for this user
      § Therefore, if 2 users get replies from the same users, then they are more likely to talk about the same topics
      § Topic models which incorporate pragmatic metadata per user can indeed improve models
      § Topic models which incorporate pragmatic metadata per post often over-fit the data
      § Model assumptions are too strict!
      § Idea: incorporate behavioral user similarities
      § Intuition: users who are similar are more likely to talk about the same topics
      § How to measure behavioral similarity?
          § forum usage
          § communication behavior
  • 22. Methodology
      ID    Alg   Doc    Metadata
      M9    DMR   Post   top 10 forums
      M10   DMR   User   top 10 forums
      M11   DMR   Post   top 10 communication partner
      M12   DMR   User   top 10 communication partner
      [Figure: past/future diagram with User 1, User 2, Post 1 to Post 7 and "authored" edges, plus two example top-10 forum lists: (f1, f2, f3, f4, f5, f6, f7, f8, f9, f10) and (f15, f20, f31, f12, f5, f6, f17, f18, f19, f10)]
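The "top 10 forums" and "top 10 communication partner" features can be illustrated with a short sketch. It assumes the raw data is a list of (user, item) events, where an item is a forum the user posted in or a user they communicated with, and that each user's top-k list is turned into a binary indicator vector that could serve as DMR features. The function names, the event format and the assignment of the two forum lists to User 1 and User 2 are assumptions; the slides do not specify the encoding.

```python
from collections import Counter, defaultdict

def top_k_per_user(events, k=10):
    """Most frequent items per user, from (user, item) events."""
    counts = defaultdict(Counter)
    for user, item in events:
        counts[user][item] += 1
    return {user: [item for item, _ in c.most_common(k)]
            for user, c in counts.items()}

def indicator_vector(top_items, vocabulary):
    """Binary feature vector: 1 if the item is among the user's top items."""
    chosen = set(top_items)
    return [1 if item in chosen else 0 for item in vocabulary]

def shared_items(top_a, top_b):
    """Items that two users have in common, as a sorted list."""
    return sorted(set(top_a) & set(top_b))

# The two forum lists readable from the slide figure (user assignment assumed).
user1_forums = ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9", "f10"]
user2_forums = ["f15", "f20", "f31", "f12", "f5", "f6", "f17", "f18", "f19", "f10"]
print(shared_items(user1_forums, user2_forums))

# Deriving top-k lists from raw (user, forum) posting events.
events = [("u1", "f1"), ("u1", "f1"), ("u1", "f2"), ("u2", "f2")]
print(top_k_per_user(events, k=1))
```

In this encoding, users whose vectors overlap in many positions share behavior, which matches the intuition that behaviorally similar users are more likely to talk about the same topics.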
  • 23. [Plot; annotations: post training scheme (M3, M9 and M11); baseline LDA (M1 and M2); user training scheme (M4, M10 and M12). Model M12 incorporates user similarities based on their communication behavior]
  • 24. Results
      § Topic models seem to benefit from taking behavioral user similarities into account
      § Users who behave similarly (regarding their forum usage and communication behavior) are likely to talk about the same topics
      § Common communication partners seem to be more predictive of common topics than common forums
  • 25. Conclusions
      § Pragmatic metadata may help to learn better semantic user models
      § But pragmatic metadata observed on a post level often over-fits the data
      § Pragmatic metadata on a user level seems to improve the predictive performance of topic models
      § If posts of 2 users are "used" in a similar way, then they are more likely to talk about the same topics
      § If 2 users behave similarly (tend to post to the same forums or tend to talk to the same users), they are more likely to talk about the same topics
      § Common communication partners seem to be more predictive of common topics than common forums
  • 26. Limitations and Future Work
      § Perplexity and semantic interpretability of topics do not necessarily correlate (Chang et al., 2009)
      § Separate evaluation of the semantic coherence of topics
      § Analyze different types of behavior- and usage-related metadata and explore to what extent they may reveal information about the semantics of data
          § behavior on social streams such as Twitter
          § tagging behavior
          § navigation behavior
  • 27. References
      § Blei, D. M., Ng, A. and Jordan, M. Latent Dirichlet Allocation. JMLR 3 (2003), pp. 993-1022
      § Chang, J., Boyd-Graber, J., Gerrish, S., Wang, C. and Blei, D. Reading Tea Leaves: How Humans Interpret Topic Models. Neural Information Processing Systems, NIPS (2009)
      § Mimno, D. M. and McCallum, A. Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression. In Proceedings of UAI (2008), pp. 411-418