Interaction-level relations for Opinion Analysis
               Putting forth the benefits of Textometry

               Sentiment Analysis Symposium 2011
               Manhattan Conference Center, New York, USA



                Marguerite Leenhardt - PhD Student in Applied Linguistics, NLP, Textometry   SYLED/CLA2T - Paris 3 University
                mleenhardt@le-semiopole.fr




                                                                                                                                April 12th, 2011

Tuesday, April 12, 2011
TEXTOMETRY?

              - a branch of the statistical study of linguistic data

              - text is considered as possessing its own internal structure

              - bypasses the information extraction step (qualitative coding)

                 > applies statistical and probabilistic calculations to the units that make up comparable texts in a corpus
                 > mostly based on the hypergeometric model and proximity algorithms
                 > reveals structures that would otherwise remain hidden because of the quantity of data

              - a robust method that processes data without depending on external resources (lexicons, dictionaries, ontologies)

              - analyzes the distribution of objects within the corpus framework

              TWOFOLD TEXT SEGMENTATION PROCESS GENERATES THE DATASET’S CANVAS/FRAMEWORK

              [Diagram: the CORPUS split into two groups of nodes, labeled a./b. (contents) and c./d. (containers)]

              - CONTENTS: textual sequences organized in sentences, paragraphs, ...
              - CONTAINERS: annotation systems (e.g. sentence or paragraph segmentation markers, considered a specific type of annotation on contents)
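The hypergeometric model mentioned on this slide can be illustrated with a minimal sketch. It assumes the common setup in which a corpus of T tokens contains F occurrences of a word, and one part of the corpus (t tokens) contains f of them; the probability of seeing at least f occurrences by chance then indicates how over-represented the word is in that part. The function names and the toy figures are invented for illustration and are not taken from the talk.

```python
from math import comb


def hypergeom_pmf(k, T, F, t):
    """P(X = k): chance of finding exactly k occurrences of a word in a part
    of t tokens, when the whole corpus has T tokens and F occurrences."""
    return comb(F, k) * comb(T - F, t - k) / comb(T, t)


def over_representation(f, T, F, t):
    """P(X >= f): the smaller the value, the more the word is
    over-represented in the part compared with the whole corpus."""
    upper = min(F, t)  # a part cannot hold more occurrences than exist
    return sum(hypergeom_pmf(k, T, F, t) for k in range(f, upper + 1))


# Toy figures (hypothetical): a 10-token corpus, 4 occurrences of the word,
# and a 5-token part that contains 3 of them.
p = over_representation(f=3, T=10, F=4, t=5)
print(round(p, 4))
```

A very small probability would flag the word as specific to the part; textometric tools usually report it on a logarithmic scale, which is omitted here.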




IDENTIFYING MAJOR TRENDS AND OPPOSITIONS IN A DATASET

              - Corpus Cocoon: online media analysis following a product launch (40,000 words)

              - Factorial Correspondence Analysis (AFC, from its French name) is used to determine the distance between textual objects, which are compared on the basis of a proximity algorithm (positioning sets of elements in the corpus space)

              - The closest objects heavily cite the press release; blogs cite Named Entities (brand and product) but diverge from the press release.

              [Figure: AFC output comparing users' comments on different web media; French corpus]
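Correspondence analysis positions objects so that their distances in the factor space reflect the chi-square distances between their frequency profiles. The sketch below computes those chi-square distances directly instead of running a full factorial analysis, which is a simplification. The source names and word counts are invented for illustration and do not come from the Cocoon corpus.

```python
from math import sqrt


def chi2_distances(table):
    """Chi-square distance between the row profiles of a contingency table.

    table maps an object name to its word counts (same column order for all).
    Assumes every row and every column has a non-zero total.
    """
    names = list(table)
    n_cols = len(next(iter(table.values())))
    grand_total = sum(sum(row) for row in table.values())
    # Share of the whole corpus accounted for by each column (word).
    col_mass = [sum(table[n][j] for n in names) / grand_total
                for j in range(n_cols)]
    # Row profile: relative frequency of each word inside one object.
    profiles = {n: [c / sum(table[n]) for c in table[n]] for n in names}
    distances = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d2 = sum((profiles[a][j] - profiles[b][j]) ** 2 / col_mass[j]
                     for j in range(n_cols))
            distances[(a, b)] = sqrt(d2)
    return distances


# Hypothetical counts for three words in three sources.
toy_counts = {
    "press_release": [10, 0, 5],
    "news_site": [20, 0, 10],   # same profile as the press release
    "blog": [2, 8, 0],
}
distances = chi2_distances(toy_counts)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

In this toy table the news site reuses the vocabulary of the press release in the same proportions, so the two sit at distance zero, while the blog lies further away.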


INTERACTION-LEVEL RELATIONS: WHY?

              - textual interactions are the main material for Opinion Mining/Sentiment Analysis

              - contextual analysis is an important challenge (Pang & Lee, 2008) and a major resource for interpretation (Somasundaran, 2010): interactional features are informative on a global scale (discourse ≠ interaction)

              - Textometry is a means of going beyond local context boundaries by taking global dimensions into account: text is considered a component in and of itself (bottom-up approach)

              - «A lot of information is often not captured in the handbuilt model and lost.» (Boiy et al., 2007)

              - qualitative coding should not be the first approach but a second step, after mining corpus-based knowledge




INTERACTION-LEVEL RELATIONS: HOW?
     annotating interactional relations between
     users' contributions in a given discussion
     > linking and specifying containers

                                              > Corpus enhanced with qualitative information
                                              > Acquiring information on the context: conversational tree
                                              > Determining zones of intensity in a discussion feed (computer-assisted task)
                                              > Analyzing the linguistic specificity of linked containers vs. the whole corpus
                                              > Building corpus-driven linguistic resources (textometric objects)

                                              Right-hand column of the slide:
                                              - Named Entity Recognition + matching paraphrases
                                              - Corpus-driven lexical resource (LR) for thematic analysis
                                              - Corpus-driven lexical resource (LR) for opinion
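The conversational tree on this slide can be sketched as follows. The sketch assumes each contribution records which contribution it replies to, and it uses the size of a reply subtree as a stand-in for a "zone of intensity"; the slide does not define that measure, so this is only one possible reading. The identifiers and the toy discussion are invented.

```python
from collections import defaultdict


def build_tree(replies):
    """Map each parent contribution to the contributions that reply to it.

    replies maps a contribution id to the id it answers
    (None for a top-level contribution).
    """
    children = defaultdict(list)
    for node, parent in replies.items():
        children[parent].append(node)
    return children


def subtree_size(children, node):
    """Number of contributions in the thread rooted at node, node included."""
    return 1 + sum(subtree_size(children, c) for c in children.get(node, []))


# Hypothetical discussion feed: c2 and c3 answer c1, c4 answers c2, and so on.
replies = {"c1": None, "c2": "c1", "c3": "c1",
           "c4": "c2", "c5": None, "c6": "c4"}
children = build_tree(replies)

# Size of each top-level thread; the largest is the candidate zone of intensity.
sizes = {root: subtree_size(children, root) for root in children[None]}
hottest = max(sizes, key=sizes.get)
print(sizes, hottest)
```

The linked containers of the largest thread could then be compared with the whole corpus, for instance with a specificity score.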




PROJECTING THE CORPUS-DRIVEN LINGUISTIC RESOURCES FOR OPINION

              - Corpus Cocoon: the LR is projected onto the dataset’s canvas/framework to highlight the distribution of opinions among UGCs (user-generated contents), using an adaptation of the Appraisal Theory scale for opinion orientation

              - A Distributional Inventory is used to identify major trends in opinion expression; here, most UGCs are not relevant, as they only cite the brand in congratulation messages to the bloggers who posted about the product launch.

              [Figure: opinion distribution among users' comments]
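Projecting a lexical resource onto the containers can be sketched as a simple lookup and count. The lexicon entries, the two orientation labels, and the comments below are invented for illustration; the actual resource and the Appraisal Theory scale used in the talk are not reproduced here.

```python
import re
from collections import Counter


def project(lexicon, containers):
    """Count, for each container, how many of its tokens carry each
    orientation label according to the lexicon."""
    projection = {}
    for cid, text in containers.items():
        tokens = re.findall(r"\w+", text.lower())
        projection[cid] = Counter(lexicon[t] for t in tokens if t in lexicon)
    return projection


def inventory(projection):
    """Distributional inventory: label counts summed over all containers."""
    total = Counter()
    for counts in projection.values():
        total.update(counts)
    return total


# Hypothetical lexical resource and user comments.
lexicon = {"love": "positive", "great": "positive",
           "disappointing": "negative"}
containers = {
    "u1": "Great post, congrats!",
    "u2": "I love it, great design",
    "u3": "Rather disappointing",
    "u4": "Thanks for sharing",
}
projection = project(lexicon, containers)
totals = inventory(projection)
# Containers with no match carry no opinion marker from the resource.
unmarked = [cid for cid, counts in projection.items() if not counts]
print(totals, unmarked)
```

Containers left unmarked, or marked only by courtesy formulas, correspond to the kind of comment the slide describes as not relevant.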




«I» NETWORK IN THE ORANGE CORPUS




«FORFAIT» IN THE ORANGE CORPUS




ORANGE LEXICO-SEMANTIC NETWORK




Thank you!
           Marguerite Leenhardt, PhD student
           mleenhardt@le-semiopole.fr



