Algorithms for the Thematic Analysis of Twitter Datasets
1. Algorithms for the Thematic Analysis of Twitter Datasets. #comtech2011 Twitter Workshop. Presented by: Aneesha Bakharia. Twitter: aneesha. Email: aneesha.bakharia@gmail.com
3. Types of Qualitative Content Analysis (Hsieh and Shannon, 2006). Concentrate on the Summative and Conventional (Inductive) approaches.
Coding approach: Summative. Study begins with: keywords. Derivation of codes: keywords identified before and during analysis.
Coding approach: Conventional (Inductive). Study begins with: observation. Derivation of codes: categories developed during analysis.
Algorithms for the Summative and Conventional approaches: unsupervised and semi-supervised algorithms (NMF, NTF, LDA and traditional clustering algorithms).
Coding approach: Directed (Deductive). Study begins with: theory. Derivation of codes: categories derived from pre-existing theory prior to analysis.
Algorithms for the Directed approach: supervised classification algorithms (Support Vector Machines).
7. Non-negative Matrix Factorisation
Features Matrix (themes by words):
          Word 1   Word 2   Word 3
Theme 1   0.5      0        1
Theme 2   0        0.5      0
Weights Matrix (tweets by themes):
          Theme 1   Theme 2
Tweet 1   1         0
Tweet 2   0         1
Tweet 3   0         1
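To make the two matrices on this slide concrete, the sketch below multiplies the Weights Matrix by the Features Matrix to recover the tweet-by-word matrix they approximate, and then factorises that matrix again from a random start. The slide does not say which NMF algorithm is used; the multiplicative update rule of Lee and Seung is assumed here, and the helper names (`matmul`, `nmf`, `squared_error`) are hypothetical, chosen only for this illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


# Matrices copied from the slide.
weights = [[1, 0], [0, 1], [0, 1]]      # tweets x themes
features = [[0.5, 0, 1], [0, 0.5, 0]]   # themes x words

# Their product is the tweet-by-word matrix that the two factors describe.
tweet_word = matmul(weights, features)


def nmf(v, k, iterations=300, seed=0, eps=1e-9):
    """Factorise a non-negative matrix v (n x m) into w (n x k) and h (k x m).

    Assumption: multiplicative updates for squared error (Lee and Seung).
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def squared_error(v, w, h):
    """Sum of squared differences between v and the product w h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb))


if __name__ == "__main__":
    learned_w, learned_h = nmf(tweet_word, k=2)
    print("tweet-by-word matrix:", tweet_word)
    print("reconstruction error:", squared_error(tweet_word, learned_w, learned_h))
```

The learned factors are generally not identical to the slide's matrices, because NMF solutions are not unique: themes can come out in a different order or at a different scale.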
14. Non-negative Tensor Factorisation
Tweet-Word-Time Matrix: one tweet-by-word matrix per month (Jan, Feb, March, April). Each monthly slice on the slide shows the same example counts:
          Word 1   Word 2   Word n
Tweet 1   1        0        2
Tweet 2   0        1        0
Tweet 3   0        1        1
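The tweet-word-time structure on this slide can be sketched as a three-way array with one tweet-by-word slice per month. The slide does not specify the factorisation model; the sketch below assumes the commonly used non-negative CP (PARAFAC) form, in which each entry is approximated by a sum over components of a tweet factor, a word factor and a time factor. The factor values are a hand-built exact decomposition of the slide's example counts, not the output of a fitting algorithm, and all names are hypothetical.

```python
months = ["Jan", "Feb", "March", "April"]

# Example counts shown for every month on the slide (tweets x words).
slice_counts = [
    [1, 0, 2],  # Tweet 1
    [0, 1, 0],  # Tweet 2
    [0, 1, 1],  # Tweet 3
]

# tensor[t][i][j] is the count of word j in tweet i during month t.
tensor = [[row[:] for row in slice_counts] for _ in months]


def cp_reconstruct(tweet_f, word_f, time_f):
    """Rebuild a tensor from CP factors.

    Entry (t, i, j) is the sum over components r of
    tweet_f[i][r] * word_f[j][r] * time_f[t][r].
    """
    rank = len(tweet_f[0])
    return [
        [
            [
                sum(tweet_f[i][r] * word_f[j][r] * time_f[t][r] for r in range(rank))
                for j in range(len(word_f))
            ]
            for i in range(len(tweet_f))
        ]
        for t in range(len(time_f))
    ]


# A trivial exact non-negative decomposition with one component per tweet.
tweet_factors = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]        # tweets x components
word_factors = [list(col) for col in zip(*slice_counts)]  # words x components
time_factors = [[1, 1, 1] for _ in months]                # months x components

reconstructed = cp_reconstruct(tweet_factors, word_factors, time_factors)

if __name__ == "__main__":
    print("exact reconstruction:", reconstructed == tensor)
```

Because every monthly slice is identical in this illustration, the time factors are constant. With real data, a component's time factor would rise and fall with the months in which that theme is active.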
15. Non-negative Tensor Factorisation. Reference: Michael W. Berry, "Nonnegative Tensor Factorization for Knowledge Discovery", CISML Seminar Series, Fall 2010. http://cisml.utk.edu/Seminars/2010/Berry.pdf
19. Looking for Collaborators. Twitter: aneesha. Email: aneesha.bakharia@gmail.com. Twitter graphics from Webdesigner Depot (http://www.webdesignerdepot.com). Graphics converted to WMF format by Elizabeth Hall.