jobknowledge.eu 
facebook.com/jobknowledge 
@Jobknowledge 
Small talk 
Text mining in organizational research: 
a review and a case study 
Vladimer Kobayashi, Hannah Berkers, Stefan Mol, Gábor Kismihók & Deanne den Hartog
Overview 
The case study: Extracting job information from vacancies 
• The problem: Modernizing job analysis 
• The data: 500,000 online vacancies 
• The use of a framework: knowledge from the job analysis field 
• The techniques: feature extraction 
• The results: Successful automatic categorization of job information 
The review: text mining techniques and tasks in organizational research 
• The task: Invitation for a special issue on big data in ORM 
• The paper: Our structure so far 
• The question: Feedback
The case study: Extracting job information from vacancies 
The problem: Modernizing job analysis 
Jobs are changing, but job analysis is lagging behind 
• Seen as a tedious and expensive, but necessary, task 
• Not up to speed with the changes in work 
• Accuracy of job analysis using job incumbents as a source is questioned 
• Not taking advantage of the ‘big data’ opportunities
The data: 500,000 English online vacancies 
An often overlooked rich source of job information 
Could help scale up the amount of data used in job analysis
The use of a framework: knowledge from the job analysis field 
Skills can be extracted from job advertisements (Sodhi & Son, 2009; Smith & Ali, 2014) 
These studies were conducted in the field of information technology, with a focus on the use of technologies 
Need for a more deductive approach (George, Haas, & Pentland, 2014) 
We go beyond this research by using knowledge from the job analysis field 
We categorize job information based on the basic distinction between job attributes and job activities (Sackett & Laczo, 2003) 
First step toward the extraction of finer-grained job information
Categorization into job attributes and job activities 
Use of manual labelling of 300 random vacancies (3,921 labelled sentences) 
Labels are based on definitions of the finer-grained job features (each either an attribute or an activity), such as knowledge, abilities, tasks, and responsibilities
The techniques: Feature extraction 
Pipeline: Job vacancies → text preprocessing → preprocessed vacancies → text encoding → feature matrix (sentences by features) 
Text preprocessing 
• Sentence and word tokenization 
• Lower case transformation 
• Stop word removal, e.g. the, and 
• Extra whitespace removal 
• Lemmatization 
Text encoding 
• Linguistic preprocessing, e.g. part-of-speech (POS) tagging
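The preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the pipeline used in the case study: the function names, the regular expressions, and the tiny stop word list are all assumptions, and lemmatization is left out because it needs a dictionary or an NLP library.

```python
import re

# Tiny illustrative stop word list (an assumption; real lists are much longer).
STOPWORDS = {"the", "and", "a", "an", "of", "in", "for", "with", "is", "are"}

def split_sentences(text):
    """Naive sentence tokenization: split after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def preprocess_sentence(sentence):
    """Lower-case the sentence, tokenize it into words, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", sentence.lower())
    return [t for t in tokens if t not in STOPWORDS]

def preprocess_vacancy(text):
    """Collapse extra whitespace, then return one token list per sentence."""
    text = re.sub(r"\s+", " ", text)
    return [preprocess_sentence(s) for s in split_sentences(text)]

# Hypothetical vacancy text, made up for this example.
example = "Manage   the project team.  Knowledge of Python and SQL is required."
print(preprocess_vacancy(example))
```

Keeping one token list per sentence matches the slide's setup, in which sentences (not whole vacancies) are the units that are later labelled and classified.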
Feature list 
• Sentence length (after removing certain words) 
• POS of first word (job activity sentences usually start with a verb) 
• First word (both kinds of sentences often start with certain words) 
• Last word (job attribute sentences commonly end with certain words) 
• Proportion of nouns and adjectives 
• Proportion of verbs and the word "to" (POS tag TO) 
• Proportion of verbs followed by a noun, verb, adjective, or adverb 
• Frequent words
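As a rough sketch of how most of these features could be computed, the function below takes one sentence that has already been POS-tagged. It assumes Penn Treebank-style tags (NN*, JJ*, VB*, RB*, TO), which the slides do not specify. The function name and dictionary keys are hypothetical, proportions are taken relative to the number of tokens (one possible reading of the slide), and the frequent-word features are omitted.

```python
def sentence_features(tagged):
    """Compute simple features for one sentence.

    `tagged` is a list of (word, tag) pairs with Penn Treebank-style tags
    (an assumption; the slides do not name a tag set).
    """
    n = len(tagged)
    if n == 0:
        raise ValueError("empty sentence")
    words = [w.lower() for w, _ in tagged]
    tags = [t for _, t in tagged]

    # Count nouns (NN*) and adjectives (JJ*).
    noun_adj = sum(1 for t in tags if t.startswith(("NN", "JJ")))
    # Count verbs (VB*) and the word "to" (tag TO).
    verb_to = sum(1 for t in tags if t.startswith("VB") or t == "TO")
    # Count verbs whose next token is a noun, verb, adjective, or adverb.
    verb_followed = sum(
        1
        for cur, nxt in zip(tags, tags[1:])
        if cur.startswith("VB") and nxt.startswith(("NN", "VB", "JJ", "RB"))
    )
    return {
        "length": n,
        "first_pos": tags[0],
        "first_word": words[0],
        "last_word": words[-1],
        "prop_noun_adj": noun_adj / n,
        "prop_verb_to": verb_to / n,
        "prop_verb_followed": verb_followed / n,
    }

# Hypothetical, hand-tagged example sentences.
activity = [("Develop", "VB"), ("new", "JJ"), ("software", "NN"), ("modules", "NNS")]
attribute = [("Excellent", "JJ"), ("communication", "NN"), ("skills", "NNS"), ("required", "VBN")]
print(sentence_features(activity))
print(sentence_features(attribute))
```

Stacking one such dictionary per sentence gives the rows of a feature matrix; categorical features such as the first word would still need to be encoded numerically before most classifiers can use them.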
Application of Data Mining Techniques to the Feature Matrix 
• Naïve Bayes 
• Support Vector Machines 
• Random Forest 
The results: Successful automatic categorization of job information 
At least 95% mean accuracy under 10-fold cross-validation, compared with a base classifier accuracy of 55%
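To make the evaluation procedure concrete, here is a small standard-library sketch of k-fold cross-validation with a baseline comparison. It rests on several assumptions: the slides do not say what the base classifier is, so a majority-class predictor is used here; the toy classifier is a simple stand-in, not Naïve Bayes, a Support Vector Machine, or a Random Forest; and the data are synthetic, so the printed numbers say nothing about the study's results. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def k_fold_accuracy(train, X, y, k=10, seed=0):
    """Mean accuracy over k folds; `train(X, y)` must return a predict(x) function."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        predict = train([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / k

def train_majority(X, y):
    """Assumed baseline: always predict the most frequent training label."""
    majority = Counter(y).most_common(1)[0][0]
    return lambda x: majority

def train_first_pos(X, y):
    """Toy stand-in classifier: most frequent label for each first-word POS tag."""
    by_tag = defaultdict(Counter)
    for x, label in zip(X, y):
        by_tag[x].update([label])
    fallback = Counter(y).most_common(1)[0][0]
    return lambda x: by_tag[x].most_common(1)[0][0] if x in by_tag else fallback

# Synthetic toy data (NOT the study's data): first-word POS tag and label.
X = ["VB"] * 54 + ["NN"] * 6 + ["JJ"] * 20 + ["NN"] * 20
y = ["activity"] * 60 + ["attribute"] * 40

print("baseline:", round(k_fold_accuracy(train_majority, X, y), 2))
print("stand-in:", round(k_fold_accuracy(train_first_pos, X, y), 2))
```

Reporting the baseline next to the classifier's score shows how much of the accuracy comes from the features rather than from the class distribution alone.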
Future work 
• Semi-supervised labelling 
• Finer classification 
• Consideration of more features
The review: Text mining techniques in organizational research 
The task: Invitation for a special issue on big data in ORM 
Introduce the methods of text analysis to organizational scientists 
Review of various techniques for mining textual data 
The pros and cons of different approaches (best practices) 
Illustrations from the current project on job analysis showing how these procedures can be applied to a substantive area
The paper: Our structure so far 
1. Introduction 
Text data in organizational research and issues that could be solved with text mining 
Introduce the case study on text mining in job analysis 
2. Review of text mining techniques 
Definitions and terminology 
Text preprocessing 
Three text mining tasks: classification, feature construction, and feature selection 
Evaluating text mining results
For each task 
a) Text mining techniques applied to perform the tasks 
b) Possibilities for applying organizational frameworks 
c) Advantages and disadvantages of these techniques, illustrated with examples from organizational research and other fields 
d) Illustration from our case study
3. Discussion of opportunities and challenges of text mining in Organizational Research 
Opportunities, such as extending the application of text mining to other problems in organizational research (input?) 
Challenges, such as data size, data access and protection, and language issues 
4. Conclusion
The question: Feedback 
What problems are you dealing with now (or have you dealt with in the past) that make use of text data? 
What are the opportunities that you see for text mining? 
Which part of text mining would you like to learn more about? 
Do you have experience in submitting a manuscript to ORM?
References 
George, G., Haas, M.R., & Pentland, A. (2014). From the editors: Big Data and Management. Academy of Management Journal, 57(2), 321-326. 
Sackett, P.R., & Laczo, R.M. (2003). Job and Work Analysis. In Comprehensive Handbook of Psychology: Industrial and Organizational Psychology, vol. 12, ed. W.C. Borman, D.R. Ilgen, & R.J. Klimoski, pp. 21-37. New York: Wiley. 
Smith, D., & Ali, A. (2014). Analysing Computer Programming Job Trend Using Web Data Mining. Issues in Informing Science and Information Technology, 11, 203-214. 
Sodhi, M.S., & Son, B-G. (2009). Content Analysis of O.R. Job Advertisements to Infer Required Skills. Journal of the Operational Research Society, 61, 1315-1327.

H Berkers & V.Kobayashi: Small talk ORM paper 29 9-2014

  • 1. jobknowledge.eu facebook.com/jobknowledge @Jobknowledge Small talk Text mining in organizational research: a review and a case study Vladimer Kobayashi, Hannah Berkers, Stefan Mol, Gabór Kismihók & Deanne den Hartog
  • 2. Overview The case study: Extracting job information from vacancies • The problem: Modernizing job analysis • The data: 500,000 online vacancies • The use of a framework: knowledge from the job analysis field • The techniques: feature extraction • The results: Successful automatic categorization of job information The review: text mining techniques and tasks in organizational research • The task: Invitation for a special issue on big data in ORM • The paper: Our structure so far • The question: Feedback
  • 3. The case study: Extracting job information from vacancies The problem: Modernizing job analysis Jobs are changing, but job analysis is lagging behind • Seen as a tedious and expensive, but necessary task • Not up to speed with the changes in work • Accuracy of job analysis using job incumbents as a source is questioned • Not taking advantage of the ‘big data’ opportunities
  • 4. The case study: Extracting job information from vacancies The data: 500,000 English online vacancies An often overlooked rich source of job information Could facilitate upscaling amount of data used in job analysis
  • 5. The case study: Extracting job information from vacancies The use of a framework: knowledge from the job analysis field Skills can be extracted from job advertisements (Sodhi & Son, 2009; Smith & Ali, 2014) Studies conducted in the field of Information Technologies with a focus on the use of technologies Need for a more deductive approach (George, Haas, & Pentland, 2014) We go beyond this research by using knowledge from the job analysis field We categorize job information based on the basic distinction between job attributes and job activities (Sackett & Laczo, 2003) First step toward the extraction of finer grained job information
  • 6. The case study: Extracting job information from vacancies The use of a framework: knowledge from the job analysis field Categorization into job attributes and job activities Use of manual labelling of 300 random vacancies (3,921 labelled sentences) Based on definitions of the finer grained job features (either attribute or activity), such as knowledge, abilities, tasks, responsibilities etc.
  • 7. The case study: Extracting job information from vacancies The techniques: Feature extraction Pipeline (diagram): job vacancies → text preprocessing → preprocessed vacancies → text encoding → feature matrix (sentences × features) Text Preprocessing • Sentence and word tokenization • Lower case transformation • Stopword removal, e.g. the, and, etc. • Extra whitespace removal • Lemmatization Text Encoding • Linguistic preprocessing, e.g. part-of-speech (POS) tagging
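The preprocessing steps on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' pipeline: the regular expressions, the stopword list, and the function names are all assumptions, and lemmatization and POS tagging are left out because they need a linguistic library.

```python
import re

# Tiny illustrative stopword list (an assumption; the slide only names "the" and "and").
STOPWORDS = {"the", "and", "a", "an", "of", "is", "are"}

def split_sentences(text):
    """Naive sentence tokenizer: split after '.', '!' or '?' followed by whitespace."""
    text = re.sub(r"\s+", " ", text).strip()  # collapse extra whitespace first
    if not text:
        return []
    return re.split(r"(?<=[.!?])\s+", text)

def tokenize(sentence):
    """Lower-case a sentence and return its word tokens (punctuation is dropped)."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", sentence.lower())

def preprocess(text):
    """Return one list of non-stopword tokens per sentence."""
    result = []
    for sentence in split_sentences(text):
        tokens = [t for t in tokenize(sentence) if t not in STOPWORDS]
        if tokens:
            result.append(tokens)
    return result

# Hypothetical vacancy text, invented for this example.
vacancy = "Manage   the sales team.  Excellent knowledge of Python is required."
print(preprocess(vacancy))
# [['manage', 'sales', 'team'], ['excellent', 'knowledge', 'python', 'required']]
```

A real pipeline would likely replace the naive sentence splitter with a trained tokenizer, since vacancy texts often contain abbreviations and bullet fragments that end without punctuation.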
  • 8. The case study: Extracting job information from vacancies Feature list • Sentence length (after removing certain words) • POS of first word (job activity sentences usually start with a verb) • First word (both kinds of sentences often start with certain words) • Last word (job attribute sentences commonly end with certain words) • Proportion of nouns and adjectives • Proportion of verbs and TO • Proportion of verbs followed by a noun, verb, adjective, or adverb • Frequent words
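To make the feature list concrete, the sketch below computes most of these features for a single POS-tagged sentence. It rests on several assumptions that the slides do not state: tags follow the Penn Treebank convention (VB*, NN*, JJ*, RB*, TO), the "certain words" removed before measuring length are a small stopword list, and every proportion is taken over the number of tokens in the sentence. The "frequent words" feature is omitted because it depends on corpus-level counts.

```python
def sentence_features(tagged, stopwords=frozenset({"the", "and", "a", "an", "of"})):
    """Compute a few of the listed features for one POS-tagged sentence.

    `tagged` is a non-empty list of (token, tag) pairs. The Penn Treebank tag
    set and the stopword list are assumptions made for this illustration.
    """
    tokens = [tok.lower() for tok, _ in tagged]
    tags = [tag for _, tag in tagged]
    n = len(tagged)
    is_verb = [t.startswith("VB") for t in tags]
    noun_adj = sum(t.startswith("NN") or t.startswith("JJ") for t in tags)
    verb_to = sum(v or t == "TO" for v, t in zip(is_verb, tags))
    # Verbs immediately followed by a noun, verb, adjective, or adverb.
    followed = sum(
        1 for i in range(n - 1)
        if is_verb[i] and tags[i + 1][:2] in {"NN", "VB", "JJ", "RB"}
    )
    return {
        "length": sum(tok not in stopwords for tok in tokens),
        "first_pos": tags[0],
        "first_word": tokens[0],
        "last_word": tokens[-1],
        "prop_noun_adj": noun_adj / n,
        "prop_verb_to": verb_to / n,
        "prop_verb_followed": followed / n,
    }

# Hand-tagged toy sentences (invented, not taken from the vacancy data).
activity = [("Manage", "VB"), ("the", "DT"), ("sales", "NNS"), ("team", "NN")]
attribute = [("Excellent", "JJ"), ("knowledge", "NN"), ("of", "IN"),
             ("Python", "NNP"), ("required", "VBN")]

for sentence in (activity, attribute):
    print(sentence_features(sentence))
```

In this toy pair the activity sentence starts with a verb and the attribute sentence starts with an adjective, which is the kind of contrast the "POS of first word" feature is meant to capture.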
  • 9. The case study: Extracting job information from vacancies Application of Data Mining Techniques to the Feature Matrix • Naïve Bayes • Support Vector Machines • Random Forest The results: Successful automatic categorization of job information At least 95% mean accuracy based on 10-fold cross-validation, compared with the base classifier accuracy of 55%
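The evaluation scheme on this slide, mean accuracy over 10 folds compared against a base classifier, can be sketched as follows. This is an illustration under stated assumptions, not the authors' setup: the data are synthetic, the base classifier is assumed to be a majority-class predictor, and a toy rule based on the first word's POS tag stands in for Naïve Bayes, Support Vector Machines, and Random Forest. The resulting numbers say nothing about the study's reported accuracies.

```python
import random
from collections import Counter

def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(fit, X, y, k=10, seed=0):
    """Mean accuracy over k folds (requires len(X) >= k).

    `fit(X_train, y_train)` must return a function mapping one sample to a label.
    """
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k

def fit_majority(X_train, y_train):
    """Assumed base classifier: always predict the most frequent training label."""
    label = Counter(y_train).most_common(1)[0][0]
    return lambda x: label

def fit_first_pos(X_train, y_train):
    """Toy stand-in classifier: most common label for each first-word POS tag."""
    by_tag = {}
    for x, label in zip(X_train, y_train):
        by_tag.setdefault(x["first_pos"], Counter())[label] += 1
    fallback = fit_majority(X_train, y_train)

    def predict(x):
        counts = by_tag.get(x["first_pos"])
        return counts.most_common(1)[0][0] if counts else fallback(x)

    return predict

# Synthetic, perfectly separable data (invented for this example).
X = [{"first_pos": "VB"}] * 40 + [{"first_pos": "JJ"}] * 60
y = ["activity"] * 40 + ["attribute"] * 60

baseline = cross_val_accuracy(fit_majority, X, y)
toy = cross_val_accuracy(fit_first_pos, X, y)
print(f"majority baseline: {baseline:.2f}, first-POS rule: {toy:.2f}")
```

Reporting the baseline alongside the classifier matters because accuracy alone is hard to interpret when the two classes are unbalanced; the gap between the two numbers is what indicates that the features carry signal.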
  • 10. The case study: Extracting job information from vacancies Future work • Semi-supervised labelling • Finer classification • Consideration of more features
  • 11. The review: Text mining techniques in organizational research The task: Invitation for a special issue on big data in ORM Introduce the methods of text analysis to organizational scientists Review of various techniques for mining textual data The pros and cons of different approaches (best practices) Illustrations from the current project on job analysis showing how these procedures can be applied to a substantive area
  • 12. The review: Text mining techniques in organizational research The paper: Our structure so far 1. Introduction Text data in organizational research and issues that could be solved with text mining Introduce the case study on text mining in job analysis 2. Review of text mining techniques Definitions and terminology Text preprocessing 3 tasks done in text mining: classification, feature construction, and feature selection Evaluating text mining results
  • 13. The review: Text mining techniques in organizational research The paper: Our structure so far 2. Review of text mining techniques For each task a) Text mining techniques applied to perform the tasks b) Possibilities for applying Organizational frameworks c) Advantages and disadvantages of these techniques illustrated with examples from Organizational Research and other fields d) Illustration from our case study
  • 14. The review: Text mining techniques in organizational research The paper: Our structure so far 3. Discussion of opportunities and challenges of text mining in Organizational Research Opportunities such as extending the application of text mining to other problems in Organizational Research (input?) Challenges such as dealing with data size, access and protection of data, language issues etc. 4. Conclusion
  • 15. The review: Text mining techniques in organizational research The question: Feedback What problems are you dealing with right now (or have you dealt with in the past) that make use of text data? What opportunities do you see for text mining? Which part of text mining would you like to learn more about? Do you have experience in submitting a manuscript to ORM?
  • 16. References George, G., Haas, M.R., & Pentland, A. (2014). From the editors: Big Data and Management. Academy of Management Journal, 57(2), 321-326. Sackett, P.R., & Laczo, R.M. (2003). Job and Work Analysis. In Comprehensive Handbook of Psychology: Industrial and Organizational Psychology, vol. 12, ed. W.C. Borman, D.R. Ilgen, & R.J. Klimoski, pp. 21-37. New York: Wiley. Smith, D., & Ali, A. (2014). Analysing Computer Programming Job Trend Using Web Data Mining. Issues in Informing Science and Information Technology, 11, 203-214. Sodhi, M.S., & Son, B-G. (2009). Content Analysis of O.R. Job Advertisements to Infer Required Skills. Journal of the Operational Research Society, 61, 1315-1327.