Pipeline for Modeling Automated Scoring (PyData NYC 2015)
DesiLinguist
Slides from my PyData NYC 2015 talk.
Pipeline for Modeling Automated Scoring (PyData NYC 2015)
1.
A Pipeline for Modeling Automated Scoring Using Python, R and Jupyter Notebooks
Nitin Madnani, Anastassia Loukina & Lei Chen
Copyright © 2015 by Educational Testing Service. All rights reserved. ETS and the ETS logo are registered trademarks of Educational Testing Service (ETS). MEASURING THE POWER OF LEARNING is a trademark of ETS. 30141
2.
Machine Learning & Educational Assessment: A Pythonic Love Story
Nitin Madnani, Anastassia Loukina & Lei Chen
3.
Educational Testing Service
• A non-profit educational organization founded in 1947, headquartered in Princeton, New Jersey (N ≈ 3500).
• Designs and administers global as well as domestic educational assessments (GRE®, TOEFL®, PRAXIS®, etc.).
• Conducts and publishes extensive research on psychometrics, statistics, cognitive science, and computer science. [1]
• Mission: To advance quality and equity in education by providing fair and valid assessments, research and related services.
[1] http://search.ets.org/researcher/
4.
Two Parts
• Part 1: What makes educational assessment a challenging application for machine learning?
• Part 2: How does Python help us address some of these challenges at ETS?
5.
Part 1
Educational Assessment, Machine Learning
16.
Educational Assessments
• Classroom Quiz
• GRE
• MOOC Assignments
• TOEFL/IELTS
• GED
• Teacher Certification
• GMAT
• K-12 Standardized Tests
• Homework Assignment
• Practice Tests
18.
Educational Assessments, with the label “High Stakes” added
Classroom Quiz, GRE, MOOC Assignments, TOEFL/IELTS, GED, Teacher Certification, GMAT, K-12 Standardized Tests, Homework Assignment, Practice Tests
19.
“High Stakes”
• A test with results that have important, direct consequences for the test-takers.
• A test-taker would want to understand what their score means and how it maps to what they did on the test.
22.
The GRE
• Graduate Record Examination, designed and administered by ETS.
• Used by at least 3000 colleges and universities across the world for graduate school applications to MS, MBA & PhD programs. [1]
• ~575,000 test-takers from ~200 countries between July 2013 and June 2014 (50% women, 45% men). [2]
• Three sections:
  • Verbal Reasoning
  • Quantitative Reasoning
  • Analytical Writing
[1] https://www.ets.org/s/gre/pdf/gre_aidi_fellowships.pdf
[2] http://www.ets.org/s/gre/pdf/snapshot_test_taker_data_2014.pdf
24.
GRE Analytical Writing
“As people rely more and more on technology to solve problems, the ability of humans to think for themselves will surely deteriorate.”
Directions: Write a response in which you discuss the extent to which you agree or disagree with the statement and explain your reasoning for the position you take.
Score 6. Outstanding
• articulates a clear and insightful position
• develops the position fully
• well-focused, well-organized analysis
• conveys ideas fluently and precisely
• demonstrates superior facility with English
Score 1. Fundamentally Deficient
• provides little/no evidence of understanding
• disorganized or extremely brief
• severe problems with sentence structure
• pervasive errors in grammar
• incoherent and meaning not clear
…
https://www.ets.org/gre/revised_general/prepare/analytical_writing/issue/scoring_guide
26.
Scoring essays
Given the stakes, our scoring methodology must maximize:
• Accuracy: how accurately does the assigned score measure the analytical skills of the test-taker?
• Interpretability: how easily can test-takers understand why they were assigned a particular score and what that score means?
It would also be nice to minimize:
• Cost: how efficiently can we score each test (how much money can we save the test-taker in fees)?
28.
Scoring essays: Option 1
Essay + Scoring Guide → Trained Human Readers
• High Accuracy
• Medium Interpretability
• High Cost
29.
Scoring essays: Option 2
Essay + Scoring Guide → Features → Automated Scoring System (Machine Learning)
• Medium Accuracy
• (Choice of) High Interpretability
• Low Cost
30.
Scoring essays
• Essay + Scoring Guide → One Trained Human Reader → Human Score
• Essay + Scoring Guide → Features → Automated Scoring System (Machine Learning) → System Score
• Human Score + System Score → Final score
As good as using two human readers. [1]
[1] http://www.ets.org/Media/Research/pdf/RD_Connections2.pdf
32.
E-rater (“Essay Rater”)
Scoring Guide → Features → Automated Scoring System (Machine Learning)
Linear regression trained on older essays written to the same topic and scored by human readers.
Features:
• errors in grammar (e.g., subject-verb agreement)
• usage errors (incorrect prepositions/articles)
• mechanics errors (capitalization, spelling)
• errors in style (repetitious word use)
• discourse structure (presence of a thesis statement, main points)
• vocabulary sophistication
• essay organization
Reference: Automated Essay Scoring With e-rater® V.2, The Journal of Technology, Learning, and Assessment, Volume 4(3), 2006
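To make the modeling idea concrete, here is a minimal sketch of fitting a linear regression from essay features to human scores with NumPy. The feature names and every number below are invented for illustration; they are not e-rater features or data, and this is not the e-rater implementation.

```python
import numpy as np

# Hypothetical feature matrix: one row per essay. The column names are
# illustrative stand-ins, not real e-rater features, and the values are made up.
feature_names = ["grammar", "mechanics", "vocabulary"]
X = np.array([
    [0.2, 0.1, 0.9],
    [0.8, 0.7, 0.3],
    [0.5, 0.4, 0.6],
    [0.9, 0.9, 0.1],
    [0.1, 0.2, 0.8],
])
human_scores = np.array([5.0, 2.0, 4.0, 1.0, 6.0])

# Add an intercept column and solve the least-squares problem.
X1 = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(X1, human_scores, rcond=None)

intercept, weights = coef[0], coef[1:]
predictions = X1 @ coef

# Each weight says how a feature moves the predicted score, which is
# what makes a linear model comparatively easy to explain.
for name, w in zip(feature_names, weights):
    print(f"{name}: {w:+.3f}")
```

A linear model is a natural fit when interpretability matters, because each feature contributes to the score through a single inspectable weight.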
33.
E-rater & Research
Scoring Guide → Features → Automated Scoring System (Machine Learning)
• E-rater is still an active area of research at ETS.
• Design new features; examine their effect on performance and whether they overlap with existing features.
• Try more sophisticated machine learning models (is higher accuracy worth lower interpretability?).
• Last year, 10 new e-rater features were proposed just for GRE!
• GRE is one of a dozen assessments, and e-rater is one of many automated scoring engines.
• Research is untenable for a large group (>15 scientists) without a standardized pipeline.
34.
Ideal research pipeline
Need an end-to-end machine learning pipeline that can:
• Work on (almost) all platforms,
• Read features in any tabular format and clean them up,
• Efficiently apply filtering, scaling and transformations,
• Train any specified model with those features, and
• Generate a standardized, detailed report of performance on unseen essays.
36.
Part 2
Educational Assessment, Machine Learning, Python
37.
Python Pipeline
Input → Preprocess → Model → Evaluate → Report
Input → final self-contained report
41.
1. Input (pandas)
Inputs:
• Model Name (str)
• Training Features (csv/tsv/xls)
• Unseen Test Features (csv/tsv/xls)
• Feature Definitions (json)
Steps:
• Read files into data frames
• Check for missing feature columns, exclude others
• Filter out non-numeric and blank values
• Standardize essay ID and essay score column names
Outputs:
• Training Data Frame (Raw)
• Test Data Frame (Raw)
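The input step can be sketched with pandas roughly as follows. This is an illustration under stated assumptions, not the pipeline's actual code: the function name `read_features`, the JSON layout of the feature definitions, the column names, and the sample data are all hypothetical, and only CSV input is shown.

```python
import io
import json

import pandas as pd

# Hypothetical inputs: a tiny CSV and a JSON list of feature definitions.
train_csv = io.StringIO(
    "EssayID,HumanScore,grammar,mechanics,extra\n"
    "e1,4,0.5,0.2,x\n"
    "e2,3,oops,0.4,y\n"
    "e3,5,0.9,,z\n"
    "e4,2,0.1,0.8,w\n"
)
feature_json = '[{"feature": "grammar"}, {"feature": "mechanics"}]'


def read_features(handle, feature_defs, id_col, score_col):
    """Read a feature file and return a cleaned data frame (illustrative helper)."""
    df = pd.read_csv(handle)
    features = [d["feature"] for d in feature_defs]

    # Fail early if a required feature column is missing.
    missing = [f for f in features if f not in df.columns]
    if missing:
        raise KeyError(f"missing feature columns: {missing}")

    # Keep only the ID, the score and the defined features; exclude other columns.
    df = df[[id_col, score_col] + features].copy()

    # Coerce non-numeric entries to NaN, then drop rows with any NaN or blank value.
    numeric_cols = [score_col] + features
    df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")
    df = df.dropna(subset=numeric_cols)

    # Standardize the essay ID and essay score column names.
    df = df.rename(columns={id_col: "essay_id", score_col: "score"})
    return df.reset_index(drop=True)


train_df = read_features(train_csv, json.loads(feature_json), "EssayID", "HumanScore")
print(train_df)
```

Rows `e2` and `e3` are dropped here because one holds a non-numeric value and the other a blank.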
44.
2. Preprocess (numpy + pandas)
Inputs:
• Model Name (str)
• Feature Definitions (json)
• Training Data Frame (Raw)
• Test Data Frame (Raw)
Steps:
• Filter out user-flagged rows, if so specified
• Remove feature outliers & “intelligently” apply feature transformations (log, inv, sqrt, etc.), if available
• Standardize all features (center and scale)
Outputs:
• Training Data Frame (Processed)
• Test Data Frame (Processed)
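A rough sketch of the preprocessing step with numpy and pandas follows. The slides do not say how outliers are handled or how transformations are chosen, so the choices here are assumptions: outliers are clamped to the training mean ± `n_sd` standard deviations, transformations are passed in explicitly per feature, and all statistics come from the training data only. The function and column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Supported transformations, keyed by the names used on the slide.
FUNCS = {"log": np.log, "sqrt": np.sqrt, "inv": lambda x: 1.0 / x}


def preprocess(train, test, features, transforms=None, flag_col=None, n_sd=4.0):
    """Return processed copies of the train and test frames (illustrative helper)."""
    transforms = transforms or {}

    # Drop user-flagged rows when a flag column is specified.
    if flag_col is not None:
        train = train[~train[flag_col].astype(bool)]
        test = test[~test[flag_col].astype(bool)]
    train, test = train.copy(), test.copy()

    for feat in features:
        # Assumption: clamp outliers using training-set statistics only.
        mean, sd = train[feat].mean(), train[feat].std()
        lo, hi = mean - n_sd * sd, mean + n_sd * sd

        # Apply the requested transformation, or leave the feature unchanged.
        func = FUNCS.get(transforms.get(feat), lambda x: x)
        tr = func(train[feat].clip(lo, hi))
        te = func(test[feat].clip(lo, hi))

        # Center and scale both sets with the training mean and deviation.
        center, scale = tr.mean(), tr.std()
        train[feat] = (tr - center) / scale
        test[feat] = (te - center) / scale
    return train, test


raw_train = pd.DataFrame({
    "length": [100.0, 200.0, 400.0, 800.0, 5000.0],
    "errors": [1.0, 2.0, 3.0, 4.0, 9.0],
    "flagged": [0, 0, 0, 0, 1],
})
raw_test = pd.DataFrame({"length": [300.0], "errors": [2.5], "flagged": [0]})

train_p, test_p = preprocess(
    raw_train, raw_test, ["length", "errors"],
    transforms={"length": "log"}, flag_col="flagged",
)
print(train_p)
```

Reusing the training statistics for the test set keeps the two frames on the same scale and avoids leaking information from unseen essays.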
47.
3. Model: SKLL
SKLL (pronounced “skull”) provides an API and command-line utilities to make it much simpler to run common scikit-learn experiments with pre-generated features. (Presented by @dsblanch at PyData 2013 & 2014.)
https://github.com/EducationalTestingService/skll
49.
Inputs: Model Name (str), Feature Definitions (json), Training Data Frame (Processed), Test Data Frame (Processed)
3. Model
• Train a regression/classification model via the SKLL API or R
• Grid-search using a task-appropriate objective
• Serialize the model to disk (using joblib)
Tools: R + skll
Output: Serialized model
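The modeling stage could look something like the sketch below. It deliberately uses scikit-learn and joblib directly rather than the SKLL API named on the slide, so it should be read as an illustration of the same three steps (train, grid-search, serialize) and not as the pipeline's actual code. The synthetic data, the Ridge learner, the `alpha` grid, the Pearson objective, and the file name are all assumptions.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV


def pearson(y_true, y_pred):
    """Pearson correlation between human scores and predictions."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


# Synthetic stand-in for the processed training data frame.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 3))
y_train = X_train @ np.array([1.0, 0.5, -0.25]) + rng.normal(scale=0.1, size=60)

# Grid-search the regularization strength with correlation as the objective.
search = GridSearchCV(
    Ridge(),
    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
    scoring=make_scorer(pearson),
    cv=3,
)
search.fit(X_train, y_train)

# Serialize the best model to disk and load it back.
model_path = os.path.join(tempfile.mkdtemp(), "example_model.joblib")
joblib.dump(search.best_estimator_, model_path)
reloaded = joblib.load(model_path)
```

Writing the fitted model to disk lets a later stage compute predictions without retraining.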
50.
Inputs: Training Data Frame (Processed), Test Data Frame (Processed), Serialized model
51.
Inputs: Training Data Frame (Processed), Test Data Frame (Processed), Serialized model
4. Evaluate
• Use serialized model to compute test set predictions
• Trim and re-scale predictions to match training data
• Compute a set of standard evaluation metrics by comparing predictions to test set human scores
Tools: skll + pandas
52.
Inputs: Training Data Frame (Processed), Test Data Frame (Processed), Serialized model
Outputs: Test Data Predictions, Evaluation Statistics
4. Evaluate
• Use serialized model to compute test set predictions
• Trim and re-scale predictions to match training data
• Compute a set of standard evaluation metrics by comparing predictions to test set human scores
Tools: skll + pandas
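One way to read "trim and re-scale" is sketched below: predictions are linearly re-scaled so that their mean and standard deviation match the human scores in the training data, then trimmed to the observed score range. The slide does not give the exact procedure, so the order of the two steps, the function names, and the three metrics chosen here are assumptions.

```python
import numpy as np
import pandas as pd


def rescale_and_trim(preds, train_preds, train_scores):
    """Match predictions to the training score distribution, then trim them."""
    preds = np.asarray(preds, dtype=float)
    # Re-scale: map the prediction distribution onto the human score distribution.
    z = (preds - np.mean(train_preds)) / np.std(train_preds)
    scaled = z * np.std(train_scores) + np.mean(train_scores)
    # Trim: keep predictions inside the score range seen in training.
    return np.clip(scaled, np.min(train_scores), np.max(train_scores))


def evaluate(human, system):
    """Compare system predictions with human scores."""
    human = np.asarray(human, dtype=float)
    system = np.asarray(system, dtype=float)
    return pd.Series({
        "pearson_r": np.corrcoef(human, system)[0, 1],
        "rmse": np.sqrt(np.mean((human - system) ** 2)),
        "exact_agreement": np.mean(np.round(system) == human),
    })


train_scores = [1, 2, 3, 4, 5]            # human scores in the training data
train_preds = [2.0, 2.5, 3.0, 3.5, 4.0]   # raw model predictions on training data
test_human = [1, 3, 5]
test_preds = [1.0, 3.0, 5.0]              # raw model predictions on test data

final_preds = rescale_and_trim(test_preds, train_preds, train_scores)
stats = evaluate(test_human, final_preds)
```

In this toy case the raw training predictions are compressed toward the middle of the scale, so re-scaling stretches the test predictions and trimming pulls the extremes back to the 1 to 5 range.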
53.
Inputs: Test Data Predictions, Evaluation Statistics, Training Data Frame (Processed), Test Data Frame (Processed), Serialized model
54.
Inputs: Test Data Predictions, Evaluation Statistics, Training Data Frame (Processed), Test Data Frame (Processed), Serialized model
5. Report
• Determine what report sections should be included
• Merge pre-existing section templates (.ipynb files)
• Dynamically run the final .ipynb file (via ExecutePreprocessor and environment variables)
• Convert the report to HTML using HTMLExporter
Tools: jupyter + seaborn + pandas
55.
Inputs: Test Data Predictions, Evaluation Statistics, Training Data Frame (Processed), Test Data Frame (Processed), Serialized model
Outputs: Final Report (.ipynb), Final Report (html)
5. Report
• Determine what report sections should be included
• Merge pre-existing section templates (.ipynb files)
• Dynamically run the final .ipynb file (via ExecutePreprocessor and environment variables)
• Convert the report to HTML using HTMLExporter
Tools: jupyter + seaborn + pandas
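The report stage might be approximated as in the sketch below. Section notebooks are merged by concatenating their cells (an .ipynb file is plain JSON), the merged notebook is executed with nbconvert's `ExecutePreprocessor`, and the result is exported with `HTMLExporter`. The function names `merge_notebooks` and `run_and_export`, the timeout, and the way parameters are passed are assumptions; only the use of `ExecutePreprocessor`, `HTMLExporter`, and environment variables comes from the slide.

```python
import json
import os


def merge_notebooks(section_paths):
    """Concatenate the cells of several .ipynb files into one notebook dict."""
    merged = None
    for path in section_paths:
        with open(path, encoding="utf-8") as handle:
            notebook = json.load(handle)
        if merged is None:
            # Keep the metadata and format version of the first section.
            merged = notebook
        else:
            merged["cells"].extend(notebook["cells"])
    return merged


def run_and_export(notebook_dict, html_path, parameters):
    """Execute the merged notebook and write the result out as HTML."""
    # Imported lazily so that merging works without Jupyter installed.
    import nbformat
    from nbconvert import HTMLExporter
    from nbconvert.preprocessors import ExecutePreprocessor

    # Notebook cells can read run-specific settings from os.environ.
    # Environment variable values must be strings.
    os.environ.update(parameters)

    notebook = nbformat.reads(json.dumps(notebook_dict), as_version=4)
    executor = ExecutePreprocessor(timeout=600)
    executor.preprocess(notebook, {"metadata": {"path": "."}})

    body, _resources = HTMLExporter().from_notebook_node(notebook)
    with open(html_path, "w", encoding="utf-8") as handle:
        handle.write(body)
```

A call such as `run_and_export(merge_notebooks(paths), "report.html", {"EXPERIMENT_ID": "demo"})` would then produce the HTML report, where `EXPERIMENT_ID` is a hypothetical variable that the section notebooks read at run time.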
56.
Demo
57.
Summary
• Machine learning in high-stakes educational assessment requires additional number crunching to verify accuracy and interpretability.
• Need a pipeline to compare a large number of research experiments using a standardized, easy-to-read report.
• The scientific Python stack makes it super easy to implement all stages of the pipeline!
• In progress
  • Release under open-source license (2016 release)
  • A CherryPy/JS web-app to allow wider reach
58.
Questions?
https://github.com/EducationalTestingService
https://github.com/desilinguist
@haikuman