Automatic transcription of video files
Carlos Turró
Universitat Politècnica de València
Agenda
• Why automatic transcription
• State of the art: The transLectures project
• Automatic transcription of Lecture Recordings: The Opencast Project
• Notes & the near future
Why automatic transcription of video files?
• Accessibility
• Searching within a video file
• Searching across a video repository
• Topic identification
• …and much more
The transLectures project
• Development of an engine for Automated Speech Recognition (ASR) for lectures & educational content
• Development of translation tools for that content
• Implementation
• Case studies: Videolectures.NET & Polimedia (UPV video repository)
• Real-life evaluation
• Integration into Opencast
http://www.translectures.eu
transLectures partners
12 Nov 2013
No.  Name                                         Country
1    Universitat Politècnica de València (MLLP)   Spain
2    Xerox SAS                                    France
3    Institut Jožef Stefan                        Slovenia
3+   Knowledge for All Foundation                 UK
4    RWTH Aachen University                       Germany
5    EML – European Media Laboratory              Germany
6    DDS – Deluxe Digital Studios                 UK
36 months, November 2014
Statistical transcription (and translation)
[Diagram: Sound -> ASR Engine, which uses an Acoustic Model and a Language Model]
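The slide above follows the usual statistical formulation of speech recognition: among candidate word sequences, the engine picks the one with the best combined acoustic and language model score. A minimal sketch of that decision rule, assuming hypothetical log-scores for three candidate transcripts (the candidates, the scores, and the `lm_weight` parameter are illustrative and not taken from the transLectures engine):

```python
def decode(candidates, lm_weight=1.0):
    """Pick the hypothesis maximizing log P(sound | words) + lm_weight * log P(words)."""
    return max(
        candidates,
        key=lambda c: c["acoustic_logp"] + lm_weight * c["lm_logp"],
    )


# Hypothetical scores: the acoustic model slightly prefers the wrong sentence,
# and the language model corrects it.
candidates = [
    {"text": "recognize speech", "acoustic_logp": -12.0, "lm_logp": -4.0},
    {"text": "wreck a nice beach", "acoustic_logp": -11.5, "lm_logp": -9.0},
    {"text": "recognise peach", "acoustic_logp": -13.0, "lm_logp": -8.0},
]

best = decode(candidates)
print(best["text"])
```

Setting `lm_weight=0.0` ignores the language model, which in this toy data flips the choice to the acoustically closest but less plausible sentence.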
Statistical transcription (and translation)
[Diagram: Manually transcribed voice -> Modeling Engine -> Acoustic Model and Language Model]
Architecture of transLectures
[Diagram labels: Lecture; Slides; Extra content; Language Model; Result (Transcription, Translation); Intelligent interaction]
Languages
• Transcription (ASR): EN, SL, ES
• Translation (MT): EN>SL, SL>EN, EN>ES, ES>EN, EN>FR, EN>DE
Transcription and Translation Platform API
Transcription and Translation Platform
• Post-editing web interface (in HTML5)
Example video
• https://media.upv.es/?id=b444d12e-db23-9a4f-9b3b-d1d9275d4cb4
Scientific evaluations
• WER = Word Error Rate
• The lower the better
• Usually, a human transcriber has a WER of around 12%
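WER is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the system output into the reference transcript, divided by the number of words in the reference. A minimal sketch of that computation, assuming whitespace tokenization and no text normalization (real evaluations usually normalize case and punctuation first; the example sentences are made up):

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


ref = "the lecture starts at nine"
hyp = "the lecture start at nine today"
print(f"WER = {100 * wer(ref, hyp):.1f}%")  # one substitution + one insertion over 5 words
```

Because insertions count as errors, WER can exceed 100% when the output is much longer than the reference.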
Beyond transLectures
WER (%)
Language     M10    M17
Dutch        25.7   24.5
Italian      21.2   17.7
Portuguese   45.9   43.0
Spanish      15.9   14.4
Estonian     N/A    27.1
French       N/A    22.7
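To compare the two columns, it can help to look at the relative WER reduction rather than the absolute difference. A small sketch using only the figures from the table (Estonian and French are left out because they have no M10 value; the function name is illustrative):

```python
# (M10, M17) WER values copied from the table above.
wer_by_language = {
    "Dutch": (25.7, 24.5),
    "Italian": (21.2, 17.7),
    "Portuguese": (45.9, 43.0),
    "Spanish": (15.9, 14.4),
}


def relative_reduction(before, after):
    """Relative WER reduction in percent, going from `before` to `after`."""
    return 100.0 * (before - after) / before


for language, (m10, m17) in wer_by_language.items():
    print(f"{language}: {m10} -> {m17} "
          f"({relative_reduction(m10, m17):.1f}% relative reduction)")
```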
The Opencast Community is…
Universities, companies and people:
• concerned with academic video
• attracted to the Opencast values of openly exchanging ideas, experience, knowledge and code
• committed to building and maintaining a robust, flexible, high-quality open source lecture capture and academic video management solution.
Now also part of
Full-featured Lecture Recording ecosystem
Who uses Opencast?
Around the world, with strong adoption in Europe especially.
43 adopters with public information (May 2014)
30+ commercial partner clients
http://opencast.org/matterhorn-adopters
Yesterday’s tweet
Indexing in Opencast
• Opencast has built-in OCR indexing capabilities
Video (slides) -> OCR (hunspell) -> Word list filter -> Apache Lucene search server
• New operations can be added
Video (slides) -> transcription (tL) -> Apache Lucene search server
or
Video (slides) -> OCR (hunspell) -> transcription (tL) -> Word list filter -> Apache Lucene search server
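The pipelines above can be sketched in a few lines: time-coded transcript text goes through a word filter and into an index that maps each word back to the places where it was spoken. This is only an illustration under stated assumptions. The in-memory dictionary stands in for the Lucene search server, the stop-word set stands in for the word list filter, and the segment data, `filter_words` and `build_index` are hypothetical names, not Opencast or transLectures APIs.

```python
from collections import defaultdict

# Hypothetical word list filter: a tiny stop list.
STOP_WORDS = {"the", "a", "of", "and", "to", "is", "in", "on"}


def filter_words(text):
    """Lowercase, strip surrounding punctuation, and drop filtered words."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


def build_index(segments):
    """Map each kept word to the (video_id, start_seconds) positions where it occurs."""
    index = defaultdict(list)
    for video_id, start, text in segments:
        for word in filter_words(text):
            index[word].append((video_id, start))
    return index


# Made-up transcript segments: (video id, start time in seconds, text).
segments = [
    ("lecture-01", 0.0, "Welcome to the course on speech recognition."),
    ("lecture-01", 42.5, "The acoustic model is trained on transcribed speech."),
    ("lecture-02", 10.0, "Translation of the transcription is the next step."),
]

index = build_index(segments)
print(index["speech"])  # positions a player could jump to
```

Keeping the start time next to each hit is what allows searching into a video file, not only finding which video in the repository matches.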
Why do I need an indexing server?
• Powerful, Accurate and Efficient Search Algorithms
• ranked searching -- best results returned first
• many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
• fielded searching (e.g. title, author, contents)
• sorting by any field
• multiple-index searching with merged results
• allows simultaneous update and searching
• flexible faceting, highlighting, joins and result grouping
• fast, memory-efficient and typo-tolerant suggesters
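Two of the features listed above, ranked searching and fielded searching, can be illustrated with a toy scorer. The sketch below uses a simplified TF-IDF score as a stand-in for the scoring a real indexing server performs; the documents, field names and the `search` function are made up for the example and do not reflect the Lucene API.

```python
import math
from collections import Counter

# Made-up documents with two searchable fields.
docs = {
    "v1": {"title": "Speech recognition basics",
           "contents": "acoustic model and language model for speech"},
    "v2": {"title": "Machine translation",
           "contents": "statistical translation of lecture transcriptions"},
    "v3": {"title": "Lecture capture",
           "contents": "recording speech in a lecture hall"},
}


def search(query, field):
    """Return document ids ranked by a simple TF-IDF score over one field."""
    tokens = {doc_id: Counter(d[field].lower().split()) for doc_id, d in docs.items()}
    scores = {}
    for term in query.lower().split():
        df = sum(1 for counts in tokens.values() if term in counts)
        if df == 0:
            continue  # term occurs nowhere in this field
        idf = math.log(len(docs) / df) + 1.0  # rarer terms weigh more
        for doc_id, counts in tokens.items():
            if term in counts:
                scores[doc_id] = scores.get(doc_id, 0.0) + counts[term] * idf
    return sorted(scores, key=scores.get, reverse=True)  # best results first


print(search("speech model", "contents"))
print(search("lecture", "title"))
```

The same query term can match in one field and not in another, which is why a title search for "lecture" returns fewer results here than a contents search would.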
Demo on searching
• https://media.upv.es
Notes & the near future
• ASR technology is good enough for automated transcription of videos
… given good enough sound
• There are lecture recording systems that let you plug in transcriptions for searching
… like Opencast
• There are still things to solve
• Transcription speed (in good progress)
• Topic identification
• Adding more languages
Thanks!
Questions?
Learning more ….
transLectures
http://translectures.eu
Video in a multilingual context (EMMA)
http://association.media-and-learning.eu/portal/resource/ml-webinar-video-multilingual-context
Opencast State of the Project
http://lanyrd.com/2015/apereo/sdmpry/
