RSI Archive: our experience working with
Speech to Text and Semantic Analysis
Sarah-Haye Aziz and Lorenzo Vassallo
May 17, 2013
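Audio Transcription (Italian):
Come al solito, anche la recente inaugurazione dell'ultima monumentale opera di quell'eccezionale scultore che Giacomo Manzù, vale a dire la nuova porta del Duomo di Rotterdam avuto il sapore di un avvenimento straordinario di risonanza internazionale e per lavori in corso Fabio Bonetti è riuscito ad avvicinare l'insigne maestro bergamasco, a buon diritto ritenuto ormai uno dei più alti interpreti del nostro tempo, artista fra i più grandi del secolo e non solo per la misura del suo talento ma anche per il rigore morale di cui è sempre stato esempio in anni di sorta, tormentata, ispirata attività.
English translation: As usual, the recent inauguration of the latest monumental work by that exceptional sculptor Giacomo Manzù, namely the new door of the Rotterdam cathedral, had the flavour of an extraordinary event of international resonance, and for "Lavori in corso" Fabio Bonetti managed to approach the distinguished master from Bergamo, rightly regarded by now as one of the highest interpreters of our time, an artist among the greatest of the century, not only for the measure of his talent but also for the moral rigour of which he has always been an example through years of absorbed, tormented, inspired activity.
Errors: è, as, ha
Categorization
Credits: Giacomo Manzù, Fabio Bonetti
Geographic Terms: Rotterdam
Themes: arte, cultura, intrattenimento (art, culture, entertainment)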
Outline
1. Why an automatic indexing system?
2. The project timeline
3. Two paths: system and archivists' workflow overview
4. Does it work? We learned that...
5. Next steps
6. Some advice
Why an automatic indexing system?
RSI has a consolidated cataloguing system (CMM)
with a well-defined human workflow, in place since 2008.
RSI has plenty of undocumented historical material
and no capacity to document it.
Increase (plus) the documented material by adding
automation, without replacing (vs) the archivist.
Not vs but plus!
Project timeline
Archivists and Technicians Synergy
Phases: Analysis & Startup, Tuning, Deployment
Activities: Workflow Design, Language Model, TV & Radio Programmes Choice, Workflow Review, Transcription Test, System Test
Documenting material: two paths
[Diagram. Catalogue stages: Ingestion, Catalogue, Publishing.
Automated path (SIA): Audio + Key frames → Transcription Engine (Speech Transcription) → Text + Sequences → Semantic Engine (Categorization) → Text + Sequences, Credits, Themes + Geographical terms → Archivist Documentation + Refinements.
Human path: Audio and Video, Key frames → Human audio listening and transcription + Archivist documentation.]
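As a rough illustration of what the automated path hands over to the archivist, its output can be pictured as a record holding the sequences with their text plus the categorization fields named on the slide. The sketch below is hypothetical: the names `Sequence`, `IndexingResult` and `refine` are assumptions made for illustration and are not SIA or CMM structures, and the time span is a placeholder. The sample credits, themes and geographic term come from the transcription example earlier in the deck.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Sequence:
    """A logical sequence: a time span of the programme with its text."""
    start_s: float
    end_s: float
    text: str


@dataclass(frozen=True)
class IndexingResult:
    """Hypothetical record for the output of the automated path."""
    sequences: tuple[Sequence, ...]
    credits: tuple[str, ...] = ()
    themes: tuple[str, ...] = ()
    geographic_terms: tuple[str, ...] = ()
    refined_by_archivist: bool = False

    @property
    def transcript(self) -> str:
        # The full transcript is the text of all sequences, in order.
        return " ".join(s.text for s in self.sequences)


def refine(result: IndexingResult, **corrections) -> IndexingResult:
    """Return a copy with the archivist's corrections applied on top."""
    return replace(result, refined_by_archivist=True, **corrections)


# Placeholder time span; the text is shortened from the slide's example.
automatic = IndexingResult(
    sequences=(Sequence(0.0, 42.0, "Come al solito, anche la recente inaugurazione"),),
    credits=("Giacomo Manzù", "Fabio Bonetti"),
    themes=("arte", "cultura", "intrattenimento"),
    geographic_terms=("Rotterdam",),
)

# The archivist adds to the automatic result instead of replacing it.
refined = refine(automatic, themes=("arte", "cultura"))
```

Keeping the automatic record immutable and returning a refined copy mirrors the "not vs but plus" idea: the archivist's work sits on top of the automated output.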
The two paths for the archivist
[Flowchart: Start → Invoke Indexing? → (if invoked) Automated Transcription and Categorisation → Human Task on Catalogue? (Yes / No) → Publish]
Doc Level:
High Human: detailed documentation; manual creation of logical sequences
High Human with Automation: detailed documentation; automatic creation of logical sequences
Basic Human: limited set of documented metadata
Basic Human with Automation: limited set of documented metadata; automatic creation of logical sequences
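The four documentation levels named in the flowchart can be read as the outcome of two yes/no decisions: whether indexing is invoked, and whether a human task is carried out on the catalogue. The sketch below encodes that reading. It is a reconstruction from the flattened diagram, so the mapping of Yes/No branches to levels is an assumption, and `DocLevel` and `doc_level` are hypothetical names.

```python
from enum import Enum


class DocLevel(Enum):
    BASIC_HUMAN = "Basic Human"
    HIGH_HUMAN = "High Human"
    BASIC_HUMAN_WITH_AUTOMATION = "Basic Human with Automation"
    HIGH_HUMAN_WITH_AUTOMATION = "High Human with Automation"


def doc_level(invoke_indexing: bool, human_task_on_catalogue: bool) -> DocLevel:
    """Map the two flowchart decisions to a documentation level.

    Assumption: a human task on the catalogue yields a "High" level,
    and invoking indexing adds "with Automation".
    """
    if invoke_indexing:
        if human_task_on_catalogue:
            return DocLevel.HIGH_HUMAN_WITH_AUTOMATION
        return DocLevel.BASIC_HUMAN_WITH_AUTOMATION
    if human_task_on_catalogue:
        return DocLevel.HIGH_HUMAN
    return DocLevel.BASIC_HUMAN


# Enumerate all four combinations of the two decisions.
for indexing in (False, True):
    for human_task in (False, True):
        print(indexing, human_task, doc_level(indexing, human_task).value)
```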
The archivist – Francesco Veri
Does it work? Yes! But…
Differences between Radio and TV
Background music and noise hinder the transcription.
Based only on silences and
without key frames, the system
creates too many sequences.
Key frames help to locate a
change of context.
Speech rhythm and pauses are different between Radio and TV.
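One way to picture why key frames reduce over-segmentation: if every silence starts a new sequence, a programme with many pauses is cut into many pieces, whereas keeping only the silences that fall close to a key frame leaves far fewer boundaries. The sketch below is a minimal illustration of that idea and not the algorithm used by the system described here. The function name, the `tolerance_s` parameter and all timestamps are made up.

```python
from bisect import bisect_left


def boundaries_near_key_frames(silences_s, key_frames_s, tolerance_s=2.0):
    """Keep only silence times that lie within tolerance_s of a key frame."""
    key_frames = sorted(key_frames_s)
    kept = []
    for t in sorted(silences_s):
        # Only the key frames just before and just after t can be nearest.
        i = bisect_left(key_frames, t)
        neighbours = key_frames[max(i - 1, 0):i + 1]
        if any(abs(t - k) <= tolerance_s for k in neighbours):
            kept.append(t)
    return kept


# Made-up timestamps in seconds.
silences = [3.0, 9.5, 14.0, 21.0, 30.5, 36.0]
key_frames = [10.0, 31.0]

# Silences only (the radio case): every pause becomes a boundary.
radio_cuts = silences

# Silences confirmed by a nearby key frame (the TV case).
tv_cuts = boundaries_near_key_frames(silences, key_frames)

print(len(radio_cuts), "boundaries from silences alone")
print(len(tv_cuts), "boundaries once key frames are used:", tv_cuts)
```

With no key frames available, as in radio, the function returns an empty list, which is one reason a different approach is needed for each medium.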
Next steps (1) – Capitalize on Editorial Texts
[Diagram: Audio + Editorial Texts → Semantic Engine (Categorization) → Catalogue]
Next steps (2) – Capitalize on 24h Radio Logging
[Diagram: 24h Radio Logging (0 to 24) → SIA (Transcription and Semantic Engine) → Automatic Cut, Transcription & Categorization → Catalogue]
If you... some advice
Involve the Archivists
Take a different approach in Radio and TV
Choose the right TV & Radio Programmes
sarah-haye.aziz@rsi.ch
lorenzo.vassallo@rsi.ch
