audio and video fingerprinting




John Schavemaker, Werner Bailer, Peter-Jan Doets, Jaap Blom
the technology in brief:

    duplicate detection (video fingerprinting)
        • does a video exist in our databases?


    categorization
        • what category of video is it? News, sports, film?

    object and logo recognition
        • does an object or logo (image) exist in our databases?

    See also our online state-of-the-art report:

    http://research.imagesforthefuture.org/index.php/video-fingerprinting-state-of-the-art-report/



duplicate detection

    QUESTION: does a video exist in our databases?

    video fingerprints take into account
     changes in:
       • resolution
       • codec
       • noise
       • color




SWOT video fingerprinting


    STRENGTHS
    • mature technology
    • very good performance on produced material
    • many commercial packages available on the market

    WEAKNESSES
    • many competing vendors: which software package to choose?
    • suitability for video material that is not produced?

    OPPORTUNITIES
    • larger video databases
    • non-produced material
    • open-standard video fingerprints
    • combination with audio

    THREATS
    • closed video fingerprint standards
    • video encryption
    • clever “users”



video categorization

    QUESTION: What category of video is it?
     Close-up of a face, indoor sports, outdoor sports?

    images: UvA, http://www.science.uva.nl/research/mediamill/


SWOT video categorization

     STRENGTHS
     • very promising technology
     • generic recognition possible
     • complements duplicate and object recognition
     • bridges the ‘semantic gap’

     WEAKNESSES
     • immature technology
     • performance (strongly) dependent on the training examples used
     • training the system for new categories takes relatively long

     OPPORTUNITIES
     • combination of categories
     • faster and better learning
     • automatic annotation

     THREATS
     • variety too large for a category
     • choice of categories
     • dependent on annotation of training examples




object and logo recognition


    QUESTION: does an object or logo exist in our databases?

    picture from http://www.omniperception.com/




SWOT object and logo recognition

     STRENGTHS
     • good, robust performance
     • commercial packages
     • fast learning and recognition
     • revolution in computer vision

     WEAKNESSES
     • only 2D objects (logos)
     • true duplicate recognition
     • computationally intensive

     OPPORTUNITIES
     • larger video databases
     • open standard
     • 3D object recognition

     THREATS
     • pre-processing of all material required
     • patents




video fingerprinting




Use of FP: identification

 Training phase
     Labeled multimedia items -> audio/visual signal -> fingerprint extraction -> fingerprints and metadata
     Labeled multimedia items -> metadata -> fingerprints and metadata

 Identification phase
     Unlabeled multimedia items -> audio/visual signal -> fingerprint extraction -> match -> which item? (metadata)
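
The two phases above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: fingerprints are taken to be lists of 32-bit integers that some extractor has already produced, and the class name `FingerprintDB`, its methods, and the `max_ber` threshold are hypothetical, not part of any consortium tool.

```python
from typing import Dict, List, Optional, Tuple

Fingerprint = List[int]  # assumed: a time series of 32-bit sub-fingerprints


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two sub-fingerprints."""
    return bin(a ^ b).count("1")


class FingerprintDB:
    """Hypothetical in-memory store of fingerprints plus metadata."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[Fingerprint, dict]] = {}

    def enroll(self, item_id: str, fingerprint: Fingerprint, metadata: dict) -> None:
        # Training phase: store the fingerprint of a labeled item with its metadata.
        self._items[item_id] = (fingerprint, metadata)

    def identify(
        self, query: Fingerprint, max_ber: float = 0.35
    ) -> Optional[Tuple[str, dict, float]]:
        # Identification phase: slide the query over every reference and keep
        # the alignment with the lowest bit error rate (BER).
        # max_ber is an illustrative threshold, not a tuned value.
        if not query:
            return None
        best: Optional[Tuple[str, dict, float]] = None
        for item_id, (ref, meta) in self._items.items():
            for offset in range(len(ref) - len(query) + 1):
                window = ref[offset:offset + len(query)]
                errors = sum(hamming(q, r) for q, r in zip(query, window))
                ber = errors / (32 * len(query))
                if ber <= max_ber and (best is None or ber < best[2]):
                    best = (item_id, meta, ber)
        return best
```

The exhaustive scan is only practical for small collections; a real system would need an index to reach the database sizes discussed earlier.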


Sound & Vision Pilot
     • Observations
       • Problem harder than expected
       • Transformations
         • Crop & scale
         • Brightness/contrast
         • Logos, captions
       • picture-in-picture (PIP) is very difficult
       • many matching sequences of black frames
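
One way to keep black frames from matching each other is to exclude them before fingerprinting. The following is a minimal sketch, assuming each frame is available as rows of 8-bit luminance values; the function names and both thresholds are hypothetical and were not used in the pilot.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # assumed layout: rows of 8-bit luminance values


def is_black_frame(frame: Frame, max_mean: float = 16.0, max_spread: int = 32) -> bool:
    """Return True when the frame is uniformly dark.

    Both thresholds are illustrative guesses, not values from the pilot.
    """
    pixels = [p for row in frame for p in row]
    if not pixels:
        raise ValueError("frame has no pixels")
    mean = sum(pixels) / len(pixels)
    spread = max(pixels) - min(pixels)
    return mean <= max_mean and spread <= max_spread


def content_frame_indices(frames: Sequence[Frame]) -> List[int]:
    """Indices of the frames worth fingerprinting, i.e. the non-black ones."""
    return [i for i, f in enumerate(frames) if not is_black_frame(f)]
```

The spread check keeps dark frames that still carry a bright caption or logo, since those may be distinctive.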


Sound & Vision Pilot – results ZiuZ

     • TNO has used the ZiuZ video fingerprinting tool on the dataset
     • ZiuZ video fingerprinting is optimized for child-abuse material:
         • short clips
         • low resolution
         • low image quality
     • Preliminary results on the Sound & Vision dataset show
         • material is very challenging
         • some but limited recall performance
         • application domain differs
         • queries containing multiple clips of reference material were
           not enabled by this version of the tool




Sound & Vision Pilot – Results JRS
     • Recall: 36% (min. 16%, max. 55%)
     • Precision: difficult to determine, many black
       sequences matching, needs manual checking




Sound & Vision Pilot - Results
     • Transformations our system handles




Sound & Vision Pilot - Results
     • False positives




Experiments with SIFT (1)
     • we do not have a SIFT-based fingerprinting
       solution in the consortium
     • JRS has a SIFT-based interactive tool to locate
       recurring objects in video
     • created a video from episode + source clips and
       performed analysis and search




Experiments with SIFT (2)




Experiments with SIFT (3)




Experiments with SIFT (4)
     • Conclusion
       • SIFT can handle cases of scaling and cropping
         reliably
       • even PIP with distortions
       • Scalability issues
         • time for extraction and esp. matching
         • not sure if ranking of matches is still reliable on
           huge datasets




Characteristics of the data set - audio

     • Not all archive fragments contain audio
     • Often the original audio is used – just cut-and-paste, no serious
       distortions
     • Sometimes the audio is replaced or combined with a voice over
     • The time segmentation of the audio in the episode differs from
       that of the video. The audio is not always used with the
       corresponding video fragments; the example on the next slide
       illustrates this. The reverse, and other variations, also occur.




Characteristics of the data set – audio example

     [Figure: video and audio time lines of one archive video, compared with
     the video and audio time lines of one Andere Tijden episode.
     Caption: continuous audio fragment, with several shorter video fragments]


Characteristics of the data set - audio

     • Limitations of the use of audio
         • the reference material must contain audio
         • the audio track might not originate from the same material as
           the video track; this is dependent on the video material used.
         • the playout speed must not be changed too much (less than
           +/- 2%)

     • Advantages of the use of audio
        • Highly robust algorithms
        • Usually audio is undistorted; video is cropped, scaled, etc.
        • Audio usually is used continuously, while video fragments are
          cut-and-paste from different sections of the reference video,
          and ‘glued together’.



Identification results - audio

     • Only checked if the correct archive file name is returned
         Episode                           Correct   Missed   False positives
         Liggadjati                              8        3                 0
         Veertig jaar STER-reclame              10        4                 1
         75 jaar afsluitdijk                     0        5                 2
         Strijd tegen de file                    9        1                 6
         Kronkels van de Maas                    1        9                 1
         Op zoek naar Nederland                  2        6                 1
         Modderen in de polder: Lelystad         3        1                 2
         Burgemeesters in oorlogstijd            6       10                 0
         De wording van Paars                    8        1                 0
         Pim en zijn volk                        7        3                 0
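
The counts in the table can be turned into recall and precision figures. The sketch below is only arithmetic on the table, under the assumption that "Correct" counts true positives, "Missed" counts false negatives, and each false positive is one wrongly returned archive file; the derived percentages are not stated on the slides.

```python
from typing import Dict, Tuple

# (correct, missed, false positives) per episode, copied from the table above.
RESULTS: Dict[str, Tuple[int, int, int]] = {
    "Liggadjati": (8, 3, 0),
    "Veertig jaar STER-reclame": (10, 4, 1),
    "75 jaar afsluitdijk": (0, 5, 2),
    "Strijd tegen de file": (9, 1, 6),
    "Kronkels van de Maas": (1, 9, 1),
    "Op zoek naar Nederland": (2, 6, 1),
    "Modderen in de polder: Lelystad": (3, 1, 2),
    "Burgemeesters in oorlogstijd": (6, 10, 0),
    "De wording van Paars": (8, 1, 0),
    "Pim en zijn volk": (7, 3, 0),
}


def recall(correct: int, missed: int) -> float:
    """Fraction of the fragments to be found that were actually found."""
    total = correct + missed
    return correct / total if total else 0.0


def precision(correct: int, false_pos: int) -> float:
    """Fraction of the returned results that were correct."""
    total = correct + false_pos
    return correct / total if total else 0.0


def totals(results: Dict[str, Tuple[int, int, int]]) -> Tuple[int, int, int]:
    """Sum each column over all episodes."""
    correct = sum(r[0] for r in results.values())
    missed = sum(r[1] for r in results.values())
    false_pos = sum(r[2] for r in results.values())
    return correct, missed, false_pos


if __name__ == "__main__":
    c, m, fp = totals(RESULTS)
    print(f"overall recall    {recall(c, m):.3f}")
    print(f"overall precision {precision(c, fp):.3f}")
    for name, (ec, em, efp) in RESULTS.items():
        print(f"{name:34s} recall {recall(ec, em):.2f}  precision {precision(ec, efp):.2f}")
```

Summed over all ten episodes this gives 54 correct, 43 missed and 13 false positives, so the per-episode spread is wide and the averages should be read with that in mind.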


     (slide annotation: silent parts in the video)
Fingerprinting – audio algorithm

     • Algorithm well-known from literature:
         • Haitsma, Kalker, “A Highly Robust Audio Fingerprinting
           System”, in Proceedings of the 3rd International Conference
           on Music Information Retrieval (ISMIR), October 2002.
     • Features: energy in 33 audio frequency bands
     • Every 11.6 ms a 32-bit sub-fingerprint is computed, consisting of
       coarsely quantized differences between these energy samples
     • Fingerprint consists of a time series of sub-fingerprints
     • The implementation returns the best matching fragments only
       (settings to return no false positives)
     • Algorithm is highly robust, and highly discriminative
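
The bit derivation described above can be sketched as follows. Each bit is the sign of the energy difference between two neighbouring bands, differenced again against the previous frame, as in the cited paper. The sketch assumes the 33 band energies per frame are already computed (windowing, FFT and band grouping are left out), and the function names are hypothetical; it is not the consortium implementation.

```python
from typing import List, Sequence

BANDS = 33  # 33 band energies per frame yield 32 bits


def sub_fingerprints(energies: Sequence[Sequence[float]]) -> List[int]:
    """Turn per-frame band energies into 32-bit sub-fingerprints.

    energies[n][m] is the energy of band m in frame n. Bit m of frame n is 1
    when (E[n][m] - E[n][m+1]) - (E[n-1][m] - E[n-1][m+1]) > 0, else 0.
    The first frame has no predecessor, so it produces no sub-fingerprint.
    """
    for row in energies:
        if len(row) != BANDS:
            raise ValueError(f"expected {BANDS} band energies per frame")
    result: List[int] = []
    for n in range(1, len(energies)):
        cur, prev = energies[n], energies[n - 1]
        bits = 0
        for m in range(BANDS - 1):
            diff = (cur[m] - cur[m + 1]) - (prev[m] - prev[m + 1])
            bits = (bits << 1) | (1 if diff > 0 else 0)
        result.append(bits)
    return result


def bit_error_rate(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of differing bits between two equally long fingerprints."""
    if len(a) != len(b) or not a:
        raise ValueError("fingerprints must be non-empty and of equal length")
    errors = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return errors / (32 * len(a))
```

Two fragments are then compared through the bit error rate of their sub-fingerprint sequences: a low rate suggests the same audio, while unrelated audio is expected to sit near 0.5.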




Future improvements on current results

     • Trailing parts contain silence and black frames (no content). The
       silences give rise to false positives and irrelevant detections. A
       silence/activity detector is needed to exclude these parts.
     • Our current implementation from literature allows for only one
       fragment per reference file to be returned.
     • Our current implementation has only coarse time localization.
     • Combination of audio and video fingerprinting
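
A silence/activity detector of the kind mentioned in the first point could start from a simple energy threshold. This is a minimal sketch, assuming mono samples normalised to the range -1.0 to 1.0; the function name, window size and threshold are hypothetical and would need tuning on the archive material.

```python
import math
from typing import List, Sequence, Tuple


def active_segments(
    samples: Sequence[float], window: int, threshold: float = 0.01
) -> List[Tuple[int, int]]:
    """Return (start, end) sample ranges whose RMS level reaches the threshold.

    The signal is cut into non-overlapping windows; consecutive active windows
    are merged. Trailing samples that do not fill a window are ignored.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    segments: List[Tuple[int, int]] = []
    start = None
    n_windows = len(samples) // window
    for w in range(n_windows):
        chunk = samples[w * window:(w + 1) * window]
        rms = math.sqrt(sum(x * x for x in chunk) / window)
        if rms >= threshold:
            if start is None:
                start = w * window  # a new active segment begins here
        elif start is not None:
            segments.append((start, w * window))
            start = None
    if start is not None:
        segments.append((start, n_windows * window))
    return segments
```

Fingerprints would then be extracted and matched only inside the returned ranges, so silent trailing parts cannot produce detections.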




Consortium

     http://instituut.beeldengeluid.nl/

     http://www.joanneum.at/en/digital.html

     http://www.ziuz.com

     http://hs-art.com/

     http://www.tno.nl




26   audio and video fingerprinting

Contenu connexe

Tendances

Arka Solutions final NSF I-Corps presentation
Arka Solutions final NSF I-Corps presentationArka Solutions final NSF I-Corps presentation
Arka Solutions final NSF I-Corps presentationStanford University
 
NMiEF 2011 - Toy or Tool: Using Mobile Devices & QR (Quick Response) Codes i...
NMiEF 2011 -  Toy or Tool: Using Mobile Devices & QR (Quick Response) Codes i...NMiEF 2011 -  Toy or Tool: Using Mobile Devices & QR (Quick Response) Codes i...
NMiEF 2011 - Toy or Tool: Using Mobile Devices & QR (Quick Response) Codes i...mediaplaylab
 
Interpretation of Patent Search Results and Patent Claims
Interpretation of Patent Search Results and Patent ClaimsInterpretation of Patent Search Results and Patent Claims
Interpretation of Patent Search Results and Patent ClaimsCaezar Angelito E Arceo
 
Ttg Ca Border Sec Mjd22jun07
Ttg Ca Border Sec Mjd22jun07Ttg Ca Border Sec Mjd22jun07
Ttg Ca Border Sec Mjd22jun07martindudziak
 
Silver Needle in the Skype
Silver Needle in the SkypeSilver Needle in the Skype
Silver Needle in the SkypeDug Song
 
Silver needle in Skype
Silver needle in SkypeSilver needle in Skype
Silver needle in SkypeViet Nt
 
Quoc Le, Stanford & Google - Tera Scale Deep Learning
Quoc Le, Stanford & Google - Tera Scale Deep LearningQuoc Le, Stanford & Google - Tera Scale Deep Learning
Quoc Le, Stanford & Google - Tera Scale Deep LearningKun Le
 
Nordic IPR Forum 2012 Agenda
Nordic IPR Forum 2012 AgendaNordic IPR Forum 2012 Agenda
Nordic IPR Forum 2012 AgendaJj HanXue
 
A novel preservation watch system
A novel preservation watch systemA novel preservation watch system
A novel preservation watch systemLuis Faria
 
Nick Milton: The Business Value of Knowledge Management. VidenDanmark. 30 maj...
Nick Milton: The Business Value of Knowledge Management. VidenDanmark. 30 maj...Nick Milton: The Business Value of Knowledge Management. VidenDanmark. 30 maj...
Nick Milton: The Business Value of Knowledge Management. VidenDanmark. 30 maj...VidenDanmark
 
Giora Kornblau - Entrepreneurship: Creating New Reality
Giora Kornblau - Entrepreneurship: Creating New RealityGiora Kornblau - Entrepreneurship: Creating New Reality
Giora Kornblau - Entrepreneurship: Creating New RealityMIT Forum of Israel
 
41631 lecture 2 pt2 patents
41631 lecture 2 pt2   patents41631 lecture 2 pt2   patents
41631 lecture 2 pt2 patentsTom Howard
 

Tendances (17)

2019 Q1_TVT Product Portfolio
2019 Q1_TVT Product Portfolio2019 Q1_TVT Product Portfolio
2019 Q1_TVT Product Portfolio
 
Arka Solutions final NSF I-Corps presentation
Arka Solutions final NSF I-Corps presentationArka Solutions final NSF I-Corps presentation
Arka Solutions final NSF I-Corps presentation
 
NMiEF 2011 - Toy or Tool: Using Mobile Devices & QR (Quick Response) Codes i...
NMiEF 2011 -  Toy or Tool: Using Mobile Devices & QR (Quick Response) Codes i...NMiEF 2011 -  Toy or Tool: Using Mobile Devices & QR (Quick Response) Codes i...
NMiEF 2011 - Toy or Tool: Using Mobile Devices & QR (Quick Response) Codes i...
 
Interpretation of Patent Search Results and Patent Claims
Interpretation of Patent Search Results and Patent ClaimsInterpretation of Patent Search Results and Patent Claims
Interpretation of Patent Search Results and Patent Claims
 
Ttg Ca Border Sec Mjd22jun07
Ttg Ca Border Sec Mjd22jun07Ttg Ca Border Sec Mjd22jun07
Ttg Ca Border Sec Mjd22jun07
 
Cheng
ChengCheng
Cheng
 
Ccna PrepCenter - IP Subnetting from Networkers
Ccna PrepCenter - IP Subnetting from NetworkersCcna PrepCenter - IP Subnetting from Networkers
Ccna PrepCenter - IP Subnetting from Networkers
 
Silver Needle in the Skype
Silver Needle in the SkypeSilver Needle in the Skype
Silver Needle in the Skype
 
Silver needle in Skype
Silver needle in SkypeSilver needle in Skype
Silver needle in Skype
 
Power of Patents, November 16, 2011
Power of Patents, November 16, 2011Power of Patents, November 16, 2011
Power of Patents, November 16, 2011
 
Quoc Le, Stanford & Google - Tera Scale Deep Learning
Quoc Le, Stanford & Google - Tera Scale Deep LearningQuoc Le, Stanford & Google - Tera Scale Deep Learning
Quoc Le, Stanford & Google - Tera Scale Deep Learning
 
Nordic IPR Forum 2012 Agenda
Nordic IPR Forum 2012 AgendaNordic IPR Forum 2012 Agenda
Nordic IPR Forum 2012 Agenda
 
20120402 prd - generic presentation
20120402   prd - generic presentation20120402   prd - generic presentation
20120402 prd - generic presentation
 
A novel preservation watch system
A novel preservation watch systemA novel preservation watch system
A novel preservation watch system
 
Nick Milton: The Business Value of Knowledge Management. VidenDanmark. 30 maj...
Nick Milton: The Business Value of Knowledge Management. VidenDanmark. 30 maj...Nick Milton: The Business Value of Knowledge Management. VidenDanmark. 30 maj...
Nick Milton: The Business Value of Knowledge Management. VidenDanmark. 30 maj...
 
Giora Kornblau - Entrepreneurship: Creating New Reality
Giora Kornblau - Entrepreneurship: Creating New RealityGiora Kornblau - Entrepreneurship: Creating New Reality
Giora Kornblau - Entrepreneurship: Creating New Reality
 
41631 lecture 2 pt2 patents
41631 lecture 2 pt2   patents41631 lecture 2 pt2   patents
41631 lecture 2 pt2 patents
 

En vedette

Audio Fingerprinting Introduction
Audio Fingerprinting IntroductionAudio Fingerprinting Introduction
Audio Fingerprinting IntroductionVikesh Khanna
 
DNA finger printing
DNA finger printing DNA finger printing
DNA finger printing Ivan Kato
 
DNA Fingerprinting
DNA FingerprintingDNA Fingerprinting
DNA FingerprintingDisha Bedi
 
DNA fingerprinting
DNA fingerprintingDNA fingerprinting
DNA fingerprintingsantharooban
 
DNA FINGERPRINTING
DNA FINGERPRINTINGDNA FINGERPRINTING
DNA FINGERPRINTINGParth Shah
 
Dna Fingerprinting And Forensic Applications
Dna Fingerprinting And Forensic ApplicationsDna Fingerprinting And Forensic Applications
Dna Fingerprinting And Forensic Applicationsdheva B
 
Dna fingerprinting powerpoint 1
Dna fingerprinting powerpoint 1Dna fingerprinting powerpoint 1
Dna fingerprinting powerpoint 1Usman Abdullah
 

En vedette (11)

Audio Fingerprinting Introduction
Audio Fingerprinting IntroductionAudio Fingerprinting Introduction
Audio Fingerprinting Introduction
 
DNA finger printing
DNA finger printing DNA finger printing
DNA finger printing
 
Dna Fingerprinting
Dna FingerprintingDna Fingerprinting
Dna Fingerprinting
 
DNA Fingerprinting
DNA FingerprintingDNA Fingerprinting
DNA Fingerprinting
 
DNA fingerprinting
DNA fingerprintingDNA fingerprinting
DNA fingerprinting
 
DNA FINGERPRINTING
DNA FINGERPRINTINGDNA FINGERPRINTING
DNA FINGERPRINTING
 
Dna finger printing
Dna finger printingDna finger printing
Dna finger printing
 
Dna Fingerprinting And Forensic Applications
Dna Fingerprinting And Forensic ApplicationsDna Fingerprinting And Forensic Applications
Dna Fingerprinting And Forensic Applications
 
Dna fingerprinting
Dna fingerprintingDna fingerprinting
Dna fingerprinting
 
Dna fingerprinting
Dna fingerprintingDna fingerprinting
Dna fingerprinting
 
Dna fingerprinting powerpoint 1
Dna fingerprinting powerpoint 1Dna fingerprinting powerpoint 1
Dna fingerprinting powerpoint 1
 

Similaire à Vdfp audio and video fingerprinting

Separate Pasts, Common Futures: Digital film preservation in a broadcast en...
Separate Pasts,  Common Futures: Digital film preservation in a  broadcast en...Separate Pasts,  Common Futures: Digital film preservation in a  broadcast en...
Separate Pasts, Common Futures: Digital film preservation in a broadcast en...Erwin Verbruggen
 
HD Voice: The Hurdles and how to overcome the codec war
HD Voice: The Hurdles and how to overcome the codec warHD Voice: The Hurdles and how to overcome the codec war
HD Voice: The Hurdles and how to overcome the codec warJohn Gallagher
 
HD Voice, telecom operators
HD Voice, telecom operatorsHD Voice, telecom operators
HD Voice, telecom operatorsJohn Gallagher
 
FutureComm 2010: Video Quality Analysis and Measurement
FutureComm 2010: Video Quality Analysis and MeasurementFutureComm 2010: Video Quality Analysis and Measurement
FutureComm 2010: Video Quality Analysis and MeasurementRADVISION Ltd.
 
EBU's report on DVB and VR
EBU's report on DVB and VREBU's report on DVB and VR
EBU's report on DVB and VRITU
 
Stefan slivinski lifesize video coding
Stefan slivinski lifesize video coding Stefan slivinski lifesize video coding
Stefan slivinski lifesize video coding IMTC
 
VOGIN-IP-lezing-Zeno_ geradts
VOGIN-IP-lezing-Zeno_ geradtsVOGIN-IP-lezing-Zeno_ geradts
VOGIN-IP-lezing-Zeno_ geradtsvoginip
 
Mobixell pipeline webinar_june_20_2012
Mobixell pipeline webinar_june_20_2012Mobixell pipeline webinar_june_20_2012
Mobixell pipeline webinar_june_20_2012Mobixell
 
Re-using Media on the Web tutorial: Media Fragment Creation and Annotation
Re-using Media on the Web tutorial: Media Fragment Creation and AnnotationRe-using Media on the Web tutorial: Media Fragment Creation and Annotation
Re-using Media on the Web tutorial: Media Fragment Creation and AnnotationMediaMixerCommunity
 
The User at the Wheel of the Online Video Search Engine
The User at the Wheel of the Online Video Search EngineThe User at the Wheel of the Online Video Search Engine
The User at the Wheel of the Online Video Search Engineckofler
 
The Mind-Boggling Challege of Long-Term Digital Preservation
The Mind-Boggling Challege of Long-Term Digital PreservationThe Mind-Boggling Challege of Long-Term Digital Preservation
The Mind-Boggling Challege of Long-Term Digital PreservationGordon Hoke
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004Aniruddh Tyagi
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004aniruddh Tyagi
 

Similaire à Vdfp audio and video fingerprinting (20)

Separate Pasts, Common Futures: Digital film preservation in a broadcast en...
Separate Pasts,  Common Futures: Digital film preservation in a  broadcast en...Separate Pasts,  Common Futures: Digital film preservation in a  broadcast en...
Separate Pasts, Common Futures: Digital film preservation in a broadcast en...
 
Hv2615441548
Hv2615441548Hv2615441548
Hv2615441548
 
What’s new in MPEG?
What’s new in MPEG?What’s new in MPEG?
What’s new in MPEG?
 
HD Voice: The Hurdles and how to overcome the codec war
HD Voice: The Hurdles and how to overcome the codec warHD Voice: The Hurdles and how to overcome the codec war
HD Voice: The Hurdles and how to overcome the codec war
 
HD Voice, telecom operators
HD Voice, telecom operatorsHD Voice, telecom operators
HD Voice, telecom operators
 
FutureComm 2010: Video Quality Analysis and Measurement
FutureComm 2010: Video Quality Analysis and MeasurementFutureComm 2010: Video Quality Analysis and Measurement
FutureComm 2010: Video Quality Analysis and Measurement
 
Computer graphic lecturer no 3
Computer graphic lecturer no 3Computer graphic lecturer no 3
Computer graphic lecturer no 3
 
EBU's report on DVB and VR
EBU's report on DVB and VREBU's report on DVB and VR
EBU's report on DVB and VR
 
Stefan slivinski lifesize video coding
Stefan slivinski lifesize video coding Stefan slivinski lifesize video coding
Stefan slivinski lifesize video coding
 
VOGIN-IP-lezing-Zeno_ geradts
VOGIN-IP-lezing-Zeno_ geradtsVOGIN-IP-lezing-Zeno_ geradts
VOGIN-IP-lezing-Zeno_ geradts
 
Worksheet 1
Worksheet 1Worksheet 1
Worksheet 1
 
Video editing SP
Video editing SPVideo editing SP
Video editing SP
 
Mobixell pipeline webinar_june_20_2012
Mobixell pipeline webinar_june_20_2012Mobixell pipeline webinar_june_20_2012
Mobixell pipeline webinar_june_20_2012
 
Re-using Media on the Web tutorial: Media Fragment Creation and Annotation
Re-using Media on the Web tutorial: Media Fragment Creation and AnnotationRe-using Media on the Web tutorial: Media Fragment Creation and Annotation
Re-using Media on the Web tutorial: Media Fragment Creation and Annotation
 
The User at the Wheel of the Online Video Search Engine
The User at the Wheel of the Online Video Search EngineThe User at the Wheel of the Online Video Search Engine
The User at the Wheel of the Online Video Search Engine
 
Video enc basic_p_pt_type
Video enc basic_p_pt_typeVideo enc basic_p_pt_type
Video enc basic_p_pt_type
 
The Mind-Boggling Challege of Long-Term Digital Preservation
The Mind-Boggling Challege of Long-Term Digital PreservationThe Mind-Boggling Challege of Long-Term Digital Preservation
The Mind-Boggling Challege of Long-Term Digital Preservation
 
L51 w
L51 wL51 w
L51 w
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004
 

Dernier

Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...anjaliyadav012327
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...Pooja Nehwal
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 

Dernier (20)

Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
JAPAN: ORGANISATION OF PMDA, PHARMACEUTICAL LAWS & REGULATIONS, TYPES OF REGI...
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 

Vdfp audio and video fingerprinting

  • 1. audio and video fingerprinting John Schavemaker, Werner Bailer, Peter-Jan Doets, Jaap Blom
  • 2. The technology in brief
      Duplicate detection (video fingerprinting): does a video exist in our databases?
      Categorization: what category of video is it? News, sports, film?
      Object and logo recognition: does an object or logo (image) exist in our databases?
      See also our online state-of-the-art report: http://research.imagesforthefuture.org/index.php/video-fingerprinting-state-of-the-art-report/
  • 3. Duplicate detection
      QUESTION: does a video exist in our databases?
      Video fingerprints account for changes in resolution, codec, noise, and color.
  • 4. SWOT: video fingerprinting
      Strengths: mature technology; very good performance on produced material; many commercial packages available on the market.
      Weaknesses: many competing vendors (which software package to choose?); unclear suitability for video material that has not been produced.
      Opportunities: larger video databases; unproduced material; open-standard video fingerprints; combination with audio.
      Threats: closed video fingerprint standards; video encryption; clever "users".
  • 5. Video categorization
      QUESTION: what category of video is it? Close-up of a face, indoor sports, outdoor sports?
      Images: UvA, http://www.science.uva.nl/research/mediamill/
  • 6. SWOT: video categorization
      Strengths: very promising technique; generic recognition possible; complements duplicate detection and object recognition; bridges the 'semantic gap'.
      Weaknesses: immature technique; performance depends (strongly) on the training examples used; training the system for new categories takes relatively long.
      Opportunities: combining categories; faster and better learning; automatic annotation.
      Threats: variety within a category too large; choice of categories; dependence on the annotation of training examples.
  • 7. Object and logo recognition
      QUESTION: does an object or logo exist in our databases?
      Picture from http://www.omniperception.com/
  • 8. SWOT: object and logo recognition
      Strengths: good, robust performance; commercial packages; fast learning and recognition; revolution in computer vision.
      Weaknesses: only 2D objects (logos); true duplicate recognition; computationally intensive.
      Opportunities: larger video databases; open standard; 3D object recognition.
      Threats: pre-processing of all material necessary; patents.
  • 9. Video fingerprinting
  • 10. Use of fingerprints (FP): identification
      [Diagram] Training phase: labeled multimedia items → audio/visual signal → fingerprint extraction → database of fingerprints and metadata.
      [Diagram] Identification phase: unlabeled multimedia items → audio/visual signal → fingerprint extraction → match against the database → which item? (metadata)
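The two phases on slide 10 can be sketched as a small in-memory index. This is only an illustration under stated assumptions, not the tool used in the pilot: the class `FingerprintIndex`, the helper `bit_error_rate`, the representation of a fingerprint as a list of 32-bit integers, and the 0.35 acceptance threshold are all hypothetical choices made for this example.

```python
from typing import List, Optional, Tuple


def bit_error_rate(a: List[int], b: List[int]) -> float:
    """Fraction of differing bits between two equal-length lists of 32-bit words."""
    if len(a) != len(b) or not a:
        raise ValueError("fingerprints must be non-empty and of equal length")
    errors = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return errors / (32 * len(a))


class FingerprintIndex:
    """Toy database mapping fingerprints to metadata (hypothetical design)."""

    def __init__(self, max_ber: float = 0.35):
        # Assumed acceptance threshold; a real system would tune this value.
        self.max_ber = max_ber
        self.items: List[Tuple[List[int], str]] = []

    def add(self, fingerprint: List[int], metadata: str) -> None:
        """Training phase: store the fingerprint of a labeled item."""
        self.items.append((list(fingerprint), metadata))

    def identify(self, query: List[int]) -> Optional[str]:
        """Identification phase: return metadata of the closest item, or None."""
        best, best_ber = None, self.max_ber
        for fingerprint, metadata in self.items:
            if len(fingerprint) != len(query):
                continue  # this sketch only compares equal-length fingerprints
            ber = bit_error_rate(fingerprint, query)
            if ber <= best_ber:
                best, best_ber = metadata, ber
        return best


index = FingerprintIndex()
index.add([0x00000000, 0xFFFFFFFF], "archive_a")
index.add([0x0F0F0F0F, 0xF0F0F0F0], "archive_b")
```

A linear scan is used here for clarity; it would not scale to the larger video databases mentioned in the SWOT slides, where some form of indexed lookup would be needed.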
  • 11. Sound & Vision Pilot: observations
      Problem harder than expected.
      Transformations: crop and scale; brightness/contrast; logos and captions; picture-in-picture (PIP), which is very difficult.
      Many matching sequences of black frames.
  • 12. Sound & Vision Pilot: results ZiuZ
      TNO has used the ZiuZ video fingerprinting tool on the dataset.
      ZiuZ video fingerprinting is optimized for child-abuse material: short clips, low resolution, low image quality.
      Preliminary results on the Sound & Vision dataset show that the material is very challenging, that recall is limited, and that the application domain differs.
      Queries containing multiple clips of reference material were not enabled by this version of the tool.
  • 13. Sound & Vision Pilot: results JRS
      Recall: 36% (min. 16%, max. 55%).
      Precision: difficult to determine because many black sequences match; needs manual checking.
  • 14. Sound & Vision Pilot: results. Transformations our system handles.
  • 15. Sound & Vision Pilot: results. False positives.
  • 16. Experiments with SIFT (1)
      We do not have a SIFT-based fingerprinting solution in the consortium.
      JRS has a SIFT-based interactive tool to locate recurring objects in video.
      We created a video from an episode plus its source clips and performed analysis and search.
  • 17. Experiments with SIFT (2)
  • 18. Experiments with SIFT (3)
  • 19. Experiments with SIFT (4): conclusion
      SIFT can handle cases of scaling and cropping reliably, even PIP with distortions.
      Scalability issues: time for extraction and especially matching; not sure whether the ranking of matches is still reliable on huge datasets.
  • 20. Characteristics of the data set: audio
      Not all archive fragments contain audio.
      Often the original audio is used: just cut-and-paste, no serious distortions.
      Sometimes the audio is replaced or combined with a voice-over.
      The time segmentation of the audio in the episode differs from that of the video used. The audio is not always used with the corresponding video fragments. The example on the next slide illustrates this. The other way around, and other variations, also occur.
  • 21. Characteristics of the data set: audio example
      [Diagram] Timeline of one archive video (video and audio tracks) compared with the timeline of one Andere Tijden episode (video and audio tracks): a continuous audio fragment, with several shorter video fragments.
  • 22. Characteristics of the data set: audio
      Limitations of the use of audio: the reference material must contain audio; the audio track might not originate from the same material as the video track, depending on the video material used; the playout speed must not be changed too much (less than +/- 2%).
      Advantages of the use of audio: highly robust algorithms; audio is usually undistorted, whereas video is cropped, scaled, etc.; audio is usually used continuously, while video fragments are cut and pasted from different sections of the reference video and 'glued together'.
  • 23. Identification results: audio
      Only checked whether the correct archive file name is returned.
      Episode                          Correct  Missed  False positive
      Liggadjati                             8       3               0
      Veertig jaar STER-reclame             10       4               1
      75 jaar afsluitdijk                    0       5               2
      Strijd tegen de file                   9       1               6
      Kronkels van de Maas                   1       9               1
      Op zoek naar Nederland                 2       6               1
      Modderen in de polder: Lelystad        3       1               2
      Burgemeesters in oorlogstijd           6      10               0
      De wording van Paars                   8       1               0
      Pim en zijn volk                       7       3               0
      Slide annotation: silent parts in the video.
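The per-episode counts on slide 23 can be aggregated into recall and precision figures. The sketch below assumes that "Correct" counts true positives, "Missed" counts false negatives, and "False positive" counts wrong detections; the slide does not define the columns more precisely, so the resulting figures are a derived illustration rather than numbers reported by the authors. The names `RESULTS`, `totals`, and `recall_precision` are hypothetical.

```python
# (correct, missed, false_positive) per episode, copied from slide 23.
RESULTS = {
    "Liggadjati": (8, 3, 0),
    "Veertig jaar STER-reclame": (10, 4, 1),
    "75 jaar afsluitdijk": (0, 5, 2),
    "Strijd tegen de file": (9, 1, 6),
    "Kronkels van de Maas": (1, 9, 1),
    "Op zoek naar Nederland": (2, 6, 1),
    "Modderen in de polder: Lelystad": (3, 1, 2),
    "Burgemeesters in oorlogstijd": (6, 10, 0),
    "De wording van Paars": (8, 1, 0),
    "Pim en zijn volk": (7, 3, 0),
}


def totals(results):
    """Sum each column over all episodes."""
    correct = sum(row[0] for row in results.values())
    missed = sum(row[1] for row in results.values())
    false_pos = sum(row[2] for row in results.values())
    return correct, missed, false_pos


def recall_precision(correct, missed, false_pos):
    """Recall = TP / (TP + FN); precision = TP / (TP + FP); 0.0 if undefined."""
    recall = correct / (correct + missed) if correct + missed else 0.0
    precision = correct / (correct + false_pos) if correct + false_pos else 0.0
    return recall, precision


overall_recall, overall_precision = recall_precision(*totals(RESULTS))
print(f"recall={overall_recall:.3f} precision={overall_precision:.3f}")
# prints "recall=0.557 precision=0.806"
```

Under these assumptions the audio fingerprints find 54 of 97 fragments, with 13 false detections across the ten episodes.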
  • 24. Fingerprinting: audio algorithm
      Algorithm well known from the literature: Haitsma, Kalker, "A Highly Robust Audio Fingerprinting System", in Proceedings of the 3rd International Conference on Music Information Retrieval (ISMIR), October 2002.
      Features: energy in 33 audio frequency bands.
      Every 11.6 ms a 32-bit sub-fingerprint is computed, consisting of coarsely quantized differences between these energy samples.
      The fingerprint consists of a time series of sub-fingerprints.
      The implementation returns the best-matching fragments only (settings chosen to return no false positives).
      The algorithm is highly robust and highly discriminative.
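A minimal sketch of how such sub-fingerprints could be computed, in the spirit of the Haitsma and Kalker scheme summarized on slide 24. It is not the consortium's implementation. Only the 33 bands, the 32 bits, and the 11.6 ms step come from the slide; the 5 kHz sample rate, the Hann window, the 1856-sample frame with a 58-sample hop (58 / 5000 s = 11.6 ms), the logarithmic band spacing between 300 and 2000 Hz, the 1-bit sign quantization, and the function names are assumptions made for this example.

```python
import numpy as np


def sub_fingerprints(signal, sample_rate=5000, frame_len=1856, hop=58,
                     n_bands=33, f_lo=300.0, f_hi=2000.0):
    """Return one 32-bit sub-fingerprint (uint64) per pair of consecutive frames."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len + hop:
        raise ValueError("signal too short for two frames")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Assumed band layout: n_bands log-spaced bands between f_lo and f_hi.
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    band_of_bin = np.digitize(freqs, edges) - 1  # -1 or n_bands means out of range
    energy = np.zeros((n_frames, n_bands))
    for n in range(n_frames):
        frame = signal[n * hop: n * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        for m in range(n_bands):
            energy[n, m] = power[band_of_bin == m].sum()
    # Difference between neighbouring bands, then between consecutive frames;
    # the sign of the result gives one bit (33 bands yield 32 bits).
    band_diff = energy[:, :-1] - energy[:, 1:]
    bits = (band_diff[1:] - band_diff[:-1]) > 0
    weights = np.uint64(1) << np.arange(31, -1, -1, dtype=np.uint64)
    return (bits.astype(np.uint64) * weights).sum(axis=1)


def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits over the overlapping part of two fingerprints."""
    n = min(len(fp_a), len(fp_b))
    xor = np.bitwise_xor(fp_a[:n], fp_b[:n])
    return sum(bin(int(x)).count("1") for x in xor) / (32 * n)
```

Because each bit depends only on the sign of an energy difference, a change in playback volume leaves the fingerprint essentially unchanged, which is one reason this family of features tolerates common audio degradations.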
  • 25. Future improvements on current results
      Trailing parts contain silence and black frames (no content). The silences give rise to false positives and irrelevant detections. A silence/activity detector is needed to exclude these parts.
      Our current implementation from the literature allows only one fragment per reference file to be returned.
      Our current implementation has only coarse time localization.
      Combination of audio and video fingerprinting.
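One simple form the silence/activity detector mentioned on slide 25 could take is an energy threshold per frame. The sketch below is an assumption about how such a detector might look, not something described in the slides: the function name `active_frames`, the 1024-sample frame length, and the -50 dB threshold (relative to an amplitude of 1.0) are hypothetical and would need tuning on real archive audio.

```python
import numpy as np


def active_frames(signal, frame_len=1024, threshold_db=-50.0):
    """Return a boolean mask, one entry per frame: True where audio is active."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    # Clamp to avoid log10(0) on digitally silent frames.
    level_db = 20.0 * np.log10(np.maximum(rms, 1e-12))
    return level_db > threshold_db


# Example: 4 frames of a 440 Hz tone followed by 4 frames of digital silence.
t = np.arange(4096) / 44100.0
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)
mask = active_frames(np.concatenate([tone, np.zeros(4096)]))
```

Frames flagged as inactive could be skipped during fingerprint extraction or matching, so that silent trailing parts no longer match each other. An analogous check on mean luminance could serve the same purpose for black video frames.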
  • 26. Consortium
      http://instituut.beeldengeluid.nl/
      http://www.joanneum.at/en/digital.html
      http://www.ziuz.com
      http://hs-art.com/
      http://www.tno.nl