SlideShare une entreprise Scribd logo
1  sur  2
Télécharger pour lire hors ligne
Arneb: a Rich Internet Application for Ground Truth
Annotation of Videos
Thomas Alisi, Marco Bertini, Gianpaolo D’Amico, Alberto Del Bimbo,
Andrea Ferracani, Federico Pernici and Giuseppe Serra
Media Integration and Communication Center, University of Florence, Italy
{alisi, bertini, damico, delbimbo, ferracani, pernici, serra}@dsi.unifi.it
http://www.micc.unifi.it/vim
ABSTRACT
In this technical demonstration we show the current version
of Arneb1
, a web-based system for manual annotation of
videos, developed within the EU VidiVideo project. This
tool has been developed with the aim of creating ground
truth annotations, that can be used for training and evalu-
ating automatic video annotation systems. Annotations can
be exported to MPEG-7 and OWL ontologies. The system
has been developed according to the Rich Internet Applica-
tion paradigm, allowing collaborative web-based annotation.
Categories and Subject Descriptors
H.5.1 [Information Interfaces and Presentation]: Mul-
timedia Information Systems—Video; H.3.5 [Information
Storage and Retrieval]: Online Information Services—
Web-based services
General Terms
Algorithms, Experimentation
Keywords
Video annotation, ground truth, video streaming
1. INTRODUCTION
The goal of the EU VidiVideo project is to boost the per-
formance of video search by forming a 1000 element the-
saurus of automatic detectors to classify instances of audio,
visual or mixed-media concepts. In order to train all these
classifiers [1], using supervised machine learning techniques
such as SVMs, there is need to easily create ground-truth
annotations of videos, indicating the objects and events of
interest. To this end a web based system that assists in the
creation of large amounts of frame precise manual annota-
tions has been created. Through manual inspection of the
broadcast videos, several human annotators can mark up
1
Arneb is a hare, one of the preys of the mythical hunter
Orion. It is chased by Orion’s hound, Sirio.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
MM’09, October 19–24, 2009, Beijing, China.
Copyright 2009 ACM X-XXXXX-XX-X/XX/XX ...$10.00.
the start and end time of each concept appearance, adding
frame accurate annotations.
Web-based annotations tools are becoming very popular
because of some mainstream video portals or web televisions
like YouTube, Splashcast and Viddler. In fact, online users
now are able to insert notes, speech bubbles or simple sub-
titles into the videos timelines with the purpose of adding
further informations to the content, linking to webpages,
advertising brands and products. These new metadata are
managed in a collaborative web 2.0 style, since annotations
can be inserted not only by content owners (video profes-
sionals, archivists, etc.), but also by end users, providing an
enhanced knowledge representation of multimedia content.
Our system starts from this idea of collaborative anno-
tations management and takes it to a new level of efficacy.
All the previous systems, in fact, use annotations composed
of non structured data and do not use any standard for-
mat to store or export it. All these metadata (tags, key-
words or simple sentences) are not structured and do not
allow to represent composite concepts or complex informa-
tion of the video content. Our solution proposes a seman-
tic approach, in which structured annotations are created
using ontologies, that permit to represent in a formal and
machine-understable way a video domain. The import and
export functionalities of both annotations and ontologies,
using MPEG-7 standard and OWL ontologies, allow an ef-
fective interoperability of the system. The system has been
intensively used to annotate several hours of videos, creat-
ing more than 25,000 annotations, by B&G (Dutch national
audiovisual institute) professional archivists.
2. THE SYSTEM
The video annotation system allows to browse videos, se-
lect concepts from an ontology structure, insert annotations
and download previously saved annotations. The system can
work with multiple ontologies, to accomodate different video
domains. Ontologies can be imported and exported using
the MPEG-7 or OWL ontology, or created from scratch
by users. The system administrator can set the ontologies
as modifiable by end-users. Annotations, stored on a SQL
database server, can be exported to both MPEG-7 and OWL
ontology formats. The use of MPEG-7 provides interoper-
ability of the system, but this format reflects the“structural”
lack of semantic expressiveness, that can be obtained by
its translation into an ontology language [2]. To cope with
this problem has been developed the VidiVideo DOME (Dy-
namic Ontology with Multimedia Elements)2
ontology, mod-
2
http://www.micc.unifi.it/dome/
Figure 1: Annotation interface (in clockwise order): browsing videos, searching a concept, annotating.
elled following the Dynamic Pictorially Enriched Ontology
model [3]. DOME defines and describes general structural
video concepts and multimedia concepts, in particular dy-
namic concepts. To cover different fields, videosurveillance,
news and cultural heritage, DOME is composed by a com-
mon core and by three domain ontologies developed for the
video domains used within the VidiVideo project: DOME
Videosurveillance, DOME News, DOME Cultural Heritage.
The system is implemented according to the Rich Inter-
net Application paradigm: the user interface runs in a Flash
virtual machine inside a web browser. RIAs offer many ad-
vantages to the user experience, because users benefit from
the best of web and desktop applications paradigms. Both
high levels of interaction and collaboration (obtainable by
a web application) and robustness in multimedia environ-
ments (typical of a desktop application) are guaranteed.
With this solution installation is not required, since the ap-
plication is updated on the server, and run anywhere regard-
less of what operating system is used. The user interface is
written in the Action Script 3.0 programming language us-
ing Adobe Flex and Flash CS4.
The backend is developed in the PHP 5 scripting language,
while data are stored in a MySQL 5 database server. All
the data between client and server are exchanged in XML
format using HTTP POST calls. The system is currently
based both on open source tools (Apache web server and
Red 5 video streaming server) or freely available commercial
ones (Adobe Flash Media Server free developer edition). All
videos are encoded in the FLV format, using H.264 or On2
VP6 video codecs, and are delivered with the Real Time
Messaging Protocol (RTMP) for video streaming.
3. DEMONSTRATION
We demonstrate the creation of ground-truth annotations
of videos using a tool that allows a collaborative process
through an internet application. Differently from other web
based tools the proposed system allows to annotate the pres-
ence of objects and events in frame accurate time intervals,
instead of annotating a few keyframes or having to rely on
offline video segmentation. The possibility for the annota-
tors to extend the proposed ontologies allow expert users
to better represent the video domains, as the vidos are in-
spected. The import and export capabilities using MPEG-7
standard format allow interoperability of the system.
Acknowledgments.
This work is partially supported by the EU IST VidiVideo
project (www.vidivideo.info - contract FP6-045547).
4. REFERENCES
[1] Cees G. M. Snoek, et al. The MediaMill TRECVID
2008 Semantic Video Search Engine. In Proc. of 6th
TRECVID Workshop, Gaithersburg, USA, November
2008
[2] N. Simou, V. Tzouvaras, Y. Avrithis, G. Stamou, and
S. Kollias. A visual descriptor ontology for multimedia
reasoning. In Proc. of Workshop on Image Analysis for
Multimedia Interactive Services (WIAMIS 2005), Mon-
treux (Switzerland), April 2005.
[3] M. Bertini, R. Cucchiara, A. Del Bimbo, C. Grana,
G. Serra, C. Torniai and R. Vezzani. Dynamic
Pictorially Enriched Ontologies for Video Digital
Libraries. In IEEE Multimedia, to appear, 2009.

Contenu connexe

Similaire à Arneb

Advene As A Tailorable Hypervideo Authoring Tool A Case Study
Advene As A Tailorable Hypervideo Authoring Tool  A Case StudyAdvene As A Tailorable Hypervideo Authoring Tool  A Case Study
Advene As A Tailorable Hypervideo Authoring Tool A Case StudyLaurie Smith
 
Interactive Video Search and Browsing Systems
Interactive Video Search and Browsing SystemsInteractive Video Search and Browsing Systems
Interactive Video Search and Browsing SystemsAndrea Ferracani
 
Flash-based audio and video communication
Flash-based audio and video communicationFlash-based audio and video communication
Flash-based audio and video communicationKundan Singh
 
Module 2 3
Module 2 3Module 2 3
Module 2 3ryanette
 
VIDEO CONFERENCING SYSTEM USING WEBRTC
VIDEO CONFERENCING SYSTEM USING WEBRTCVIDEO CONFERENCING SYSTEM USING WEBRTC
VIDEO CONFERENCING SYSTEM USING WEBRTCIRJET Journal
 
Cloud-Based Multimedia Content Protection System
Cloud-Based Multimedia Content Protection SystemCloud-Based Multimedia Content Protection System
Cloud-Based Multimedia Content Protection System1crore projects
 
Streaming Video in the Fortune 500
Streaming Video in the Fortune 500 Streaming Video in the Fortune 500
Streaming Video in the Fortune 500 MediaPlatform
 
Multimedia authoring tools and User interface design
Multimedia authoring tools and User interface designMultimedia authoring tools and User interface design
Multimedia authoring tools and User interface designSagar Rai
 
A Study on FFmpeg Multimedia Framework
A Study on FFmpeg Multimedia FrameworkA Study on FFmpeg Multimedia Framework
A Study on FFmpeg Multimedia Frameworkijtsrd
 
CommTech Talks: Challenges for Video on Demand (VoD) services
CommTech Talks: Challenges for Video on Demand (VoD) servicesCommTech Talks: Challenges for Video on Demand (VoD) services
CommTech Talks: Challenges for Video on Demand (VoD) servicesAntonio Capone
 
Enhanced Dynamic Leakage Detection and Piracy Prevention in Content Delivery ...
Enhanced Dynamic Leakage Detection and Piracy Prevention in Content Delivery ...Enhanced Dynamic Leakage Detection and Piracy Prevention in Content Delivery ...
Enhanced Dynamic Leakage Detection and Piracy Prevention in Content Delivery ...Editor IJMTER
 
OOMEN MEZARIS ReTV
OOMEN MEZARIS ReTVOOMEN MEZARIS ReTV
OOMEN MEZARIS ReTVFIAT/IFTA
 
Implementing artificial intelligence strategies for content annotation and pu...
Implementing artificial intelligence strategies for content annotation and pu...Implementing artificial intelligence strategies for content annotation and pu...
Implementing artificial intelligence strategies for content annotation and pu...ReTV project
 
Implementing Artificial Intelligence Strategies for Content Annotation and Pu...
Implementing Artificial Intelligence Strategies for Content Annotation and Pu...Implementing Artificial Intelligence Strategies for Content Annotation and Pu...
Implementing Artificial Intelligence Strategies for Content Annotation and Pu...ReTV project
 
Mis presentation -quest
Mis presentation -questMis presentation -quest
Mis presentation -questabhirup1985
 

Similaire à Arneb (20)

Advene As A Tailorable Hypervideo Authoring Tool A Case Study
Advene As A Tailorable Hypervideo Authoring Tool  A Case StudyAdvene As A Tailorable Hypervideo Authoring Tool  A Case Study
Advene As A Tailorable Hypervideo Authoring Tool A Case Study
 
503 434-438
503 434-438503 434-438
503 434-438
 
Interactive Video Search and Browsing Systems
Interactive Video Search and Browsing SystemsInteractive Video Search and Browsing Systems
Interactive Video Search and Browsing Systems
 
Interactive Video Search and Browsing Systems
Interactive Video Search and Browsing SystemsInteractive Video Search and Browsing Systems
Interactive Video Search and Browsing Systems
 
Flash-based audio and video communication
Flash-based audio and video communicationFlash-based audio and video communication
Flash-based audio and video communication
 
Module 2 3
Module 2 3Module 2 3
Module 2 3
 
VIDEO CONFERENCING SYSTEM USING WEBRTC
VIDEO CONFERENCING SYSTEM USING WEBRTCVIDEO CONFERENCING SYSTEM USING WEBRTC
VIDEO CONFERENCING SYSTEM USING WEBRTC
 
Cloud-Based Multimedia Content Protection System
Cloud-Based Multimedia Content Protection SystemCloud-Based Multimedia Content Protection System
Cloud-Based Multimedia Content Protection System
 
Streaming Video in the Fortune 500
Streaming Video in the Fortune 500 Streaming Video in the Fortune 500
Streaming Video in the Fortune 500
 
C44081316
C44081316C44081316
C44081316
 
Multimedia authoring tools and User interface design
Multimedia authoring tools and User interface designMultimedia authoring tools and User interface design
Multimedia authoring tools and User interface design
 
A Study on FFmpeg Multimedia Framework
A Study on FFmpeg Multimedia FrameworkA Study on FFmpeg Multimedia Framework
A Study on FFmpeg Multimedia Framework
 
Athabasca
AthabascaAthabasca
Athabasca
 
Athabasca
AthabascaAthabasca
Athabasca
 
CommTech Talks: Challenges for Video on Demand (VoD) services
CommTech Talks: Challenges for Video on Demand (VoD) servicesCommTech Talks: Challenges for Video on Demand (VoD) services
CommTech Talks: Challenges for Video on Demand (VoD) services
 
Enhanced Dynamic Leakage Detection and Piracy Prevention in Content Delivery ...
Enhanced Dynamic Leakage Detection and Piracy Prevention in Content Delivery ...Enhanced Dynamic Leakage Detection and Piracy Prevention in Content Delivery ...
Enhanced Dynamic Leakage Detection and Piracy Prevention in Content Delivery ...
 
OOMEN MEZARIS ReTV
OOMEN MEZARIS ReTVOOMEN MEZARIS ReTV
OOMEN MEZARIS ReTV
 
Implementing artificial intelligence strategies for content annotation and pu...
Implementing artificial intelligence strategies for content annotation and pu...Implementing artificial intelligence strategies for content annotation and pu...
Implementing artificial intelligence strategies for content annotation and pu...
 
Implementing Artificial Intelligence Strategies for Content Annotation and Pu...
Implementing Artificial Intelligence Strategies for Content Annotation and Pu...Implementing Artificial Intelligence Strategies for Content Annotation and Pu...
Implementing Artificial Intelligence Strategies for Content Annotation and Pu...
 
Mis presentation -quest
Mis presentation -questMis presentation -quest
Mis presentation -quest
 

Dernier

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Dernier (20)

Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Arneb

  • 1. Arneb: a Rich Internet Application for Ground Truth Annotation of Videos Thomas Alisi, Marco Bertini, Gianpaolo D’Amico, Alberto Del Bimbo, Andrea Ferracani, Federico Pernici and Giuseppe Serra Media Integration and Communication Center, University of Florence, Italy {alisi, bertini, damico, delbimbo, ferracani, pernici, serra}@dsi.unifi.it http://www.micc.unifi.it/vim ABSTRACT In this technical demonstration we show the current version of Arneb1 , a web-based system for manual annotation of videos, developed within the EU VidiVideo project. This tool has been developed with the aim of creating ground truth annotations, that can be used for training and evalu- ating automatic video annotation systems. Annotations can be exported to MPEG-7 and OWL ontologies. The system has been developed according to the Rich Internet Applica- tion paradigm, allowing collaborative web-based annotation. Categories and Subject Descriptors H.5.1 [Information Interfaces and Presentation]: Mul- timedia Information Systems—Video; H.3.5 [Information Storage and Retrieval]: Online Information Services— Web-based services General Terms Algorithms, Experimentation Keywords Video annotation, ground truth, video streaming 1. INTRODUCTION The goal of the EU VidiVideo project is to boost the per- formance of video search by forming a 1000 element the- saurus of automatic detectors to classify instances of audio, visual or mixed-media concepts. In order to train all these classifiers [1], using supervised machine learning techniques such as SVMs, there is need to easily create ground-truth annotations of videos, indicating the objects and events of interest. To this end a web based system that assists in the creation of large amounts of frame precise manual annota- tions has been created. Through manual inspection of the broadcast videos, several human annotators can mark up 1 Arneb is a hare, one of the preys of the mythical hunter Orion. It is chased by Orion’s hound, Sirio. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MM’09, October 19–24, 2009, Beijing, China. Copyright 2009 ACM X-XXXXX-XX-X/XX/XX ...$10.00. the start and end time of each concept appearance, adding frame accurate annotations. Web-based annotations tools are becoming very popular because of some mainstream video portals or web televisions like YouTube, Splashcast and Viddler. In fact, online users now are able to insert notes, speech bubbles or simple sub- titles into the videos timelines with the purpose of adding further informations to the content, linking to webpages, advertising brands and products. These new metadata are managed in a collaborative web 2.0 style, since annotations can be inserted not only by content owners (video profes- sionals, archivists, etc.), but also by end users, providing an enhanced knowledge representation of multimedia content. Our system starts from this idea of collaborative anno- tations management and takes it to a new level of efficacy. All the previous systems, in fact, use annotations composed of non structured data and do not use any standard for- mat to store or export it. All these metadata (tags, key- words or simple sentences) are not structured and do not allow to represent composite concepts or complex informa- tion of the video content. Our solution proposes a seman- tic approach, in which structured annotations are created using ontologies, that permit to represent in a formal and machine-understable way a video domain. The import and export functionalities of both annotations and ontologies, using MPEG-7 standard and OWL ontologies, allow an ef- fective interoperability of the system. The system has been intensively used to annotate several hours of videos, creat- ing more than 25,000 annotations, by B&G (Dutch national audiovisual institute) professional archivists. 2. THE SYSTEM The video annotation system allows to browse videos, se- lect concepts from an ontology structure, insert annotations and download previously saved annotations. The system can work with multiple ontologies, to accomodate different video domains. Ontologies can be imported and exported using the MPEG-7 or OWL ontology, or created from scratch by users. The system administrator can set the ontologies as modifiable by end-users. Annotations, stored on a SQL database server, can be exported to both MPEG-7 and OWL ontology formats. The use of MPEG-7 provides interoper- ability of the system, but this format reflects the“structural” lack of semantic expressiveness, that can be obtained by its translation into an ontology language [2]. To cope with this problem has been developed the VidiVideo DOME (Dy- namic Ontology with Multimedia Elements)2 ontology, mod- 2 http://www.micc.unifi.it/dome/
  • 2. Figure 1: Annotation interface (in clockwise order): browsing videos, searching a concept, annotating. elled following the Dynamic Pictorially Enriched Ontology model [3]. DOME defines and describes general structural video concepts and multimedia concepts, in particular dy- namic concepts. To cover different fields, videosurveillance, news and cultural heritage, DOME is composed by a com- mon core and by three domain ontologies developed for the video domains used within the VidiVideo project: DOME Videosurveillance, DOME News, DOME Cultural Heritage. The system is implemented according to the Rich Inter- net Application paradigm: the user interface runs in a Flash virtual machine inside a web browser. RIAs offer many ad- vantages to the user experience, because users benefit from the best of web and desktop applications paradigms. Both high levels of interaction and collaboration (obtainable by a web application) and robustness in multimedia environ- ments (typical of a desktop application) are guaranteed. With this solution installation is not required, since the ap- plication is updated on the server, and run anywhere regard- less of what operating system is used. The user interface is written in the Action Script 3.0 programming language us- ing Adobe Flex and Flash CS4. The backend is developed in the PHP 5 scripting language, while data are stored in a MySQL 5 database server. All the data between client and server are exchanged in XML format using HTTP POST calls. The system is currently based both on open source tools (Apache web server and Red 5 video streaming server) or freely available commercial ones (Adobe Flash Media Server free developer edition). All videos are encoded in the FLV format, using H.264 or On2 VP6 video codecs, and are delivered with the Real Time Messaging Protocol (RTMP) for video streaming. 3. DEMONSTRATION We demonstrate the creation of ground-truth annotations of videos using a tool that allows a collaborative process through an internet application. Differently from other web based tools the proposed system allows to annotate the pres- ence of objects and events in frame accurate time intervals, instead of annotating a few keyframes or having to rely on offline video segmentation. The possibility for the annota- tors to extend the proposed ontologies allow expert users to better represent the video domains, as the vidos are in- spected. The import and export capabilities using MPEG-7 standard format allow interoperability of the system. Acknowledgments. This work is partially supported by the EU IST VidiVideo project (www.vidivideo.info - contract FP6-045547). 4. REFERENCES [1] Cees G. M. Snoek, et al. The MediaMill TRECVID 2008 Semantic Video Search Engine. In Proc. of 6th TRECVID Workshop, Gaithersburg, USA, November 2008 [2] N. Simou, V. Tzouvaras, Y. Avrithis, G. Stamou, and S. Kollias. A visual descriptor ontology for multimedia reasoning. In Proc. of Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2005), Mon- treux (Switzerland), April 2005. [3] M. Bertini, R. Cucchiara, A. Del Bimbo, C. Grana, G. Serra, C. Torniai and R. Vezzani. Dynamic Pictorially Enriched Ontologies for Video Digital Libraries. In IEEE Multimedia, to appear, 2009.