Digital Newsrooms group
VRT Integrated Newsroom
Brussels, Belgium, 13-14 December 2007
Visit report [technical part only]
(revised and proof-read by the speakers)
1 The Digital Media Factory (DMF)
1.1 Principles
1.2 Architecture and technical components
1.2.1 Work centres (WoC)
1.2.2 File formats
1.2.3 Storage architecture
1.2.4 Network and bandwidth
1.2.5 Media Asset Management
1.2.6 Data model
1.2.7 Integration Layer
1.2.8 Lessons learned
1.3 Metadata for News: NewsML-G2
1.4 Security & continuity of service
EBU International Training / Hélène RAUBY-MATTA & Jean-Noël GOUYET / Thematic Visit to VRT Newsroom / 13 - 14 December 2007
1 The Digital Media Factory (DMF)
"Ensuring interoperability: la grande besogne" by Johan Hoffman, Manager Production Technology
projects
"An integrated file-based media production workflow: the Digital Media Factory" J. Hoffman, P.
Soetens and M. De Geyter, IBC 2007
While reading this section, refer to the Annexe for:
Figure 1, with the areas indicated by [x], and the other Figures,
the list of 'Abbreviations and Acronyms',
the list of 'Tools and Vendors'.
1.1 Principles
The architecture of the VRT's Digital Media Factory corresponds to the following principles:
Work centres connected to a central media storage and management. A work centre (WoC) is
an autonomous environment used for performing a specific craft step in the production process
(e.g. audio editing, video editing, subtitling…).
o Instead of a single vendor approach, the best tool for each craft was selected. This better
responded to the users' needs, but increased the complexity of the integration.
o A work centre must at all times be able to operate autonomously, even if the connection to
or integration with other work centres or the central media management is temporarily broken.
This implies a loose coupling: as long as all work centres, the central system and the
integration layer are available, the added value of the integrated workflow is available to the
users.
A file-based workflow used throughout. This implies that material will either start its existence
as a file (e.g. file-based camera with P2 memory card), or it will be converted to a file as the very
first step in the production process (e.g. feed ingest, ingest of archive tapes, ingest of material
purchased externally). On the drawing of Figure 1, the bold black lines indicating audio and video
streams are only present at the ingest [1][2] or at the playout [11][12][13] stages, whereas the
dotted lines indicating the media files are present everywhere.
A full integration on Essence and metadata level. As Essence (audio and video data) is
produced, transformed and transported between work centres and central storage & media
management, the associated metadata (thin black lines on the drawing of Figure 1) are
gradually enriched in each step of the workflow. The integration layer [7] takes care of
synchronising the metadata between systems and of orchestrating the various Essence transfers.
Standard equipment and software are used where possible. This is the case for the central
media file storage environment, for the server environment for a number of applications (mainly
central media asset management, integration layer, radio production, newsroom computer
system, online editing and distribution) and for the user environment, which is PC-based for the
broad user community. Specialised equipment, used where needed, is mainly found in the areas
of editing, ingest and play-out automation.
1.2 Architecture and technical components
1.2.1 Work centres (WoC)
The three broad areas are ingest, editing and play-out.
There are separate ingest facilities [1] [2] for news-related ingest (feeds, tapes, camera material)
and for general programme production (mainly tapes and file import from external partners),
including ingest scheduling and automation (DART and Ardcap by Ardendo) and the ingest itself
(Omneon and Avid AirSpeed servers).
Professional editing tools and editing cells are available for video editing (Avid ISIS) [8], audio
editing (VCS Dira) [9] and online multi-channel editing and publishing (based on the Polopoly
framework) [10].
TV play-out [11] is automated by SGT DBOS for news production and Morpheus in the final
control room, with Omneon servers as the file play-out devices.
The legacy archive system [3] BASISplus (by OpenText) serves as the main index to archived
material and is integrated with the new environment, so that the result of searches can be
immediately visualized and linked through in the file-based work environment (cf §1.2.5).
Avid iNews [4] is the newsroom computer system used for news content planning, editing and
rundown scheduling. The integration with the play-out automation system and with the central
media management system allows the synchronisation of the rundown with the playout automation
and a follow-up of the status of the material and its links to the rundown as it is produced and
prepared for broadcast.
A future Work Centre to be connected will be Graphics.
1.2.2 File formats
The MXF Operational Pattern OP1a was chosen as the standard file format, wrapping Essence
(e.g. video and audio tracks) in either DV25 or D-10 (Sony IMX) format. But the compatibility
between the various tools proved to be a challenge:
MXF is a complex container format and allows many standard-compliant variants and different
ways of wrapping Essence and Essence-related metadata (e.g. time code information) in a file.
As a result, there is no guarantee that tools from different vendors, although each capable of
processing (one of the flavours of) MXF, are interoperable on the Essence file level. For example,
at the ingest stage [1] a 're-wrapping' tool developed by OpenCube is necessary to take the
Essence from the camera P2 memory card, where it is wrapped in MXF OP Atom, and re-wrap it
into the Avid MXF OP.
Besides that, there were initially a number of outright errors in the treatment of MXF, due to the
fact that this is a young standard and that vendor tools are not yet fully mature.
1.2.3 Storage architecture
The file server and storage environment (Figure 1-[5] and Figure 2) uses three clusters of
Linux/Intel servers running IBM’s GPFS clustered file system. The file servers communicate with
their external clients, which are typically the work centre servers or the Ardome application servers,
through the IP network. Internally, a Fibre Channel network is used to connect the servers to their
storage cabinets. The controller and disk technology used is either Fibre Channel or SATA,
protected in RAID 5 and in some cases mirrored, depending on the expected usage pattern and
reliability requirements of the essence data which it will contain. Striping over multiple RAID sets is
used to obtain the guaranteed sustained throughput of media essence files under the expected
load conditions. The storage capacity on online disks is about 500 TBytes gross or 250 TBytes net.
IBM Tivoli Storage Manager (TSM), together with the Ardome MAM, manages the information life
cycle, automating the decisions on keeping material on disks or on the tapes of large data tape
robots (ADIC and StorageTek). The tape robots can grow enough to contain the complete
production archive of VRT in the future.
1.2.4 Network and bandwidth
Files wrapping production-quality video tend to be very large: with DVCPRO25 @ 30 Mbit/s or
D-10 @ 60 Mbit/s, 1 hour of video material requires about 15 or 30 Gbyte of storage capacity
respectively. Allowing hundreds of such transfers in parallel therefore demands very high
network and storage bandwidth.
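As a rough check of these figures, the sketch below converts a constant bit rate into storage per hour. It assumes decimal units (1 Gbyte = 10^9 bytes) and ignores any wrapper overhead; the function name is illustrative only.

```python
def gbytes_per_hour(bitrate_mbit_s: float) -> float:
    """Storage needed for one hour of material at a constant bit rate.

    Assumes decimal units (1 Gbyte = 10**9 bytes) and no container overhead.
    """
    bits = bitrate_mbit_s * 1_000_000 * 3600   # bits produced in one hour
    return bits / 8 / 1_000_000_000            # bits -> bytes -> Gbyte


for name, rate in [("DVCPRO25", 30), ("D-10", 60)]:
    print(f"{name} @ {rate} Mbit/s: {gbytes_per_hour(rate):.1f} Gbyte per hour")
```

The raw results are 13.5 and 27 Gbyte per hour, slightly below the rounded 15 and 30 Gbyte quoted above, which presumably leave some headroom for wrapper and metadata overhead.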
All communication between servers is carried out over a standard TCP/IP UTP Ethernet network.
On the server side, each network access point has a guaranteed, non-blocking throughput of 1
Gbit/s and is fully redundant, both on the port and on the switch level. This is a very stringent
though realistic design requirement for the network. By using multiple network access points on
one server, the combined network throughput of that server can greatly exceed 1 Gbit/s.
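To give a feel for the sizing, the following sketch estimates how many real-time streams one 1 Gbit/s access point can carry, and how many access points a given load would need. The usable fraction of 0.7 and the load of 200 simultaneous streams are assumptions made for illustration, not figures from the report; file transfers running faster than real time would need proportionally more bandwidth.

```python
import math


def streams_per_access_point(link_mbit_s: float, stream_mbit_s: float,
                             usable_fraction: float = 0.7) -> int:
    """Whole number of real-time streams that fit on one access point.

    usable_fraction is an assumed allowance for protocol overhead and
    headroom; it is not a figure from the report.
    """
    return math.floor(link_mbit_s * usable_fraction / stream_mbit_s)


def access_points_needed(n_streams: int, per_point: int) -> int:
    """Access points required to carry n_streams simultaneously."""
    return math.ceil(n_streams / per_point)


for name, rate in [("DVCPRO25", 30), ("D-10", 60)]:
    per_point = streams_per_access_point(1000, rate)
    # 200 simultaneous streams is a hypothetical load
    print(name, per_point, access_points_needed(200, per_point))
```

Under these assumptions a single access point carries 23 streams at 30 Mbit/s or 11 at 60 Mbit/s, which is why several access points per server are needed for hundreds of parallel transfers.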
1.2.5 Media Asset Management
The media management system (Ardome by Ardendo¹) is the central core of the architecture
(Figure 1-[6]). It manages the storage resources for the media Essence and the browse copy files,
the information life cycle, and user access. A number of auxiliary tools managing ingest, ingest
planning and “outgest” to tape (Ardcap and Dart, also by Ardendo) collaborate with it.
The media asset management serves as a central hub and repository for production material
whose life span exceeds the short life cycle of intermediate work files or scratch space. It is a
repository both for “work in progress” with a life span of a few days or more, and for long term
archive material (finished items, raw or semi-finished material with a lasting value).
The Ardome system is the main access point to media Essence for the large media production
user community, with the exception of the smaller groups of specialized craft users (e.g. video
editors, planners, etc…) who have specialized work centres available.
The user can search and access the material through its browse copy from any PC workstation
within the organisation. Material is available within seconds, which avoids the formerly
tedious task of identifying and collecting tapes from the archive.
Basic cutting operations are also supported to allow the user to select fragments from material for
later edits and generation of items.
In the near future we plan to introduce browse editing, which will allow a user (e.g. a journalist) who
is not a professional video editor to create simple edits of sufficient quality to broadcast the
items without craft editing. This will shorten the preparation time for items before broadcast and
avoid the editing-capacity bottleneck that can arise on news-intensive days.
1.2.5.1 Archive Management
"Managing digital News Assets" by Francis Van Werde, Project Manager digitization video, and
Roel Geets, Archivist
The VRT Archive is using the BASISplus document management and retrieval software. It has
been interconnected with the Ardome MAM software. The most valuable fields have been
extracted from BASISplus for Ardome, and some Ardome fields (duration, creation date…) have
been integrated into BASISplus. Some fields have been added in Ardome to support the workflow,
the storage and the security management:
'Kind' (raw material / semi-finished / finished / undefined);
'Status' indicating the life cycle of an item, the next process (to edit / to distribute / to archive / to
delete / archived);
'Category' related to the production unit (News / Sports / Culture / Fiction / Education / …).
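The three added fields can be pictured as a small typed record. The sketch below is only an illustration of the values listed above; the class and function names are hypothetical and do not reflect Ardome's actual schema or API. 'Category' is kept as free text because the list of production units is open-ended.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    RAW = "raw material"
    SEMI_FINISHED = "semi-finished"
    FINISHED = "finished"
    UNDEFINED = "undefined"


class Status(Enum):
    TO_EDIT = "to edit"
    TO_DISTRIBUTE = "to distribute"
    TO_ARCHIVE = "to archive"
    TO_DELETE = "to delete"
    ARCHIVED = "archived"


@dataclass
class Item:
    title: str
    category: str                 # production unit, e.g. "News" or "Sports"
    status: Status                # next process in the item's life cycle
    kind: Kind = Kind.UNDEFINED


def archive_candidates(items):
    """Items whose next process is archiving (hypothetical helper)."""
    return [item for item in items if item.status is Status.TO_ARCHIVE]


items = [
    Item("Evening bulletin opener", "News", Status.TO_ARCHIVE, Kind.FINISHED),
    Item("Stadium rushes", "Sports", Status.TO_EDIT, Kind.RAW),
]
print([item.title for item in archive_candidates(items)])
```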
The Archive management is part of a 3-level media management:
The local media management is organised by the production assistants in each production
unit (News / Current Affairs / Foreign desk /Sports) to:
o select the assets according to a 'Kind' field (Raw material / Half-finished / Broadcastable or
'to Delete'),
o select and compile material to be archived,
o make up thematic collections.
¹ http://www.ardendo.com/?page=products&subpage=ardome
The archive media management looks after:
o the final selection of material before archiving (or deleting),
o the filtering of material for temporary storage before definitive conservation.
The central media management supervises:
o the definition of the guidelines and rules of storage and deletion,
o the maintenance and definition of the Graphical User Interface,
o the reports and logs,
o the users (roles and domains),
o the users, uses and material (Essence & metadata) in Ardome.
It also plays an "administrator" role in the security domain.
To avoid a bottleneck at the tape ingest stage (Figure 1-[2]), 2 years of News & current affairs and
6 months of Sports had been digitized before the launch of the DMF.
1.2.6 Data model
The data model is an essential part of the overall enterprise architecture. It provides a sound definition of
all applicable entities (programme groups, programmes, items), types (item types would be newsfeeds,
rushes, news stories, etc) and properties of entities (title, abstract, genre…). The model defines as well
how entities relate to each other. The data model includes various particular translations to individual
applications and standards and, while continuously being in an unfinished state, it eventually intends to
create a common denominator when integrating different systems.
The data model was first devised in 2004 and was originally based on a 4-layered process
model², whereby each layer represents one or more fundamental business processes of
programme production; associated with these, the data model represents the different
dimensions (Figure 3):
1) Enterprise Logic³ includes the Enterprise Resources Planning (ERP) system implemented by
SAP, an implementation which is based on the widely applicable MRPII model. The main purpose of
the ERP system is to manage logistic and financial information;
2) Information Management includes the creative processes (Product Engineering or programme
development) and archiving and mainly represents descriptive information (scene descriptions,
scenarios, shooting scripts …);
3) Production and Distribution implements the low-level processes that manipulate the technical
information concerning the digital production steps (from ingest over editing - assembly to playout);
4) The Technological platform essentially deals with the handling of the essence in the form of
tapes and files.
This theoretical and complex reference model, made up of entity and attribute definitions (Figure
6), plays a major role in projects:
When (re)installing applications, a subset of the data model is used to configure the system in a way
which is reasonably compliant with the data model. This is an essential prerequisite for future system
integration. For example:
The internal data model of the Ardome Media Asset Management (MAM) system has been extended
by an organisation-specific set of attributes and properties in order to correctly represent the
organisation’s business processes. A subset of the VRT data model has been used as such when
configuring the system, ensuring relative compatibility with other systems that speak “VRT data
modelish”;
² "VRT data model", Maarten Verwaest, Luk Overmeire & Bart Cornille, EBU Production Technology 2006 seminar,
http://www.ebu.ch/CMSimages/en/EBU-2006ProdTechnoSeminar-Report_FINAL_tcm6-43103.pdf
³ The BBC SMEF model, which has been a big source of inspiration for the VRT model, has reinvented the top layer, whereas
VRT has adopted what the MRPII and APICS vocabulary already offers.
When integrating different applications the data model is used to create mappings between systems.
For example:
o for the exchange of metadata between the MAM and the local MAM of Video Workcenters,
or between the Archive system (BASISplus) and the Ardome MAM, translations were created
from and to the VRT data model;
o for the exchange of programme-related information between the ERP system (SAP) and the
broadcast schedule management system (MediaGenix/Whatson), an interface was set up
that basically speaks the EBU standard P/META. We were able to establish
unambiguous communication by mapping SAP, MediaGenix/Whatson and P/META with
respect to the VRT data model.
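The idea of translating "from and to the VRT data model" can be sketched as a hub-and-spoke mapping: each system declares one mapping to the common model, and any pair of systems is bridged through it. All field names and system keys below are hypothetical placeholders, not the actual BASISplus, Ardome or VRT data model attributes.

```python
# Hypothetical field names: system-specific name -> common-model name.
TO_COMMON = {
    "archive": {"bp_title": "title", "bp_abstract": "abstract"},
    "mam": {"item_title": "title", "item_abstract": "abstract",
            "item_duration": "duration"},
}


def to_common(system: str, record: dict) -> dict:
    """Translate a system-specific record into common-model fields."""
    mapping = TO_COMMON[system]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def from_common(system: str, common: dict) -> dict:
    """Translate common-model fields into a system-specific record."""
    inverse = {c: s for s, c in TO_COMMON[system].items()}
    return {inverse[k]: v for k, v in common.items() if k in inverse}


def translate(src: str, dst: str, record: dict) -> dict:
    """Bridge two systems through the common model."""
    return from_common(dst, to_common(src, record))


print(translate("archive", "mam", {"bp_title": "Flood report", "bp_shelf": "T-12"}))
```

With this arrangement, adding a system means writing one mapping to the common model rather than one translator per pair of systems; fields without a counterpart in the common model (here the made-up `bp_shelf`) are simply not carried over.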
1.2.7 Integration Layer
Building the Digital Media Factory requires the integration of its components. The basic
requirement is that metadata (technical metadata, description fields, context information, etc.) is
attached to Essence and transported together with the Essence through the various tools to be
enriched and maintained by the user, as the Essence is transformed from raw material to a
finished item. Special attention must be paid to synchronization and data ownership between the
subsystems.
Technically, the integration was accomplished by using an 'Enterprise Service Bus' (ESB; Figure 1-[7]
and Figure 4), part of IBM WebSphere, as the core integration component in a Service Oriented
Architecture (SOA). Applications are accessed through services that produce or consume
application-specific business objects, which are converted into, or used to enrich, generic business
objects that are processed within the Enterprise Service Bus.
The EBU P/META standard was used as the generic business object data model in the ESB. This
makes it possible to couple systems loosely, because end-point-specific data models are
immediately translated into the generic model.
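The flow from application-specific to generic business objects and back can be illustrated with a toy in-memory bus. This is a minimal sketch of the pattern only: it is not the IBM WebSphere API, and the end-point names, field names and required-field list are all hypothetical.

```python
class EndPoint:
    """An application reachable through the bus, with its two converters."""

    def __init__(self, name, to_gbo, from_gbo):
        self.name = name
        self.to_gbo = to_gbo        # application-specific object -> generic object
        self.from_gbo = from_gbo    # generic object -> application-specific object
        self.inbox = []             # messages delivered to this application


class Bus:
    REQUIRED = ("id", "title")      # hypothetical validation rule

    def __init__(self):
        self.endpoints = {}
        self.routes = {}            # source name -> list of destination names

    def register(self, endpoint):
        self.endpoints[endpoint.name] = endpoint

    def route(self, src, dst):
        self.routes.setdefault(src, []).append(dst)

    def publish(self, src, asbo):
        gbo = self.endpoints[src].to_gbo(asbo)                  # transformation
        missing = [f for f in self.REQUIRED if f not in gbo]
        if missing:                                             # validation
            raise ValueError(f"generic object lacks {missing}")
        for dst in self.routes.get(src, []):                    # routing
            target = self.endpoints[dst]
            target.inbox.append(target.from_gbo(gbo))


bus = Bus()
bus.register(EndPoint("mam",
                      lambda a: {"id": a["asset_id"], "title": a["name"]},
                      lambda g: {"asset_id": g["id"], "name": g["title"]}))
bus.register(EndPoint("nrcs",
                      lambda a: {"id": a["story_ref"], "title": a["slug"]},
                      lambda g: {"story_ref": g["id"], "slug": g["title"]}))
bus.route("mam", "nrcs")
bus.publish("mam", {"asset_id": "A1", "name": "Flood report"})
print(bus.endpoints["nrcs"].inbox)
```

Neither end-point knows the other's data model; each only knows how to convert to and from the generic object, which is what keeps the coupling loose.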
A layered architecture was built on top of the ESB. It not only abstracts away specific data
models but also hides the complexity of specific end-point interactions from the upper layers.
This will allow a business process management layer to be built safely on top at a later stage (BSL
in Figure 5) and makes it easy to replace end-points with new products or versions.
One of the main integration challenges is to overcome the initial immaturity, from an advanced IT
perspective, of many tools and products used in media production. In many cases, integration APIs
and associated data models are rudimentary, poorly documented and heterogeneous. The
support for SOA and its associated technologies is rather limited in typical media production work
centres. The End Point Layer in the architecture hides these deficiencies from the higher layers in
the integration architecture by providing a clean and standardized web service interface. Ardome
played an important role as a hub in integrating these work centres with each other. For this,
SOAP-compliant Ardome APIs and an event mechanism had to be developed.
1.2.8 Lessons learned
The design phase of the integrations is time consuming, but the development cycle is relatively
short.
The integration platform is the key to a well-performing factory: it allows the components and
their use to be aligned with the business process, and makes future extensions easier.
A common data model is necessary to pass various metadata representations (P/META, MXF,
NewsML-G2, etc.) from one system to another.
Although this is within reach of standard IT technology, designing and building the storage
system and the supporting network are far from trivial and constitute projects in their own right.
Key vendors were willing to tackle and fix the MXF problems, allowing us to reach our goal of a
fully integrated media factory, even though this required a lot of attention and effort on our side.
A good dialogue and cooperation with users is essential: give clear feedback on what technology
can and can’t do (demo sessions as early as possible), involve users in testing.
1.3 Metadata for News: NewsML-G2
By Maarten Verwaest, Senior researcher VRT Medialab
NewsML (News Markup Language) is a family of computer languages used to formally describe
news items and intended to support communication between systems (not humans). It is optimised
for the distribution of raw material (including, separately, text, photos, graphs, audio or video) by News
agencies. Developed by Reuters in 1998 and based on XML, it has been an industry standard managed by
the IPTC (International Press Telecommunications Council) since 2000⁴.
But NewsML, being a compromise, has become complicated and has proved ineffective.
The 2nd Generation NewsML-G2 should provide:
enhanced interoperability between various News item providers (AFP, DowJones, EBU,
Reuters, Documentation services, Archives, correspondents by e-mail, telex services, etc.) and
clients;
extensibility (e.g. to include a subtitle standard);
clarity and compactness of the syntax;
ease of storage in item packages / items / fragments, allowing better resource management
(specific items versus entire feeds) and random access;
ease of processing for relational or object mapping;
a focus on "semantic" capabilities (with thesaurus and taxonomy) to allow future 'intelligent'
NRCS, search tools and Content Management Systems to process the overflow of redundant
information, to be aware of potentially relevant items, to remove duplicates and detect updates,
and so drastically improve search & retrieve.
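The simplest, identifier-based form of duplicate removal and update detection can be sketched as follows. It assumes that every incoming item carries a stable identifier and a version number (NewsML-G2 items have `guid` and `version` attributes); the dictionary layout and function name are purely illustrative, and the semantic matching envisaged above would go well beyond this.

```python
def merge_feed(store: dict, incoming: list) -> tuple:
    """Merge incoming items into store (guid -> item).

    Returns three lists of guids: new items, updated items, duplicates.
    """
    new, updated, duplicates = [], [], []
    for item in incoming:
        guid = item["guid"]
        known = store.get(guid)
        if known is None:
            store[guid] = item
            new.append(guid)
        elif item["version"] > known["version"]:
            store[guid] = item          # a newer version replaces the old one
            updated.append(guid)
        else:
            duplicates.append(guid)     # same or older version: ignore
    return new, updated, duplicates


store = {}
feed = [
    {"guid": "urn:example:1", "version": 1, "headline": "Storm hits coast"},
    {"guid": "urn:example:1", "version": 1, "headline": "Storm hits coast"},
    {"guid": "urn:example:1", "version": 2, "headline": "Storm hits coast, update"},
    {"guid": "urn:example:2", "version": 1, "headline": "Election date set"},
]
print(merge_feed(store, feed))
```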
Presently:
IPTC is finalising the standard⁵
EBU launched a Beta-programme P/NEWSML⁶
IBBT/VRT medialab is developing reference software⁷
CWI/IBBT will submit a European project proposal - FP7 (Digital Libraries)⁸
At VRT, the pre-developed data model (§ 1.2.6 and Figure 6), as a reference model providing a
common meaning of entities and attributes, will make it possible to integrate NewsML-G2.
1.4 Security & continuity of service
"Security and continuity of service: managing the unacceptable black screen risk" by Francis Van
Werde, Project Manager digitization video and Dieter Boen, IT-analyst
⁴ http://www.ebu.ch/en/technical/trev/trev_287-allday.pdf?display=EN
⁵ http://www.iptc.org/std-dev/NewsML-G2/1.0/
Technical Forum http://tech.groups.yahoo.com/group/newsml-g2
⁶ http://wiki.ebu.ch/technical/P/MAG
http://www.ebu.ch/metadata/NewsML/P-newsML001_NewsMeetingPresentation.pdf
⁷ http://multiblog.vrt.be/medialab/category/research/competence/newsml/
⁸ http://newsml.cwi.nl
Some 'worst nightmare' stories told by the participants illustrate the need for well-designed
architecture and emergency plans:
After 2 years of using the Avid Unity server system, there were recurrent disk and array failures,
making it difficult to produce the News. The decision was taken to go back to tapes for 2 weeks,
the time needed for Avid to change the 120 disks of the server (Western Digital replacing Maxtor disks,
and Mediarray version 3 replacing ZX2) and to reconstruct the content. A few disks still fail
from time to time, but the replacement procedure now runs well.
An inexperienced technician performed a critical consolidation process on the weekly 700-hour
Profile server. When he saw that files were being deleted, he stopped the process. This
had happened before and there is a procedure to restart the process correctly. The technician
did not apply this procedure and restarted the process in the wrong way. More than 4 days of
News items were lost!
A journalist brought in a small hub to extend the number of network points they had at their
desk. This was not approved equipment and IT teams were not informed. For no obvious
reason, many months after it had been added, it caused an imbalance in the network that
triggered a network storm from the main router it was connected to. The network became
virtually unusable for a number of hours until the device was physically located and removed. The
existing ban on unauthorised additions to the network is now being enthusiastically applied!
In the first week of tapeless News production, an 'on-line' (instead of 'off-line') recovery
procedure was launched, degrading the performance of the newsroom system and taking 3 days
to complete!
On the day of the launch of two new Newscasts on two different channels, an overload of the
company's Uninterruptible Power Supply (UPS) caused a major power failure at the ingest point,
in the 2 Digital News studios, in the editing booths and at the journalists'/producers'/editors' desks.
Luckily, the coffee machine was still working!
Two hard disk drives crashed within 5 minutes of each other (should never happen, claimed the vendor)…
luckily on the backup playout server!
VRT has centred its security and continuity of service strategy on 3 axes:
Redundancy in architecture and workflow
o The autonomous, loosely coupled work centres prevent technical incidents from rippling through
to other systems. An emergency workflow is available. For example, Editing normally delivers to
Playout and Archiving through the Media Asset Management system. But, in case of
emergency, a direct connection between Editing and Playout can be established, and there is
a redundant connection. Urgent production and broadcast operations can continue at all times,
even in case of multiple simultaneous system failures.
o The structure of the Central storage (Figure 2) in specific zones (Archive / Browse material /
'High-Res.' material / System components), each with its own redundancy (mirror) and back-up,
ensures a high degree of reliability.
o The (over)load of the units is permanently monitored.
Testing
A series of tests has been conducted: technical tests (1/2 year), functional tests, load and
performance tests (10 students simulating the work of 100 people for 1 month), user acceptance
tests, tests of the updates on a parallel platform…
Practice
Demo sessions with key users, dry runs and general rehearsals, evaluation sessions after the
newscasts.
The role of the key users has been essential for testing and configuring the system and for
adjusting the workflow and procedures. By teaching and coaching all users, they contribute strongly
to reducing the 'human factor' in security.
Concerning networking, ZDF has implemented:
a general LAN for all the office communication, including all the PCs of the Journalists;
a production LAN to which only the computers that are part of the production
environment are attached.
The communication between the two parts of the network is controlled by Firewalls and limited to
the necessary content.
There is no access to the Internet from the production LAN. In case of a virus attack on the general
LAN, the production LAN is completely isolated, which imposes restrictions on the Journalists. For
example, they then no longer have access to iNews from their own PCs, but only from special PCs within
the production LAN.
ZDF is currently defining common rules to handle files and storage devices from outside ZDF,
including USB sticks, P2 cards, external hard drives, files from the Internet… These rules are quite
strict, but this is a complicated subject, because there is always a tension between editorial
needs and the need to maintain network security in order to stay on air. In general, only P2
cards from ZDF crews may be attached directly to the production LAN. All other material
has to be checked and then transferred via secure gateways from the general LAN to the
production LAN. In uncertain cases the file has to be played out as an analog video signal, then
re-converted into a digital file and only then integrated into the production LAN (this is in fact the only
really secure way).
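The import rules described above amount to a small decision procedure, sketched below. The names, the medium label and the `check_conclusive` flag are hypothetical; outright rejection of material that fails the check is not described in the report and is therefore not modelled.

```python
from enum import Enum


class Route(Enum):
    DIRECT = "attach directly to the production LAN"
    GATEWAY = "transfer via secure gateway from the general LAN"
    ANALOG = "play out as analog video and re-digitise"


def import_route(medium: str, from_zdf_crew: bool, check_conclusive: bool) -> Route:
    """Pick the path by which material may enter the production LAN.

    check_conclusive is True when the security check gave a clear result,
    False in the 'uncertain cases' mentioned in the text.
    """
    if medium == "p2_card" and from_zdf_crew:
        return Route.DIRECT
    if check_conclusive:
        return Route.GATEWAY
    return Route.ANALOG


print(import_route("p2_card", from_zdf_crew=True, check_conclusive=True).value)
print(import_route("usb_stick", from_zdf_crew=False, check_conclusive=False).value)
```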
Figure 1 : VRT Modular and Integrated Newsroom Architecture
Figure 2 : Central storage architecture
[Figure not reproduced. Recoverable labels: four storage zones (Archive, Browse material,
'High-Res.' material, System components), served by Ardome/DB2 and TSM server clusters
connected through SAN switches to Fibre Channel and SATA disk arrays (single or mirrored;
capacities labelled 165 TB, 40 TB, 22.5 TB and 62.5 TB) and to tape robots. Service classes:
BUBE (Business as usual Best Effort), MCBE (Mission Critical Best Effort), MCRT (Mission
Critical Real Time) and Staging. Stream targets shown: 50 and 200 streams.]
Figure 3 : VRT 4-layer data model
[Figure not reproduced. Four layers, from top to bottom: ERP - Enterprise Control (logistic &
financial information); Product Engineering ("creative" processes, descriptive information);
Production (ingest, editing, assembly, play-out; digital production (technical) information);
Technological Platform (storage, servers, network; essence information).]
Figure 4 : ESB-based integration
[Figure not reproduced. App A and App B each exchange ASBOs with the Enterprise Service Bus,
where mediations convert them to and from GBOs around a central business process or process
mediation.]
GBO = Generic Business Object
ASBO = Application Specific Business Object
Mediation = transformation, routing, validation and processing of messages
Figure 5 : ESB-layered architecture
[Figure not reproduced. The only recoverable labels are 'Current' and 'Future'.]