IPTC NewsCodes: Controlled
Vocabularies for the News Media
Brendan Quinn, Managing Director, IPTC
Jennifer Parrucci, Senior Taxonomist, The New York Times
Tuesday 25 May 2021
Overview
© 2021 IPTC (www.iptc.org) All rights reserved 2
What is IPTC? What do we do?
What are IPTC NewsCodes?
How do we maintain the IPTC NewsCodes?
How can IPTC NewsCodes be used?
The future of IPTC NewsCodes
What is IPTC?
• The global technical
standards body of the news
media
• We are an open, non-
partisan, non-political
organisation dedicated to
promoting interoperable
standards and best practices
in the news and media
industries
• We are a registered non-
profit organisation funded by
History of IPTC
1965
• IPTC founded to
safeguard the
telecommunications
needs of the press
• Accredited by
International
Telecommunications
Union in 1967
1979
• Creation of IPTC 7901
format for binary
transmission of news
content
1990
• Information
Interchange
Model (IIM), the
first multimedia
news exchange
format, created
with NAA
1995
•Adobe adopts IIM in
Photoshop, which led
to the development
of the IPTC Photo
Metadata Standard in
2004
2001
•NewsML, the first
XML news exchange
standard, was
released
•First set of
NewsCodes
controlled
vocabularies released
2008
•NewsML-G2
family of
standards
released,
including
EventsML-
G2 and
SportsML-
G2
2010s
•rNews
•ninjs
•RightsML
•Video Metadata
Hub
2020s
• Trust &
Credibility
additions to
•IPTC Photo
Metadata in
Google Search
results
Our voting members
Our associate members
• Agencia EFE Spain
• ATC - Athens Technology Center Greece
• BVPA Germany
• CBC / Radio Canada, Canada
• CEPIC Europe
• DeFodi Images Germany
• DPP Digital Production Partnership United Kingdom
• EBU - European Broadcasting Union Europe
• epa - European Pressphoto Agency Europe
• ETX Studio France
• Fingerpost UK
• FotoWare Norway
• iMatrics Sweden
• JSI - Jožef Stefan Institute Slovenia
• Keystone SDA Switzerland
• KNA - Kath. Nachrichten-Agentur Germany
• Lusa Portugal
• Mainstream Data USA
• MENA Egypt
• Microstocksolutions LLC USA
• Naviga USA
• NTB - Norsk Telegrambyrå Norway
• PR Newswire UK
• Profium Finland
• PUBLISH, Inc South Korea
• STA - Slovenian Press Agency Slovenia
• Sourcefabric Czech Republic
• TASS Russia
• XML Team Solutions USA
• Yuanben China
• … plus 13 Individual and Honorary Members
Our startup members
Special membership category open to small companies before they
have become profitable. Launched in late 2020
IMATAG Metadata Authoring Systems Scribely
IPTC’s technical standards
IPTC Photo
Metadata
Video
Metadata
Hub
NewsCodes
Media Topics
Subject Codes
NewsML-G2 and
NewsML-1
RightsML
ninjs
rNews
SportsML-G2
EventsML-G2
Historical
standards:
•IPTC 7901
•IIM format
IPTC Photo Metadata:
The global standard used to describe pictures
• IPTC metadata has been used since 1990 to store
descriptive, administrative and rights metadata embedded
in image files
• Since Adobe adopted it in Photoshop in 1995 it has been
used in millions of image files
• Originally based on IPTC Information Interchange Model
(IIM) standard, now based on XMP embedded metadata
framework (introduced by Adobe, now an ISO standard).
– Most software supports both formats but we encourage users to use
XMP as it has more fields, unlimited field lengths and multi-language
support
• IPTC Photo Metadata is used by:
– All major camera manufacturers: Canon, Nikon, Sony etc
– Photo software vendors: Adobe Photoshop and Lightroom,
Photo Mechanic, GIMP etc
– Digital Asset Management systems: FotoWare, Extensis and many more
– Google image search results
IPTC Photo Metadata and Google Images
• In 2018, IPTC, CEPIC and
DMLA worked with Google
to display embedded IPTC
Credit, Copyright and
Creator fields in Google
Images search results
IPTC Photo Metadata drives the
Google Images Licensable badge
• In 2020, Google started using IPTC Photo
Metadata to link “Licensable Images” to a
place where the image can be purchased
or licensed
• A “licensable” badge is displayed in the
search results grid
• Links from Google search results go directly
to a page where users can license the image
• Works even when images are used on
third-party sites
– The badge isn’t shown on the thumbnail page, but “get this
image on” and “view licence details” in the side panel will
work
Video Metadata Hub
• IPTC’s Video Metadata Hub is a universal
metadata schema developed with the key
goal of storing and exchanging metadata
in a safe and reliable way.
• The Video Metadata Hub is a single set of
video metadata properties based on the
properties in multiple technical formats,
joining metadata across QuickTime, XMP,
MPEG4, MPEG7, MXF, EBUCore, PBCore
and NewsML-G2 plus manufacturer-
specific metadata formats such as Sony
XDCAM and Panasonic/SMPTE P2.
• These properties can be used for
describing the visible and audible
content, rights and licensing information,
administrative details and technical
characteristics of a video in a system-
independent way.
Video Metadata Hub: What problem does it
solve?
• Metadata is stored in different ways in
existing video formats. For video editors it is
very hard to move video metadata between
systems and the semantics are not always
clear.
• For example, location information can be
described in all of the ways described in the
table
• IPTC Video Metadata Hub describes two
fields, Location Shot and Location Shown,
with very clearly defined semantics, and the
fields are mapped to fields of existing video
formats, so metadata can be moved
between systems from different vendors.
QuickTime: location with role=0
MPEG7: creation/location
Sony XDCAM: PlanningMetadata/Properties/Meta[@name="Location"]
schema.org: locationCreated
SMPTE P2: ClipMetadata/Shoot/Location
EBUcore: coverage/spatial/location + typeLink="ivqu:locationShot"
EBUcore: coverage/spatial/location + typeLink="ivqu:locationShown" OR subject + typeLink="ivqu:locationShown"/subjectCode + subjectDefinition
IPTC News Exchange formats - summary
• IPTC 7901: Binary news exchange format created in 1979! Still used by many news agencies. No ongoing work.
• NewsML 1: First-generation multimedia news exchange format created in 2001 when XML was new. Still used by some agencies and publishers.
• NITF: News Industry Text Format, a generic XML article markup that is still used widely (although some agencies distribute articles in XHTML)
• NewsML-G2: Re-thought version of NewsML incorporating EventsML-G2 and SportsML-G2. New releases generally twice a year. Currently at v2.28 (November 2018).
• rNews: Semantic news markup format, incorporated into schema.org (led by Google)
• ninjs: News in JSON. Simpler schema intended for APIs, JSON databases (e.g. Elasticsearch, MongoDB) and lighter-weight use cases
DPP Metadata Exchange for News – using NewsML-G2
[Workflow diagram. Components: Metadata Watch Services, XML Creation Services, proxy (MP4), metadata, planning metadata]
1: Connect to XDCAM Air via API calls
2: Check every 30 seconds for new assets
3: Query asset metadata via XDCAM air API
4: Pass asset metadata to G2 XML creation services
5: Get MP4 from XDCAM air S3 bucket
6: Send NewsML-G2 XML and MP4 to AWS S3 bucket for Ooyala
7: Get G2 XML and MP4 by Ooyala services
8: Send NewsML-G2 XML and MP4 to AWS S3 bucket for Reuters
9: Get G2 XML and MP4 by Reuters Connect
RightsML: IPTC's Rights Expression Language
for the media industry
• In a modern media organisation, content comes from
many sources with many combinations of permissions and
restrictions
• The task of deciding which assets can be used on which
properties is complicated, error-prone and manually
intensive
• Mistakes can cost organisations millions in claimed
royalties and legal fees
• The solution: Machine Readable Rights – declaring in
markup a summary of what is usually contained in a legal
contract
• This ensures that both rights holders and clients can be
confident that the use of digital assets complies with the
appropriate rights and restrictions
• Allows publishers to quickly decide which assets can be
used in which contexts, or to flag assets for manual
checking
“Embargoed until XXX” “China out” “Web only”
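The three example restrictions above can be sketched as a machine-checkable rule set. This is a minimal illustration only: the `Restrictions` class and `can_use` function are hypothetical names, and the structure is a simplified stand-in rather than actual RightsML or ODRL syntax.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional, Set, Tuple


@dataclass
class Restrictions:
    """Hypothetical, simplified stand-in for a machine-readable rights statement."""
    embargo_until: Optional[datetime] = None                  # "Embargoed until XXX"
    excluded_countries: Set[str] = field(default_factory=set)  # "China out"
    allowed_channels: Optional[Set[str]] = None                # "Web only"; None means any channel


def can_use(r: Restrictions, *, when: datetime, country: str, channel: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for one intended use of an asset."""
    if r.embargo_until is not None and when < r.embargo_until:
        return False, "embargoed"
    if country in r.excluded_countries:
        return False, "country excluded"
    if r.allowed_channels is not None and channel not in r.allowed_channels:
        return False, "channel not permitted"
    return True, "ok"


# Illustrative asset carrying all three restrictions.
photo = Restrictions(
    embargo_until=datetime(2021, 5, 25, 12, 0, tzinfo=timezone.utc),
    excluded_countries={"CN"},
    allowed_channels={"web"},
)
```

A publishing system could call `can_use` for each planned use and route any refusal to manual checking, which matches the "flag assets for manual checking" idea on the slide.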
What does IPTC do? Events
• We run two face-to-face members meetings each year where we update on
new standards and allow members to share best practice and collaborate on
projects
– Recent meetings:
• Athens, Greece and Toronto, Canada in 2018
• Lisbon, Portugal and Ljubljana, Slovenia in 2019
• Online meetings in 2020 and 2021
– Upcoming meetings:
• Online, October 18–20, 2021
• Hopefully Tallinn, Estonia and New York City in 2022
• We run the IPTC Photo Metadata Conference in association with CEPIC, the
photo industry association
– Berlin in 2018, Paris in 2019, online in 2020 (with over 200 registrants!)
– 2021 meeting: online in Autumn
Diego Ceccarelli, Bloomberg at
IPTC Autumn Meeting 2018,
Toronto Canada
What does IPTC do? Software development
• We run development projects related
to our work
– The IPTC EXTRA open source rules-based
text classifier
– The GetPMD photo metadata checker
– The PhotoMetadata Explorer (available to
members only) to understand how
publishers use IPTC Photo Metadata
– Reference implementations of RightsML
parsers
– SportsML parser and SportsJS converter
• We work with open source software
developers to support their correct
implementation of IPTC standards
– Phil Harvey’s Exiftool
– SourceFabric’s SuperDesk newsroom
software
– ffmpeg, mediainfo
What does IPTC do? Partnerships and advocacy
• We provide advice to other groups and work with other associations on
standards
– W3C
• Joint work on Open Digital Rights Language (ODRL)
• RightsML is a “profile” of ODRL
– Schema.org
– JPEG / ISO
– Content Authenticity Initiative
– Trust Project, Journalism Trust Initiative
– CIPA (creators of EXIF)
• We also work with companies on implementing and
integrating our standards into their products
– Adobe
– Google
– Camera makers, newsroom software vendors, sports stats systems,
etc…
Overview
What is IPTC? What do we do?
What are IPTC NewsCodes?
How do we maintain the IPTC NewsCodes?
How can IPTC NewsCodes be used?
The future of IPTC NewsCodes
IPTC NewsCodes: concept taxonomies for news
IPTC subject classification systems: a quick
history
• IPTC originally created the Subject Reference System
(SRS) in 1995, in association with the Newspaper
Association of America (NAA)
– Defines many vocabularies such as Subject Qualifier, Genre,
Media Type, and Gender, but the main one is Subject
aka Subject Code
– All are available in an XML “TopicSet” format (part of NewsML 1)
– Subject contains 17 top-level Subjects, each with three-letter
alphabetic and 8-digit numeric references across three levels
– SubjectCodes are/were available in English, French, German,
Italian, Japanese, and Spanish
– Used within IPTC Photo Metadata Standard as “IIM 2:12
Subject Reference” (called “Subject Code” in the latest IPTC
Photo Metadata Standard) – and still used in Photoshop IPTC
metadata editor interface…
– Problems: too-strict hierarchy, unbalanced granularity, hard to
edit. Couldn’t accommodate requests from users.
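To make the three-level numeric references concrete, here is a minimal sketch of splitting an 8-digit Subject Code into its levels. It assumes the usual 2 + 3 + 3 digit layout (Subject, Subject Matter, Subject Detail) with zero-filled lower levels; the function and class names are illustrative, not part of any IPTC tool.

```python
from typing import NamedTuple, Optional


class SubjectCode(NamedTuple):
    subject: str            # first 2 digits: top-level Subject
    matter: Optional[str]   # next 3 digits: second level, None if zero-filled
    detail: Optional[str]   # last 3 digits: third level, None if zero-filled
    level: int              # 1, 2 or 3


def parse_subject_code(code: str) -> SubjectCode:
    """Split an 8-digit reference, assuming a 2+3+3 digit layout."""
    if len(code) != 8 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected 8 digits, got {code!r}")
    subject, matter, detail = code[:2], code[2:5], code[5:]
    if matter == "000":
        if detail != "000":
            raise ValueError(f"third level set without second level: {code!r}")
        return SubjectCode(subject, None, None, 1)
    if detail == "000":
        return SubjectCode(subject, matter, None, 2)
    return SubjectCode(subject, matter, detail, 3)


def parent_code(code: str) -> Optional[str]:
    """Return the code one level up, or None for a top-level Subject."""
    sc = parse_subject_code(code)
    if sc.level == 1:
        return None
    if sc.level == 2:
        return sc.subject + "000000"
    return sc.subject + sc.matter + "000"
```

Because the hierarchy is encoded in the digits themselves, a term cannot be moved without changing its identifier, which is one way to read the "too-strict hierarchy" problem noted above.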
An excerpt from the SRS documentation
This led to MediaTopics
Media Topics: IPTC’s main subject vocabulary
• We had many requests for new terms in the Subject Codes
vocabulary that we couldn’t accommodate due to its strict
hierarchy
• So in 2010, the Media Topics vocabulary was born
• It contains around 1,200 terms for categorising news
content
• Terms and definitions in 12 languages/variants (so far):
– Arabic, Simplified Chinese, Danish, English (British and
US), French, German, Norwegian, Portuguese (for
Portugal and Brazil), Spanish, Swedish
• Also 200+ controlled vocabularies for news content
– Audio Codec, Colorspace, Contributor Role, News
Provider
Human-readable views
https://cv.iptc.org/newscodes/mediatopic/
https://show.newscodes.org/
https://www.iptc.org/std/NewsCodes/treeview/
https://www.iptc.org/std/NewsCodes/IPTC-MediaTopic-NewsCodes.xlsx
Machine-readable views
• All NewsCodes are maintained as NewsML-G2 Knowledge
Items, mapped to W3C SKOS
• Available as:
– NewsML-G2 Knowledge Items (XML)
– SKOS RDF/XML
– SKOS RDF/Turtle
– SKOS RDF/JSON-LD
• Can be obtained from cv.iptc.org via URL query params or
HTTP content negotiation
• Examples:
– https://cv.iptc.org/newscodes/mediatopic/20000008?lang=en-GB&format=json
– https://cv.iptc.org/newscodes/genre/?lang=x-all&format=rdfxml
– wget -O IPTCscene.ttl http://cv.iptc.org/newscodes/scene/?format=rdfttl
• See guidelines at https://www.iptc.org/std/NewsCodes/guidelines/cv.iptc.org-guidelines.html
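The URL patterns in the examples above can be built programmatically. The following is a minimal sketch using only the Python standard library; the helper names `cv_url` and `fetch` are hypothetical, and the only parameter values used are the ones shown in the examples (`lang`, `format=json`, `format=rdfxml`, `format=rdfttl`). The `Accept` header values a caller might pass for content negotiation are an assumption to be checked against the guidelines.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE = "https://cv.iptc.org/newscodes"


def cv_url(scheme: str, code: str = "", *, lang: Optional[str] = None,
           fmt: Optional[str] = None) -> str:
    """Build a cv.iptc.org URL for a whole vocabulary (code="") or a single term."""
    url = f"{BASE}/{scheme}/{code}"
    params = {k: v for k, v in (("lang", lang), ("format", fmt)) if v}
    return f"{url}?{urlencode(params)}" if params else url


def fetch(url: str, accept: Optional[str] = None, timeout: float = 30.0) -> bytes:
    """Download a representation; pass `accept` to try HTTP content negotiation instead."""
    headers = {"Accept": accept} if accept else {}
    with urlopen(Request(url, headers=headers), timeout=timeout) as resp:
        return resp.read()


# Reproduces the first two example URLs from the slide (no network access needed).
term_url = cv_url("mediatopic", "20000008", lang="en-GB", fmt="json")
genre_url = cv_url("genre", lang="x-all", fmt="rdfxml")
```

Calling `fetch(cv_url("scene", fmt="rdfttl"))` would be the rough equivalent of the `wget` example.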
More than just subjects/topics
• CVs for Audio Codec, Colorspace,
Contributor Role, Financial Instrument
Type, Genre, News Provider, Trust
Indicator, Video Scaling, etc…
• Most are used in NewsML-G2 files for
conveying information about
news/media content
• We keep references to third-party
codes such as ISO Country Codes,
BCP47 Language tags etc
• All NewsCodes are accessible from
https://cv.iptc.org/newscodes/
More than just subjects/topics (2)
• We have dozens of vocabularies
for sports, used by SportsML,
SportsJS and a possible
SportsRDF standard:
– Event outcome (win, loss, tie/draw,
undecided)
– Professional status
(amateur/professional)
– Core statistics (score, attempts on
goal, possession time, …)
– American Football Penalty types
(illegal pass, roughing, illegal
motion, …)
– Basketball Player Roles (shooter,
blocker, assist, …)
– ... And MANY more
Overview
What is IPTC? What do we do?
What are IPTC NewsCodes?
How do we maintain the IPTC NewsCodes?
How can IPTC NewsCodes be used?
The future of IPTC NewsCodes
The IPTC NewsCodes Working Group
• Group Objective: The maintenance of
IPTC metadata controlled vocabularies
for categorizing content – branded as
NewsCodes – and the creation of new
ones when required
• Main focus is usually on Media Topics
but the group also maintains the other
NewsCodes
• Work on Sports NewsCodes is done
jointly with the IPTC Sports Content
Working Group
• Also functions as a collaboration/discussion
group for taxonomists from media
organisations
Recent work:
• Media Topics Additions, edits, definition reviews
• Genre Review
• Media Topics → Wikidata Mapping
• Adding and maintaining translations for
MediaTopics
• Entities discussions
Criteria for adding new terms
Balance with granularity in other branches of the Media Topics taxonomy
Will adding the proposed Media Topic(s) keep the level of granularity fairly even across the
vocabulary?
Threshold of coverage
Is the proposed Topic likely to be covered with reliably regular or seasonal frequency by
participating member media organisations?
Not covered by another term or combination of terms
Does the proposed Topic have distinct semantics and scope from all existing Media Topics, or
combination of Media Topics?
Supported and/or validated by other general taxonomies
Does the proposed Topic have representation in external widely available general subject
taxonomies, such as Wikidata, Library of Congress etc.?
Does not belong to any highly specialised taxonomies
The proposed Topic should not have representation in specialised domain taxonomies such
as Medical Subject Headings (MeSH).
https://iptc.org/std/NewsCodes/guidelines/#_when_should_a_term_be_added_to_media_topics
Examples of recent changes to MediaTopics
Recently retired terms:
• sports facilities – Use sport venue instead
• inline skating – Use roller sports instead
• accomplishment – Use award and prize or record and
achievement instead
• people – Use more specific terms instead
Recent hierarchy changes:
• dropped criminal investigation is now a child of
investigation (criminal)
• bodybuilding is now a child of competition discipline
• award and prize and record and achievement were
moved to the top level of human interest because we
retired the parent term accomplishment
• birthday, celebrity and high society were moved to
under the top level human interest term because we
retired the parent term people.
Recently added terms:
• men
• women
• women’s rights
• animal abuse
• border disputes
• cheerleading
• child care
• disinformation and misinformation
Recent label changes:
• online media -> online media outlet
• traffic crime -> reckless driving
• trade fair -> trade show or expo
• online media -> online media industry
• parliament -> legislative body
• synchronised swimming -> artistic swimming (as per IOC
changes for 2020 Olympics)
How we work – GitHub, Zoom, groups.io
• New terms are suggested by IPTC members via
Slack, email, GitHub, groups.io list or external
users on our public discussion list
iptc-newscodes@groups.io
• We track, prioritise and classify suggestions using
GitHub issues
• We discuss suggestions on GitHub issues and at
our bi-weekly Zoom calls
• When agreed, we make changes to the XML files,
re-generate the public assets and upload them to
the site
• We also regularly re-visit existing branches to
update labels, definitions and hierarchy
• Suggestions can take from 2 weeks to a few
months to be adopted, depending on urgency
and time to next release
How we work – translations
• Currently we have a fairly manual workflow for
dealing with language translations
• We maintain terms in en-GB and en-US, and then
send updates to translation partners who submit
their work in Excel spreadsheets
• We can then merge changes directly from
spreadsheets into our XML files using a custom
script
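The merge step described above might look roughly like the sketch below. It is not IPTC's actual script: it assumes a CSV export of the spreadsheet with hypothetical columns `qcode`, `lang` and `name`, and a simplified, namespace-free XML layout loosely modelled on NewsML-G2 concepts. The Swedish label is illustrative, not an official translation.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Simplified stand-in for a vocabulary file (real files use the NewsML-G2 namespace).
SOURCE_XML = """<conceptSet>
  <concept>
    <conceptId qcode="medtop:20000500"/>
    <name xml:lang="en-GB">animal</name>
  </concept>
</conceptSet>"""

# Hypothetical spreadsheet export from a translation partner.
SHEET_CSV = """qcode,lang,name
medtop:20000500,sv,djur
medtop:20000500,en-GB,animal
medtop:99999999,sv,unknown term
"""


def merge_translations(root: ET.Element, rows) -> int:
    """Add or update <name xml:lang=...> elements; return the number of changes made."""
    by_qcode = {c.find("conceptId").get("qcode"): c for c in root.iter("concept")}
    changed = 0
    for row in rows:
        concept = by_qcode.get(row["qcode"])
        if concept is None:
            continue  # unknown term: a real script would probably report this
        name = next((n for n in concept.findall("name")
                     if n.get(XML_LANG) == row["lang"]), None)
        if name is None:
            name = ET.SubElement(concept, "name", {XML_LANG: row["lang"]})
        if name.text != row["name"]:
            name.text = row["name"]
            changed += 1
    return changed


root = ET.fromstring(SOURCE_XML)
rows = list(csv.DictReader(io.StringIO(SHEET_CSV)))
changes = merge_translations(root, rows)
```

Counting changes rather than rows makes repeated runs idempotent: merging the same spreadsheet a second time reports zero changes.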
Future translations
• We’re considering using a translation platform to
handle this workflow - https://crowdin.com/?
https://lokalise.com/?
• We could then integrate translation with our
GitHub repository
Staying up to date
• We generally release updates every
two to three months
• New changes are announced on:
– iptc.org/news as a blog post
– iptc-newscodes@groups.io
– Twitter @iptcupdates
– Internal Slack and email discussion
lists for IPTC members
• There’s no obligation to take
every update straight away, it’s
up to you
Overview
What is IPTC? What do we do?
What are IPTC NewsCodes?
How do we maintain the IPTC NewsCodes?
How can IPTC NewsCodes be used?
The future of IPTC NewsCodes
Classification of news content
• Some users of Media Topics:
– Swedish news agency TT and their customers
– Danish news agency Ritzau
– Agence France-Presse (AFP)
– dpa (Deutsche Press-Agentur)
– NTB Norway
– ABC Australia
• Many of these use MediaTopics internally or
in agency-to-publisher feeds, not available
to the public
• Content Management Systems
• Auto-tagging and language analysis tools
(e.g. iMatrics, expert.ai)
• Media Monitoring services (e.g.
http://www.pressmonitor.ca/)
News Credibility and Mis/disinformation
• We don’t want to make yet another
standard. We want to work with existing
groups to ensure that their work can be
applied to our standards, and will extend
ours where necessary
• We worked with industry groups (such as
the Trust Project and Journalism Trust
Initiative / Reporters Sans Frontières) to help
them to:
– define journalistic credibility indicators
– publish indicators in a useful way
– map indicators to our existing news metadata
standards
• We help fact checking groups to best use
our standards and technologies to make
fact-checking easier and more scalable
NewsCodes to support Trust & Credibility
• We added a new IPTC
NewsCodes trustindicator
vocabulary
• New terms in IPTC NewsCodes
genre vocabulary
• The vocabularies can be used
in NewsML-G2 and ninjs
documents (or anywhere else)
• Guideline document:
“Expressing Trust and
Credibility Information in IPTC
Standards” (currently in late
draft form)
Trust indicators used to enhance news for users
• Agencies can distribute
news marked up with
Trust Indicators using
our CVs
• Publishers can pass on
the indicators in
schema.org markup
• Platforms can use the
schema.org markup to
flag credibility to users
• Users can use tools such
as NewsGuard to interpret
and analyse credibility
markup
Extending MediaTopics for your own use
• We often receive requests from media organisations to add terms that would be too specific for
us to add to the general MediaTopics vocabulary…
• … but that shouldn’t stop you from adding terms yourself!
• For example, our CV stops at the term “animal” (medtop:20000500). To add your own animal
terms, you can “mint” your own IDs within your own namespace.
• Create a new term and use the skos:broader relationship to show that it’s a child of the
MediaTopic term
• You can then import your new terms into your database without any fear of falling out of sync
with future MediaTopics updates
medtop:08000000 human interest - Item that discusses individuals, groups, animals, plants or other objects in an emotional way
  medtop:20000500 animal - Human interest and cultural stories related to animals, including pets
    mycv:12345 cat - Small domesticated carnivorous mammal with soft fur
    mycv:23456 lion - Species of big cat often found in Africa and Asia
    mycv:34567 penguin - Large flightless seabird of the southern hemisphere
  medtop:20001237 anniversary - The celebration or commemoration of …
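The tree above can be expressed as skos:broader links, sketched here in Python. The `mycv` namespace URI (`https://example.org/mycv/`) and the helper names are assumptions for illustration; the `medtop` prefix is assumed to expand to the cv.iptc.org Media Topic URIs.

```python
from typing import Dict, List

PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "medtop": "http://cv.iptc.org/newscodes/mediatopic/",  # assumed expansion of medtop:
    "mycv": "https://example.org/mycv/",                    # hypothetical private namespace
}

# child -> parent, i.e. one skos:broader link per term (taken from the tree above)
BROADER: Dict[str, str] = {
    "medtop:20000500": "medtop:08000000",
    "medtop:20001237": "medtop:08000000",
    "mycv:12345": "medtop:20000500",
    "mycv:23456": "medtop:20000500",
    "mycv:34567": "medtop:20000500",
}


def ancestors(term: str, broader: Dict[str, str]) -> List[str]:
    """Walk skos:broader links upwards, nearest ancestor first."""
    chain, seen = [], {term}
    while term in broader:
        term = broader[term]
        if term in seen:
            raise ValueError(f"cycle in broader links at {term}")
        seen.add(term)
        chain.append(term)
    return chain


def concept_turtle(term: str, label: str, parent: str) -> str:
    """Serialise one privately minted term as a Turtle snippet."""
    safe = label.replace("\\", "\\\\").replace('"', '\\"')
    header = "".join(f"@prefix {p}: <{uri}> .\n" for p, uri in PREFIXES.items())
    return (f"{header}\n{term} a skos:Concept ;\n"
            f'    skos:prefLabel "{safe}"@en ;\n'
            f"    skos:broader {parent} .\n")


cat_ttl = concept_turtle("mycv:12345", "cat", BROADER["mycv:12345"])
```

Since the private terms only point at Media Topic IDs and never modify them, a content item tagged `mycv:12345` can still be rolled up to `medtop:20000500` for exchange with partners who know only the shared vocabulary.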
Media Topics → Wikidata Mapping
• In the past we had mapped the top 2 levels
of MediaTopics to Wikidata
• The project to map all the MediaTopics
arose from an IPTC member’s need to have
links from Wikidata back to MediaTopics
because they are building a classification
tool that will use Wikidata IDs.
• We now have mappings for 93% of
MediaTopics
• Some broader terms couldn’t be mapped,
so we plan to create those terms in
Wikidata
Overview
What is IPTC? What do we do?
What are IPTC NewsCodes?
How do we maintain the IPTC NewsCodes?
How can IPTC NewsCodes be used?
The future of IPTC NewsCodes
Wikidata → Media Topics Mapping
• We map from MediaTopics to Wikidata, but
not the other way around
• Wikidata has a property “IPTC NewsCode”
which maps Wikidata entities back to
MediaTopics
• This has been updated ad-hoc by Wikidata
users: already has over 700 mappings
• We plan to update these automatically with
a script and continuously maintain them as
we continue our work
• Coming later in 2021
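One way such a sync script could decide what to write is sketched below. It compares IPTC's own Media Topic to Wikidata mapping with the back-links already present on Wikidata, and reports links to add and links that disagree. The function name and the second and third sample entries are hypothetical; the 20000285 / Q253613 pair is taken from the URLs on the "Mapping to other CVs" slide, on the assumption that they describe the same concept.

```python
from typing import Dict, List, Tuple


def plan_updates(ours: Dict[str, str], on_wikidata: Dict[str, str]
                 ) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str, str]]]:
    """Compare mappings.

    ours:        Media Topic ID -> Wikidata QID (maintained by IPTC)
    on_wikidata: Wikidata QID -> Media Topic ID (current back-links on Wikidata)
    Returns (to_add, conflicts): to_add holds (qid, topic) pairs missing on Wikidata,
    conflicts holds (qid, topic_on_wikidata, topic_in_ours) triples for human review.
    """
    to_add, conflicts = [], []
    for topic, qid in sorted(ours.items()):
        current = on_wikidata.get(qid)
        if current is None:
            to_add.append((qid, topic))
        elif current != topic:
            conflicts.append((qid, current, topic))
    return to_add, conflicts


ours = {
    "20000285": "Q253613",   # pair shown on the "Mapping to other CVs" slide
    "00000001": "Q_EXAMPLE_A",  # hypothetical placeholder IDs
    "00000002": "Q_EXAMPLE_B",
}
on_wikidata = {
    "Q253613": "20000285",      # already correct, nothing to do
    "Q_EXAMPLE_B": "00000009",  # disagrees with our mapping
}
to_add, conflicts = plan_updates(ours, on_wikidata)
```

Keeping conflicts separate from additions means the script never silently overwrites an edit made by a Wikidata user.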
Mapping to other CVs via Wikidata
https://cv.iptc.org/newscodes/mediatopic/20000285
https://www.wikidata.org/wiki/Q253613
https://www.bbc.com/news/topics/c8nq32jw8vjt/personal-finance
https://id.loc.gov/authorities/subjects/sh85048263.html
Media Topics for contextual advertising?
• We have had discussions with several
publishers who are considering using Media
Topics for contextual advertising
• As more users opt out of behavioural
tracking cookies, contextual advertising (i.e.
advertising based on the content of the page
rather than the behaviour of the user) is a
promising alternative
• But to do this, advertisers need a
standardised vocabulary to target against
• IAB taxonomy is very advertiser- and
product-oriented, not really suitable for news
content
• Could MediaTopics provide a solution?
https://wan-ifra.org/2021/05/emotion-and-intent-based-contextual-advertising-emerging-as-promising-alternatives-to-third-party-cookies/
Future work: Named entities?
• Many members have asked whether we will
support named entities (people, places,
organisations, events)
• News by its nature needs new entities all the
time
• New names are typically added to Wikidata 1-2
months after they emerge (but this is improving)
• If news organisations all add a new entity as part
of their workflow, we could end up with dozens
of duplicates
• Going back to edit previously tagged stories to
fix duplicates is a hassle
• In discussions, we agreed that IPTC doesn’t have
the resources to build and maintain a news
entity database for the world
• But perhaps a “best practices guide” would be
Summary: NewsCodes and MediaTopics
Resources
• Info on NewsCodes
• Info on MediaTopics
• IPTC NewsCodes Guidelines
• Human-readable views of
MediaTopics:
– CV Server
– Excel
– HTML tree view
– Interactive tree view
• Guidelines for machine-readable CVs
• Public discussion group on groups.io
• For IPTC members:
– GitHub repository
– Members-only discussion group
– Slack channel
Thank you!
Brendan Quinn
Managing Director
IPTC
mdirector@iptc.org
@brendanquinn on Twitter
Jennifer Parrucci
Senior Taxonomist,
The New York Times
jennifer.parrucci@nytimes.com
Appendices
More info on Trust Indicators
Ideal “B2B2C” workflow for trust indicators
Mappings between Trust Project, JTI and IPTC
standards
Mappings are generally direct to NewsML-G2 tags
or “typed links” – a URL with a “qualifier”
showing what kind of link is being described.
IPTC Trust Indicator taxonomy - Hierarchy
• Editorial policy
– Corrections policy, coverage policy, editorial
diversity, ethics, feedback, no-bylines, unnamed-
sources
• Party Info
– Archive, award, knows about, knows language
– Journalist info
• Biography, journalist image
– Organisation info
• Diversity policy, diversity staffing report,
ownership and funding info, senior editorial
staff (masthead)
• Piece of work info
– Corrects resource / corrected by resource
– External / internal source materials, secondary
Browsable web view
Wikidata RDF and SPARQL interface
• Wikidata exposes interfaces in RDF and SPARQL
• Queries can inspect Wikidata entities by type or
property:
  # People who received both an Academy Award and a Nobel Prize
  SELECT DISTINCT ?person ?personLabel WHERE {
    ?person wdt:P166/wdt:P31? wd:Q7191 .
    ?person wdt:P166/wdt:P31? wd:Q19020 .
    SERVICE wikibase:label {
      bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
    }
  }
• “wdt:P166” is “award received”
• “/wdt:P31?” is a property path step meaning “optionally
followed by ‘instance of’”, so the award can be the
item itself or any instance of it
• “wd:Q7191” is “Nobel Prize”
• “wd:Q19020” is “Academy Awards”
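The query above can be sent to the public Wikidata SPARQL endpoint from a script. This is a minimal standard-library sketch, not an official client: the helper names and the User-Agent string are placeholders, and the endpoint's usage policy should be checked before running it at volume.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT DISTINCT ?person ?personLabel WHERE {
  ?person wdt:P166/wdt:P31? wd:Q7191 .
  ?person wdt:P166/wdt:P31? wd:Q19020 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" .
  }
}
"""


def build_request(query: str,
                  user_agent: str = "newscodes-example/0.1 (placeholder contact)") -> Request:
    """Prepare a GET request asking for SPARQL JSON results."""
    url = f"{ENDPOINT}?{urlencode({'query': query, 'format': 'json'})}"
    return Request(url, headers={
        "Accept": "application/sparql-results+json",
        "User-Agent": user_agent,
    })


def run(query: str, timeout: float = 60.0):
    """Execute the query (needs network access) and return (entity URI, label) pairs."""
    with urlopen(build_request(query), timeout=timeout) as resp:
        data = json.load(resp)
    return [(b["person"]["value"], b["personLabel"]["value"])
            for b in data["results"]["bindings"]]
```

`run(QUERY)` performs the network call; `build_request` is kept separate so the request can be inspected or logged without contacting the endpoint.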

Contenu connexe

Tendances

Building the Metaverse
Building the Metaverse Building the Metaverse
Building the Metaverse Reuben Steiger
 
リーンスタートアップ本を振り返る 2018 (Lean Startup Update! 2018)
リーンスタートアップ本を振り返る 2018 (Lean Startup Update! 2018)リーンスタートアップ本を振り返る 2018 (Lean Startup Update! 2018)
リーンスタートアップ本を振り返る 2018 (Lean Startup Update! 2018)Takaaki Umada
 
31+ Startup Tools, Both Online & Offline
31+ Startup Tools, Both Online & Offline31+ Startup Tools, Both Online & Offline
31+ Startup Tools, Both Online & OfflinePixc
 
コミュニティマネジメントとは何か、なぜ今重要か / これから始めるコミュニティマネジメント入門 (1)
コミュニティマネジメントとは何か、なぜ今重要か / これから始めるコミュニティマネジメント入門 (1)コミュニティマネジメントとは何か、なぜ今重要か / これから始めるコミュニティマネジメント入門 (1)
コミュニティマネジメントとは何か、なぜ今重要か / これから始めるコミュニティマネジメント入門 (1)Takaaki Umada
 
Airbyte - Series-B deck
Airbyte - Series-B deckAirbyte - Series-B deck
Airbyte - Series-B deckAirbyte
 
Sensemaking Frameworks for the Metaverse & XR Ethics by Kent Bye
Sensemaking Frameworks for the Metaverse & XR Ethics by Kent ByeSensemaking Frameworks for the Metaverse & XR Ethics by Kent Bye
Sensemaking Frameworks for the Metaverse & XR Ethics by Kent ByeKent Bye
 
スタートアップを始める前に
スタートアップを始める前にスタートアップを始める前に
スタートアップを始める前にTakaaki Umada
 
研究を加速するスタートアップ 2017
研究を加速するスタートアップ 2017研究を加速するスタートアップ 2017
研究を加速するスタートアップ 2017Takaaki Umada
 
マイクロアドのデータ基盤について アドテクを支える基盤〜10Tバイト/日のビッグデータを処理する〜
マイクロアドのデータ基盤について アドテクを支える基盤〜10Tバイト/日のビッグデータを処理する〜マイクロアドのデータ基盤について アドテクを支える基盤〜10Tバイト/日のビッグデータを処理する〜
マイクロアドのデータ基盤について アドテクを支える基盤〜10Tバイト/日のビッグデータを処理する〜MicroAd, Inc.(Engineer)
 
あなたのスタートアップのアイデアの育てかた
あなたのスタートアップのアイデアの育てかたあなたのスタートアップのアイデアの育てかた
あなたのスタートアップのアイデアの育てかたTakaaki Umada
 
VR Entertainment goes to XR metaverse
VR Entertainment goes to XR metaverseVR Entertainment goes to XR metaverse
VR Entertainment goes to XR metaverseGREE VR Studio Lab
 
スマートフォンアプリ企画書ver.0.1
スマートフォンアプリ企画書ver.0.1スマートフォンアプリ企画書ver.0.1
スマートフォンアプリ企画書ver.0.1tmr2013
 
Designing for Privacy in an Increasingly Public World — Speed Talk
Designing for Privacy in an Increasingly Public World — Speed TalkDesigning for Privacy in an Increasingly Public World — Speed Talk
Designing for Privacy in an Increasingly Public World — Speed TalkRobert Stribley
 
セールスアニマルになろう スタートアップ初期の営業戦略
セールスアニマルになろう スタートアップ初期の営業戦略セールスアニマルになろう スタートアップ初期の営業戦略
セールスアニマルになろう スタートアップ初期の営業戦略Takaaki Umada
 
地域おこし協力隊からはじめる身の丈起業のすすめ200801
地域おこし協力隊からはじめる身の丈起業のすすめ200801地域おこし協力隊からはじめる身の丈起業のすすめ200801
地域おこし協力隊からはじめる身の丈起業のすすめ200801TORU ISHIZAKA
 
変革のためのスタートアップ思考 (1) / スタートアップの考え方を理解する
変革のためのスタートアップ思考 (1) / スタートアップの考え方を理解する変革のためのスタートアップ思考 (1) / スタートアップの考え方を理解する
変革のためのスタートアップ思考 (1) / スタートアップの考え方を理解するTakaaki Umada
 
Infographic: DC vs Marvel – The Battle of Brands
Infographic: DC vs Marvel – The Battle of BrandsInfographic: DC vs Marvel – The Battle of Brands
Infographic: DC vs Marvel – The Battle of Brandsdomain .ME
 

IPTC NewsCodes - Controlled Vocabularies for the News Media (EBU MDN Workshop 2021)

  • 1. IPTC NewsCodes: Controlled Vocabularies for the News Media Brendan Quinn, Managing Director, IPTC Jennifer Parrucci, Senior Taxonomist, The New York Times Tuesday 25 May 2021
  • 2. Overview © 2021 IPTC (www.iptc.org) All rights reserved 2 What is IPTC? What do we do? What are IPTC NewsCodes? How do we maintain the IPTC NewsCodes? How can IPTC NewsCodes be used? The future of IPTC NewsCodes
  • 3. What is IPTC? • The global technical standards body of the news media • We are an open, non-partisan, non-political organisation dedicated to promoting interoperable standards and best practices in the news and media industries • We are a registered non-profit organisation funded by
  • 4. History of IPTC • 1965: IPTC founded to safeguard the telecommunications needs of the press; accredited by the International Telecommunication Union in 1967 • 1979: Creation of the IPTC 7901 format for binary transmission of news content • 1990: Information Interchange Model (IIM), the first multimedia news exchange format, created with NAA • 1995: Adobe adopts IIM in Photoshop, which led to the development of the IPTC Photo Metadata Standard in 2004 • 2001: NewsML, the first XML news exchange standard, released; first set of NewsCodes controlled vocabularies released • 2008: NewsML-G2 family of standards released, including EventsML-G2 and SportsML-G2 • 2010s: rNews, ninjs, RightsML, Video Metadata Hub • 2020s: Trust & Credibility additions to • IPTC Photo Metadata in Google Search results
  • 5. Our voting members
  • 6. Our associate members • Agencia EFE Spain • ATC - Athens Technology Center Greece • BVPA Germany • CBC / Radio Canada, Canada • CEPIC Europe • DeFodi Images Germany • DPP Digital Production Partnership United Kingdom • EBU - European Broadcasting Union Europe • epa - European Pressphoto Agency Europe • ETX Studio France • Fingerpost UK • FotoWare Norway • iMatrics Sweden • JIS - Jozef Stefan Institute Slovenia • Keystone SDA Switzerland • KNA - Kath. Nachrichten-Agentur Germany • Lusa Portugal • Mainstream Data USA • MENA Egypt • Microstocksolutions LLC USA • Naviga USA • NTB - Norsk Telegrambyrå Norway • PR Newswire UK • Profium Finland • PUBLISH, Inc South Korea • STA - Slovenian Press Agency Slovenia • Sourcefabric Czech Republic • TASS Russia • XML Team Solutions USA • Yuanben China • … plus 13 Individual and Honorary Members
  • 7. Our startup members • Special membership category open to small companies before they have become profitable. Launched in late 2020. • Members: IMATAG, Metadata Authoring Systems, Scribely
  • 8. IPTC’s technical standards • IPTC Photo Metadata • Video Metadata Hub • NewsCodes (Media Topics, Subject Codes) • NewsML-G2 and NewsML-1 • RightsML • ninjs • rNews • SportsML-G2 • EventsML-G2 • Historical standards: IPTC 7901, IIM format
  • 9. IPTC Photo Metadata: The global standard used to describe pictures • IPTC metadata has been used since 1990 to store descriptive, administrative and rights metadata embedded in image files • Since Adobe adopted it in Photoshop in 1995, it has been used in millions of image files • Originally based on the IPTC Information Interchange Model (IIM) standard, now based on the XMP embedded metadata framework (introduced by Adobe, now an ISO standard). – Most software supports both formats, but we encourage users to use XMP as it has more fields, unlimited field lengths and multi-language support • IPTC Photo Metadata is used by: – All major camera manufacturers: Canon, Nikon, Sony etc – Photo software vendors: Adobe Photoshop and Lightroom, PhotoMechanic, GIMP etc – Digital Asset Management systems: FotoWare, Extensis and many more – Google image search results
  • 10. IPTC Photo Metadata and Google Images • In 2018, IPTC, CEPIC and DMLA worked with Google to display embedded IPTC Credit, Copyright and Creator fields in Google Images search results
  • 11. IPTC Photo Metadata drives the Google Images Licensable badge • In 2020, Google started using IPTC Photo Metadata to link “Licensable Images” to a place where the image can be purchased or licensed • A “licensable” badge is displayed in the search results grid • Links from Google search results go directly to a page where users can license the image • Works even when images are used on third-party sites – The badge isn’t shown on the thumbnail page, but “get this image on” and “view licence details” in the side panel will work
  • 12. Video Metadata Hub • IPTC’s Video Metadata Hub is a universal metadata schema developed with the key goal of storing and exchanging metadata in a safe and reliable way. • The Video Metadata Hub is a single set of video metadata properties based on the properties in multiple technical formats, joining metadata across QuickTime, XMP, MPEG4, MPEG7, MXF, EBUCore, PBCore and NewsML-G2 plus manufacturer-specific metadata formats such as Sony XDCAM and Panasonic/SMPTE P2. • These properties can be used for describing the visible and audible content, rights and licensing information, administrative details and technical characteristics of a video in a system-independent way.
  • 13. Video Metadata Hub: What problem does it solve? • Metadata is stored in different ways in existing video formats. For video editors it is very hard to move video metadata between systems, and the semantics are not always clear. • For example, location information can be described in all of the ways listed in the table below • IPTC Video Metadata Hub describes two fields, Location Shot and Location Shown, with very clearly defined semantics. These fields are mapped to fields of existing video formats, so metadata can be moved between systems from different vendors. • Table (format: location field): QuickTime: location with role=0; MPEG7: creation/location; Sony XDCAM: PlanningMetadata/Properties/Meta[@name="Location"]; schema.org: locationCreated; SMPTE P2: ClipMetadata/Shoot/Location; EBUCore: coverage/spatial/location + typeLink="ivqu:locationShot", and coverage/spatial/location + typeLink="ivqu:locationShown" OR subject + typeLink="ivqu:locationShown"/subjectCode + subjectDefinition
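The table on this slide can be held as a small lookup structure. The sketch below is only an illustration: the field paths are copied from the slide, while the dictionary name `NATIVE_LOCATION_FIELDS` and the helper `native_location_fields` are hypothetical. The slide does not say which non-EBUCore field corresponds to Location Shot and which to Location Shown, so the sketch does not guess.

```python
from typing import Dict, List

# Native location fields per format, copied from the slide's table.
# The structure and names are illustrative and not part of any IPTC deliverable.
NATIVE_LOCATION_FIELDS: Dict[str, List[str]] = {
    "QuickTime": ["location with role=0"],
    "MPEG7": ["creation/location"],
    "Sony XDCAM": ['PlanningMetadata/Properties/Meta[@name="Location"]'],
    "schema.org": ["locationCreated"],
    "SMPTE P2": ["ClipMetadata/Shoot/Location"],
    "EBUCore": [
        'coverage/spatial/location + typeLink="ivqu:locationShot"',
        'coverage/spatial/location + typeLink="ivqu:locationShown"',
    ],
}


def native_location_fields(fmt: str) -> List[str]:
    """Return the location field(s) listed for a format (case-insensitive name match)."""
    for name, fields in NATIVE_LOCATION_FIELDS.items():
        if name.lower() == fmt.lower():
            return fields
    raise KeyError(f"no location mapping recorded for format {fmt!r}")


print(native_location_fields("smpte p2"))
```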
  • 14. IPTC News Exchange formats - summary • IPTC 7901: Binary news exchange format created in 1979 and still used by many news agencies. No ongoing work. • NewsML 1: First-generation multimedia news exchange format, created in 2001 when XML was new. Still used by some agencies and publishers. • NITF: News Industry Text Format, a generic XML article markup that is still widely used (although some agencies distribute articles in XHTML). • NewsML-G2: Re-thought version of NewsML incorporating EventsML-G2 and SportsML-G2. New releases generally twice a year. Currently at v2.28 (November 2018). • rNews: Semantic news markup format, incorporated into schema.org (led by Google). • ninjs: News in JSON. A simpler schema intended for APIs, JSON databases (e.g. Elasticsearch, MongoDB) and lighter-weight use cases. • Photo by Mr Cup / Fabien Barral on Unsplash
  • 15. DPP Metadata Exchange for News, using NewsML-G2 [workflow diagram: proxy (MP4), planning metadata, Metadata Watch Services, XML Creation Services] • 1: Connect to XDCAM Air via API calls • 2: Check every 30 seconds for new assets • 3: Query asset metadata via the XDCAM Air API • 4: Pass asset metadata to the G2 XML creation services • 5: Get the MP4 from the XDCAM Air S3 bucket • 6: Send NewsML-G2 XML and MP4 to the AWS S3 bucket for Ooyala • 7: Ooyala services fetch the G2 XML and MP4 • 8: Send NewsML-G2 XML and MP4 to the AWS S3 bucket for Reuters • 9: Reuters Connect fetches the G2 XML and MP4
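The numbered steps of this workflow can be sketched as a single polling pass. Every function name and parameter below is hypothetical: the slide does not show the XDCAM Air API, the XML creation service or the S3 upload calls, so they are passed in as plain callables and replaced with stand-ins in the usage example.

```python
import time
from typing import Callable, Dict, Iterable, List


def process_new_assets(
    list_new_assets: Callable[[], Iterable[str]],
    get_metadata: Callable[[str], Dict],
    build_g2_xml: Callable[[Dict], str],
    get_proxy: Callable[[str], bytes],
    deliver: Callable[[str, str, str, bytes], None],
    destinations: List[str],
) -> List[str]:
    """Run one pass over newly arrived assets and return the ids that were delivered."""
    delivered = []
    for asset_id in list_new_assets():        # step 2: look for new assets
        metadata = get_metadata(asset_id)     # step 3: query asset metadata
        xml = build_g2_xml(metadata)          # step 4: create the NewsML-G2 XML
        mp4 = get_proxy(asset_id)             # step 5: fetch the MP4 proxy
        for destination in destinations:      # steps 6 and 8: one drop per partner bucket
            deliver(destination, asset_id, xml, mp4)
        delivered.append(asset_id)
    return delivered


def watch(run_once: Callable[[], List[str]], interval_seconds: float = 30.0) -> None:
    """Poll forever; the 30-second default mirrors step 2 on the slide."""
    while True:
        run_once()
        time.sleep(interval_seconds)


# Usage with stand-in callables (no real services are contacted).
log = []
done = process_new_assets(
    list_new_assets=lambda: ["a1", "a2"],
    get_metadata=lambda asset_id: {"id": asset_id},
    build_g2_xml=lambda md: "<placeholder id='%s'/>" % md["id"],
    get_proxy=lambda asset_id: b"mp4-" + asset_id.encode(),
    deliver=lambda dest, asset_id, xml, mp4: log.append((dest, asset_id)),
    destinations=["bucket-one", "bucket-two"],
)
print(done, log)
```

Passing the services in as callables keeps the sketch runnable without credentials and makes the ordering of the steps easy to test.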
  • 17. RightsML: IPTC's Rights Expression Language for the media industry • In a modern media organisation, content comes from many sources with many combinations of permissions and restrictions • The task of deciding which assets can be used on which properties is complicated, error-prone and manually intensive • Mistakes can cost organisations millions in claimed royalties and legal fees • The solution: Machine Readable Rights, which declare in markup a summary of what is usually contained in a legal contract • This ensures that both rights holders and clients can be confident that the use of digital assets complies with the appropriate rights and restrictions • Allows publishers to quickly decide which assets can be used in which contexts, or to flag assets for manual checking • Example restrictions: “Embargoed until XXX”, “China out”, “Web only”
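To illustrate what a machine-readable check of the three example restrictions might look like, here is a minimal sketch. It is not RightsML or ODRL syntax: the `Rights` class, its fields and `check_use` are hypothetical stand-ins for data that a real system would parse from a rights expression.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional, Set, Tuple


@dataclass
class Rights:
    """Simplified, hypothetical rights record for one asset."""
    embargo_until: Optional[datetime] = None                   # "Embargoed until XXX"
    excluded_regions: Set[str] = field(default_factory=set)    # e.g. {"CN"} for "China out"
    allowed_channels: Optional[Set[str]] = None                # e.g. {"web"}; None means any channel


def check_use(rights: Rights, when: datetime, region: str, channel: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a proposed use of the asset."""
    if rights.embargo_until is not None and when < rights.embargo_until:
        return False, "embargoed"
    if region in rights.excluded_regions:
        return False, "region excluded"
    if rights.allowed_channels is not None and channel not in rights.allowed_channels:
        return False, "channel not permitted"
    return True, "ok"


photo = Rights(
    embargo_until=datetime(2021, 6, 1, tzinfo=timezone.utc),
    excluded_regions={"CN"},
    allowed_channels={"web"},
)
after_embargo = datetime(2021, 6, 2, tzinfo=timezone.utc)
print(check_use(photo, after_embargo, "GB", "web"))
print(check_use(photo, after_embargo, "GB", "print"))
```

A refusal reason such as "channel not permitted" is the kind of signal a publisher could use to route an asset to manual checking instead of rejecting it outright.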
  • 18. What does IPTC do? Events • We run two face-to-face member meetings each year, where we give updates on new standards and members share best practice and collaborate on projects – Recent meetings: • Athens, Greece and Toronto, Canada in 2018 • Lisbon, Portugal and Ljubljana, Slovenia in 2019 • Online meetings in 2020 and 2021 – Upcoming meetings: • Online, 18–20 October 2021 • Hopefully Tallinn, Estonia and New York City in 2022 • We run the IPTC Photo Metadata Conference in association with CEPIC, the photo industry association – Berlin in 2018, Paris in 2019, online in 2020 (with over 200 registrants!) – 2021 meeting: online in Autumn • Photo: Diego Ceccarelli, Bloomberg, at the IPTC Autumn Meeting 2018, Toronto, Canada
  • 19. What does IPTC do? Software development • We run development projects related to our work – The IPTC EXTRA open source rules-based text classifier – The GetPMD photo metadata checker – The PhotoMetadata Explorer (available to members only) to understand how publishers use IPTC Photo Metadata – Reference implementations of RightsML parsers – SportsML parser and SportsJS converter • We work with open source software developers to support their correct implementation of IPTC standards – Phil Harvey’s ExifTool – Sourcefabric’s Superdesk newsroom software – ffmpeg, mediainfo
  • 20. What does IPTC do? Partnerships and advocacy • We provide advice to other groups and work with other associations on standards – W3C • Joint work on Open Digital Rights Language (ODRL) • RightsML is a “profile” of ODRL – Schema.org – JPEG / ISO – Content Authenticity Initiative – Trust Project, Journalism Trust Initiative – CIPA (creators of EXIF) • We also work with companies on implementing and integrating our standards into their products – Adobe – Google – Camera makers, newsroom software vendors, sports stats systems, etc…
  • 21. Overview What is IPTC? What do we do? What are IPTC NewsCodes? How do we maintain the IPTC NewsCodes? How can IPTC NewsCodes be used? The future of IPTC NewsCodes
  • 22. IPTC NewsCodes: concept taxonomies for news
  • 23. IPTC subject classification systems: a quick history • IPTC originally created the Subject Reference System (SRS) in 1995, in association with the Newspaper Association of America (NAA) – Defines many vocabularies such as Subject Qualifier, Genre, Media Type and Gender, but the main one is Subject, aka Subject Code – All are available in an XML “TopicSet” format (part of NewsML 1) – Subject contains 17 top-level Subjects, each with three-letter alphabetic and 8-digit numeric references across three levels – Subject Codes are/were available in English, French, German, Italian, Japanese, and Spanish – Used within the IPTC Photo Metadata Standard as “IIM 2:12 Subject Reference” (called “Subject Code” in the latest IPTC Photo Metadata Standard) and still used in the Photoshop IPTC metadata editor interface… – Problems: too-strict hierarchy, unbalanced granularity, hard to edit. Couldn’t accommodate requests from users. • Image: an excerpt from the SRS documentation
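The 8-digit numeric references lend themselves to a simple split by level. The sketch below assumes the commonly documented layout of 2 digits for the Subject, 3 for the second level and 3 for the third, with unused levels filled with zeros. That layout is not stated on the slide, and the function names are hypothetical.

```python
from typing import NamedTuple, Optional


class SubjectCode(NamedTuple):
    subject: str          # first 2 digits: top-level Subject
    subject_matter: str   # next 3 digits: second level ("000" if unused)
    subject_detail: str   # last 3 digits: third level ("000" if unused)


def parse_subject_code(code: str) -> SubjectCode:
    """Split an 8-digit Subject Code, assuming a 2+3+3 digit layout."""
    if len(code) != 8 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected 8 digits, got {code!r}")
    return SubjectCode(code[:2], code[2:5], code[5:])


def level(code: str) -> int:
    """Return 1, 2 or 3 depending on the deepest non-zero part of the code."""
    parts = parse_subject_code(code)
    if parts.subject_detail != "000":
        return 3
    if parts.subject_matter != "000":
        return 2
    return 1


def parent(code: str) -> Optional[str]:
    """Return the code one level up, or None for a top-level Subject."""
    depth = level(code)
    if depth == 3:
        return code[:5] + "000"
    if depth == 2:
        return code[:2] + "000000"
    return None


print(parse_subject_code("15054000"), level("15054000"), parent("15054000"))
```

Because the hierarchy is baked into the digits, a term cannot move or gain a second parent without changing its code, which is one way to read the "too-strict hierarchy" problem noted above.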
  • 24. This led to Media Topics • So in 2010, the Media Topics vocabulary was born • It contains around 1,200 terms for categorising news content • Terms and definitions in 12 languages/variants (so far): – Arabic, Simplified Chinese, Danish, English (British and US), French, German, Norwegian, Portuguese (for Portugal and Brazil), Spanish, Swedish
  • 25. Media Topics: IPTC’s main subject vocabulary • We had many requests for new terms in the Subject Codes vocabulary that we couldn’t accommodate due to its strict hierarchy • So in 2010, the Media Topics vocabulary was born • It contains around 1,200 terms for categorising news content • Terms and definitions in 12 languages/variants (so far): – Arabic, Simplified Chinese, Danish, English (British and US), French, German, Norwegian, Portuguese (for Portugal and Brazil), Spanish, Swedish • Also 200+ controlled vocabularies for news content – Audio Codec, Colorspace, Contributor Role, News Provider
  • 26. Human-readable views https://cv.iptc.org/newscodes/mediatopic/
  • 27. Human-readable views https://show.newscodes.org/
  • 28. Human-readable views https://www.iptc.org/std/NewsCodes/treeview/
  • 29. Human-readable views https://www.iptc.org/std/NewsCodes/IPTC-MediaTopic-NewsCodes.xlsx
  • 30. Machine-readable views • All NewsCodes are maintained as NewsML-G2 Knowledge Items, mapped to W3C SKOS • Available as: – NewsML-G2 Knowledge Items (XML) – SKOS RDF/XML – SKOS RDF/Turtle – SKOS RDF/JSON-LD • Can be obtained from cv.iptc.org via URL query params or HTTP content negotiation • Examples: – https://cv.iptc.org/newscodes/mediatopic/20000008?lang=en-GB&format=json – https://cv.iptc.org/newscodes/genre/?lang=x-all&format=rdfxml – wget -O IPTCscene.ttl http://cv.iptc.org/newscodes/scene/?format=rdfttl • See guidelines at https://www.iptc.org/std/NewsCodes/guidelines/cv.iptc.org-guidelines.html
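
The query-parameter style shown in the examples above can be sketched in a few lines of Python. This is only an illustration: the helper name `newscodes_url` is hypothetical, and the only parameter values used (`lang=en-GB`, `lang=x-all`, `format=json`, `format=rdfxml`, `format=rdfttl`) are the ones that appear on the slide. Check the cv.iptc.org guidelines for the full list of supported values.

```python
from urllib.parse import urlencode

# Base of the IPTC CV server, as shown in the slide's example URLs.
CV_BASE = "https://cv.iptc.org/newscodes/"


def newscodes_url(scheme, code="", lang=None, fmt=None):
    """Build a URL for a whole vocabulary (code="") or a single concept.

    `scheme` is the vocabulary name (e.g. "mediatopic", "genre", "scene").
    `lang` and `fmt` are passed through as the `lang` and `format`
    query parameters used in the slide's examples.
    """
    url = f"{CV_BASE}{scheme}/{code}"
    params = {}
    if lang:
        params["lang"] = lang
    if fmt:
        params["format"] = fmt
    return f"{url}?{urlencode(params)}" if params else url


if __name__ == "__main__":
    print(newscodes_url("mediatopic", "20000008", lang="en-GB", fmt="json"))
    print(newscodes_url("genre", lang="x-all", fmt="rdfxml"))
    print(newscodes_url("scene", fmt="rdfttl"))
```

The resulting URLs can be passed to any HTTP client (for example `urllib.request.urlopen`) to download the vocabulary in the requested serialisation.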
  • 31. More than just subjects/topics • CVs for Audio Codec, Colorspace, Contributor Role, Financial Instrument Type, Genre, News Provider, Trust Indicator, Video Scaling, etc… • Most are used in NewsML-G2 files for conveying information about news/media content • We keep references to third-party codes such as ISO Country Codes, BCP47 Language tags etc • All NewsCodes are accessible from https://cv.iptc.org/newscodes/
  • 32. More than just subjects/topics (2) • We have dozens of vocabularies for sports, used by SportsML, SportsJS and a possible SportsRDF standard: – Event outcome (win, loss, tie/draw, undecided) – Professional status (amateur/professional) – Core statistics (score, attempts on goal, possession time, …) – American Football Penalty types (illegal pass, roughing, illegal motion, …) – Basketball Player Roles (shooter, blocker, assist, …) – ... And MANY more
  • 33. Overview What is IPTC? What do we do? What are IPTC NewsCodes? How do we maintain the IPTC NewsCodes? How can IPTC NewsCodes be used? The future of IPTC NewsCodes
  • 34. The IPTC NewsCodes Working Group • Group Objective: The maintenance of IPTC metadata controlled vocabularies for categorizing content – branded as NewsCodes – and the creation of new ones when required • Main focus is usually on Media Topics but the group also maintains the other NewsCodes • Work on Sports NewsCodes is done jointly with the IPTC Sports Content Working Group • Also functions as a collaboration/discussion group for taxonomists from media organisations Recent work: • Media Topics additions, edits, definition reviews • Genre Review • Media Topics → Wikidata Mapping • Adding and maintaining translations for MediaTopics • Entities discussions
  • 35. Criteria for adding new terms • Balance with granularity in other branches of the Media Topics taxonomy: Will adding the proposed Media Topic(s) keep the level of granularity fairly even across the vocabulary? • Threshold of coverage: Is the proposed Topic likely to be covered with reliably regular or seasonal frequency by participating member media organisations? • Not covered by another term or combination of terms: Does the proposed Topic have distinct semantics and scope from all existing Media Topics, or combination of Media Topics? • Supported and/or validated by other general taxonomies: Does the proposed Topic have representation in external widely available general subject taxonomies, such as Wikidata, Library of Congress etc.? • Does not belong to any highly specialised taxonomies: The proposed Topic should not have representation in specialised domain taxonomies such as Medical Subject Headings (MeSH). https://iptc.org/std/NewsCodes/guidelines/#_when_should_a_term_be_added_to_media_topics
  • 36. Examples of recent changes to MediaTopics Recently retired terms: • sports facilities – Use sport venue instead • inline skating – Use roller sports instead • accomplishment – Use award and prize or record and achievement instead • people – Use more specific terms instead Recent hierarchy changes: • dropped criminal investigation is now a child of investigation (criminal) • bodybuilding is now a child of competition discipline • award and prize and record and achievement were moved to the top level of human interest because we retired the parent term accomplishment • birthday, celebrity and high society were moved to under the top level human interest term because we retired the parent term people. Recently added terms: • men • women • women’s rights • animal abuse • border disputes • cheerleading • child care • disinformation and misinformation Recent label changes: • online media -> online media outlet • traffic crime -> reckless driving • trade fair -> trade show or expo • online media -> online media industry • parliament -> legislative body • synchronised swimming -> artistic swimming (as per IOC changes for 2020 Olympics)
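
For consumers of the vocabulary, changes like these usually mean re-mapping existing tags. The following is a rough sketch of one way to apply the retirements and label changes listed on this slide. It keys on labels only because the slide lists labels; a production system would more likely key on concept IDs. The function name `migrate_labels` and the review-queue idea are assumptions, not part of any IPTC tooling. The retired term "people" and the two "online media" renames are left out because the slide gives no single replacement for them.

```python
# Retired terms and their suggested replacements, copied from the slide.
RETIRED = {
    "sports facilities": ["sport venue"],
    "inline skating": ["roller sports"],
    "accomplishment": ["award and prize", "record and achievement"],
}

# Straight label changes, copied from the slide.
RENAMED = {
    "traffic crime": "reckless driving",
    "trade fair": "trade show or expo",
    "parliament": "legislative body",
    "synchronised swimming": "artistic swimming",
}


def migrate_labels(labels):
    """Return (updated_labels, needs_review).

    Labels with exactly one replacement are swapped automatically.
    Retired labels with several candidate replacements are not guessed at;
    they are returned in `needs_review` as (old_label, candidates) pairs.
    """
    updated, needs_review = [], []
    for label in labels:
        if label in RENAMED:
            updated.append(RENAMED[label])
        elif label in RETIRED:
            candidates = RETIRED[label]
            if len(candidates) == 1:
                updated.append(candidates[0])
            else:
                needs_review.append((label, candidates))
        else:
            updated.append(label)
    return updated, needs_review


if __name__ == "__main__":
    tags = ["parliament", "inline skating", "accomplishment", "cheerleading"]
    print(migrate_labels(tags))
```

Ambiguous retirements are routed to a human because choosing between "award and prize" and "record and achievement" depends on the story itself.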
  • 37. How we work – GitHub, Zoom, groups.io • New terms are suggested by IPTC members via Slack, email, GitHub or the groups.io list, or by external users on our public discussion list iptc-newscodes@groups.io • We track, prioritise and classify suggestions using GitHub issues • We discuss suggestions on GitHub issues and at our bi-weekly Zoom calls • When agreed, we make changes to the XML files, re-generate the public assets and upload them to the site • We also regularly re-visit existing branches to update labels, definitions and hierarchy • Suggestions can take from 2 weeks to a few months to be adopted, depending on urgency and time to next release
  • 38. How we work – translations • Currently we have a fairly manual workflow for dealing with language translations • We maintain terms in en-GB and en-US, and then send updates to translation partners who submit their work in Excel spreadsheets • We can then merge changes directly from spreadsheets into our XML files using a custom script Future translations • We’re considering using a translation platform to handle this workflow - https://crowdin.com/? https://lokalise.com/? • We could then integrate translation with our GitHub repository
  • 39. Staying up to date • We generally release updates every two to three months • New changes are announced on: – iptc.org/news as a blog post – iptc-newscodes@groups.io – Twitter @iptcupdates – Internal Slack and email discussion lists for IPTC members • There’s no obligation to take every update straight away; it’s up to you
  • 40. Overview What is IPTC? What do we do? What are IPTC NewsCodes? How do we maintain the IPTC NewsCodes? How can IPTC NewsCodes be used? The future of IPTC NewsCodes
  • 41. Classification of news content • Some users of Media Topics: – Swedish news agency TT and their customers – Danish news agency Ritzau – Agence France-Presse (AFP) – dpa (Deutsche Presse-Agentur) – NTB Norway – ABC Australia • Many of these use MediaTopics internally or in agency-to-publisher feeds, not available to the public • Content Management Systems • Auto-tagging and language analysis tools (e.g. iMatrics, expert.ai) • Media Monitoring services (e.g. http://www.pressmonitor.ca/)
  • 42. News Credibility and Mis/disinformation • We don’t want to make yet another standard. We want to work with existing groups to ensure that their work can be applied to our standards, and will extend ours where necessary • We worked with industry groups (such as the Trust Project and Journalism Trust Initiative / Reporters Sans Frontières) to help them to: – define journalistic credibility indicators – publish indicators in a useful way – map indicators to our existing news metadata standards • We help fact-checking groups to best use our standards and technologies to make fact-checking easier and more scalable
  • 43. NewsCodes to support Trust & Credibility • We added a new IPTC NewsCodes trustindicator vocabulary • New terms in IPTC NewsCodes genre vocabulary • The vocabularies can be used in NewsML-G2 and ninjs documents (or anywhere else) • Guideline document: “Expressing Trust and Credibility Information in IPTC Standards” (currently in late draft form)
  • 44. Trust indicators used to enhance news for users • Agencies can distribute news marked up with Trust Indicators using our CVs • Publishers can pass on the indicators in schema.org markup • Platforms can use the schema.org markup to flag credibility to users • Users can use tools such as NewsGuard to interpret and analyse credibility markup
  • 45. Extending MediaTopics for your own use • We often receive requests from media organisations to add terms that would be too specific for us to add to the general MediaTopics vocabulary… • … but that shouldn’t stop you from adding terms yourself! • For example, our CV stops at the term “animal” (medtop:20000500). To add your own animal terms, you can “mint” your own IDs within their own namespace. • Create a new term and use the skos:broader relationship to show that it’s a child of the MediaTopic term • You can then import your new terms into your database without any fear of falling out of sync with future MediaTopics updates Example hierarchy: – medtop:08000000 human interest: Item that discusses individuals, groups, animals, plants or other objects in an emotional way – medtop:20000500 animal (child of human interest): Human interest and cultural stories related to animals, including pets – mycv:12345 cat (new child of animal): Small domesticated carnivorous mammal with soft fur – mycv:23456 lion (new child of animal): Species of big cat often found in Africa and Asia – mycv:34567 penguin (new child of animal): Large flightless seabird of the southern hemisphere – medtop:20001237 anniversary: The celebration or commemoration of …
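
As a minimal sketch of the approach described on this slide, the snippet below writes locally minted terms as SKOS in Turtle, each pointing at its MediaTopic parent with skos:broader. Several things here are assumptions made for illustration: the `mycv` namespace URI (`https://example.org/mycv/`) is a placeholder, the `medtop` namespace URI is assumed to be the cv.iptc.org mediatopic URL in its `http://` form, and the helper names are hypothetical. The IDs, labels and definitions come from the slide's example.

```python
# Assumed namespace for Media Topics concepts (based on the cv.iptc.org URLs).
MEDTOP_NS = "http://cv.iptc.org/newscodes/mediatopic/"
# Placeholder namespace for locally minted terms; replace with your own.
MYCV_NS = "https://example.org/mycv/"


def local_term_turtle(local_id, label, definition, parent_medtop_id, lang="en"):
    """Serialise one local concept as Turtle, linked to its MediaTopic parent.

    Assumes `label` and `definition` contain no double quotes or backslashes;
    a real implementation should escape them or use an RDF library.
    """
    return (
        f"mycv:{local_id} a skos:Concept ;\n"
        f'    skos:prefLabel "{label}"@{lang} ;\n'
        f'    skos:definition "{definition}"@{lang} ;\n'
        f"    skos:broader medtop:{parent_medtop_id} .\n"
    )


def vocabulary_turtle(terms):
    """Serialise (local_id, label, definition, parent_id) tuples as one document."""
    header = (
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
        f"@prefix medtop: <{MEDTOP_NS}> .\n"
        f"@prefix mycv: <{MYCV_NS}> .\n\n"
    )
    return header + "\n".join(local_term_turtle(*term) for term in terms)


# The three example terms from the slide, all children of medtop:20000500 (animal).
ANIMAL_TERMS = [
    ("12345", "cat", "Small domesticated carnivorous mammal with soft fur", "20000500"),
    ("23456", "lion", "Species of big cat often found in Africa and Asia", "20000500"),
    ("34567", "penguin", "Large flightless seabird of the southern hemisphere", "20000500"),
]

if __name__ == "__main__":
    print(vocabulary_turtle(ANIMAL_TERMS))
```

Because the local terms live in their own namespace and only reference MediaTopics through skos:broader, a later MediaTopics release can be imported without touching them.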
  • 46. Media Topics → Wikidata Mapping • In the past we had mapped the top 2 levels of MediaTopics to Wikidata • The project to map all the MediaTopics arose from an IPTC member’s need to have links from Wikidata back to MediaTopics because they are building a classification tool that will use Wikidata IDs. • We now have mappings for 93% of MediaTopics • Some broader terms couldn’t be mapped, so we plan to create those terms in Wikidata
  • 47. Overview What is IPTC? What do we do? What are IPTC NewsCodes? How do we maintain the IPTC NewsCodes? How can IPTC NewsCodes be used? The future of IPTC NewsCodes
  • 48. Wikidata → Media Topics Mapping • We map from MediaTopics to Wikidata, but not the other way around • Wikidata has a property “IPTC NewsCode” which maps Wikidata entities back to MediaTopics • This has been updated ad hoc by Wikidata users and already has over 700 mappings • We plan to update these automatically with a script and continuously maintain them as we continue our work • Coming later in 2021
  • 49. Mapping to other CVs via Wikidata https://cv.iptc.org/newscodes/mediatopic/20000285 https://www.wikidata.org/wiki/Q253613 https://www.bbc.com/news/topics/c8nq32jw8vjt/personal-finance https://id.loc.gov/authorities/subjects/sh85048263.html
  • 50. Media Topics for contextual advertising? • We have had discussions with several publishers who are considering using Media Topics for contextual advertising • As more users opt out of behavioural tracking cookies, contextual advertising (i.e. advertising based on the content of the page rather than the behaviour of the user) is a promising alternative • But to do this, advertisers need a standardised vocabulary to target against • IAB taxonomy is very advertiser- and product-oriented, not really suitable for news content • Could MediaTopics provide a solution? https://wan-ifra.org/2021/05/emotion-and-intent-based-contextual-advertising-emerging-as-promising-alternatives-to-third-party-cookies/
  • 51. Future work: Named entities? • Many members have asked whether we will support named entities (people, places, organisations, events) • News by its nature needs new entities all the time • New names are typically added to Wikidata 1-2 months after they emerge (but this is improving) • If news organisations all add a new entity as part of their workflow, we could end up with dozens of duplicates • Going back to edit previously tagged stories to fix duplicates is a hassle • In discussions, we agreed that IPTC doesn’t have the resources to build and maintain a news entity database for the world • But perhaps a “best practices guide” would be
  • 52. Summary: NewsCodes and MediaTopics Resources • Info on NewsCodes • Info on MediaTopics • IPTC NewsCodes Guidelines • Human-readable views of MediaTopics: – CV Server – Excel – HTML tree view – Interactive tree view • Guidelines for machine-readable CVs • Public discussion group on groups.io • For IPTC members: – GitHub repository – Members-only discussion group – Slack channel
  • 53. Thank you! Brendan Quinn, Managing Director, IPTC, mdirector@iptc.org, @brendanquinn on Twitter • Jennifer Parrucci, Senior Taxonomist, The New York Times, jennifer.parrucci@nytimes.com
  • 54. Appendices
  • 55. More info on Trust Indicators
  • 56. Ideal “B2B2C” workflow for trust indicators
  • 57. Mappings between Trust Project, JTI and IPTC standards Mappings are generally direct to NewsML-G2 tags or “typed links” – a URL with a “qualifier” showing what kind of link is being described.
  • 58. IPTC Trust Indicator taxonomy - Hierarchy • Editorial policy – Corrections policy, coverage policy, editorial diversity, ethics, feedback, no-bylines, unnamed-sources • Party Info – Archive, award, knows about, knows language – Journalist info • Biography, journalist image – Organisation info • Diversity policy, diversity staffing report, ownership and funding info, senior editorial staff (masthead) • Piece of work info – Corrects resource / corrected by resource – External / internal source materials, secondary Browsable web view
  • 59. Wikidata RDF and SPARQL interface • Wikidata exposes interfaces in RDF and SPARQL • Queries can inspect Wikidata entities by type or property: # People that received both Academy Award and Nobel Prize SELECT DISTINCT ?person ?personLabel WHERE { ?person wdt:P166/wdt:P31? wd:Q7191 . ?person wdt:P166/wdt:P31? wd:Q19020 . SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . } } • “wdt:P166” is “award received” • “/wdt:P31?” optionally follows “instance of” (wdt:P31), so either the award itself or the class it is an instance of can match • “wd:Q7191” is “Nobel Prize” • “wd:Q19020” is “Academy Awards”
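
To try the query from this slide outside the Wikidata web interface, it can be sent to the public Wikidata Query Service endpoint. The sketch below only builds the request URL, so it runs without network access. The helper name `sparql_get_url` is hypothetical, and the query text is the slide's query laid out over several lines.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# The slide's query, one clause per line. The wd:, wdt:, wikibase: and bd:
# prefixes are predefined by the Wikidata Query Service.
QUERY = """\
# People that received both Academy Award and Nobel Prize
SELECT DISTINCT ?person ?personLabel WHERE {
  ?person wdt:P166/wdt:P31? wd:Q7191 .
  ?person wdt:P166/wdt:P31? wd:Q19020 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
"""


def sparql_get_url(query, endpoint=WDQS_ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results."""
    return f"{endpoint}?{urlencode({'query': query, 'format': 'json'})}"


if __name__ == "__main__":
    print(sparql_get_url(QUERY))
```

To actually run it, pass the URL to an HTTP client and send a descriptive User-Agent header, since public endpoints commonly throttle or reject anonymous scripted traffic.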

Editor's notes

  1. Photo (and Video) Metadata Used since 1995 to store descriptive, administrative and rights metadata embedded in image files Originally based on IPTC IIM standard, now based on XMP in-file metadata platform Video Metadata Hub Canonical mapping between video metadata standards: IPTC Photo/Video, EBUCore, …. News Codes, Media Topics, Subject Codes Hierarchical taxonomies of subjects for describing news content NewsML-G2 and NewsML-1 Exchange of packages of text, images, video, audio, events and/or sports data NewsML-G2 is newer, extensible version but NewsML-1 is still maintained RightsML ninjs, rNews News markup in JavaScript, microformat version of news markup SportsML-G2 EventsML-G2 Historical standards NewsML-1 IPTC7901 IIM format
  2. Schematic diagram of what was demoed on February 6th, 2019 by vendors working together using NewsML-G2 as a basis for production of news content.