2. Language Processing at the core of the Media & Publishing Industries
Multifaceted crisis in the Media & Publishing Industries:
• User-generated content
• Decline in advertising revenues
• Shift in the ways audiences consume content
• Business models
3. An old concern: language quality
Proofreading with Stilus: spelling, grammar and style checking
4. Daedalus: extracting meaning from multilingual & multimedia contents
Semantic Processing: automatic extraction of knowledge items from unstructured content
• Facts
• Topics
• Sentiment
• People
• Organizations
• Concepts
Annotation, Enrichment & Linking
• Extraction of named entities and concepts, classification, clustering
Areas: documentation and advanced content search, SEO positioning
5. Advanced Semantic Analysis at Daedalus
People: Ben Bernanke, Mariano Rajoy…
Companies, organizations: BBVA, Bankia, Goldman Sachs, Coca-Cola, Reserva Federal (Federal Reserve)…
Financial named entities: Ibex35, Dax Xetra…
Places: Londres (London), EE.UU. (USA), París (Paris)…
Concepts: prima de riesgo (risk premium), presidente del Gobierno (President of the Government), intervención parlamentaria (parliamentary address), índice bursátil (stock index), situación económica (economic situation)…
Time references: hoy (today), ayer (yesterday), sobre las 11 de la mañana (around 11 in the morning)…
Money amounts: 104 dólares (104 dollars), 1 euro…
Polarity: positive / neutral / negative
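The kind of annotation listed on this slide can be illustrated with a minimal sketch. It assumes a tiny hand-made gazetteer built from the slide's own examples plus one regular expression for money amounts. The names `GAZETTEER`, `MONEY` and `annotate` are hypothetical, and this is not a description of how the Daedalus tools work internally.

```python
import re

# Hypothetical gazetteer: surface form -> entity type. The entries come from
# the examples on the slide; a real system would use far larger resources.
GAZETTEER = {
    "Ben Bernanke": "PERSON",
    "Mariano Rajoy": "PERSON",
    "Goldman Sachs": "ORGANIZATION",
    "Reserva Federal": "ORGANIZATION",
    "Ibex35": "FINANCIAL",
    "Londres": "PLACE",
    "prima de riesgo": "CONCEPT",
}

# Assumed pattern for money amounts such as "104 dólares" or "1 euro".
MONEY = re.compile(r"\b\d+(?:[.,]\d+)?\s(?:dólares|dólar|euros|euro)\b")


def annotate(text):
    """Return annotations as (start, end, surface, type), sorted by position."""
    found = []
    for surface, etype in GAZETTEER.items():
        for match in re.finditer(re.escape(surface), text):
            found.append((match.start(), match.end(), surface, etype))
    for match in MONEY.finditer(text):
        found.append((match.start(), match.end(), match.group(), "MONEY"))
    return sorted(found)
```

Calling `annotate` on a Spanish sentence returns character offsets together with the matched text and its type, which is the minimum needed to link or highlight entities in the source.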
7. User-generated content: automatic translation
8. User-generated content: automatic moderation
Tool for automatic moderation of social media, blogs, forums, etc.
Filtering of offensive, illegal, inappropriate or objectionable content
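As a rough illustration of content filtering, the sketch below rejects comments that contain terms from a blocklist. The blocklist contents and the names `BLOCKED_TERMS` and `moderate` are assumptions made for this example. A production moderation tool would likely combine several signals rather than rely on word matching alone.

```python
import re

# Hypothetical blocklist; these placeholder terms are assumptions.
BLOCKED_TERMS = {"idiot", "scam"}


def moderate(comment, blocked=BLOCKED_TERMS):
    """Return ("reject", matched_terms) or ("accept", [])."""
    # Lowercase and split into word tokens so matching ignores case
    # and punctuation.
    tokens = re.findall(r"\w+", comment.lower())
    hits = sorted(set(tokens) & set(blocked))
    return ("reject", hits) if hits else ("accept", [])
```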
9. Social Media Analytics: Sentimentalytics
10. Video/audio indexing & search
Contents → Transcription → Indexing → Index → Search
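The indexing and search steps can be sketched as an inverted index over time-stamped transcript segments. The sketch assumes the transcription step has already produced `(media_id, start_seconds, text)` tuples. That input shape and the function names are hypothetical.

```python
import re
from collections import defaultdict


def build_index(segments):
    """Map each lowercase word to the (media_id, start_seconds) positions
    of the transcript segments that contain it."""
    index = defaultdict(list)
    for media_id, start, text in segments:
        for word in set(re.findall(r"\w+", text.lower())):
            index[word].append((media_id, start))
    return index


def search(index, query):
    """Return positions of segments that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)
```

Because each hit carries a start time, a player could jump directly to the moment in the video or audio file where the query terms are spoken.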
11. Automatic Subtitling: transcription, segmentation & synchronization
Transcription → Text → Processing (checking, proofreading, etc.) → Storage
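Segmentation and synchronization can be sketched as follows, assuming the transcription step yields word-level timings as `(word, start, end)` tuples in seconds. The greedy grouping rule, the `max_chars` default of 40 and the choice of SubRip (SRT) as output format are assumptions for illustration, not details taken from the slide.

```python
def fmt(seconds):
    """Format a time in seconds as an SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segment(words, max_chars=40):
    """Greedily group (word, start, end) tuples into cues whose text is at
    most max_chars long. Returns a list of (start, end, text)."""
    cues, current = [], []

    def flush():
        text = " ".join(w for w, _, _ in current)
        cues.append((current[0][1], current[-1][2], text))

    for word in words:
        candidate = " ".join(w for w, _, _ in current + [word])
        if current and len(candidate) > max_chars:
            flush()
            current = []
        current.append(word)
    if current:
        flush()
    return cues


def to_srt(cues):
    """Render cues as SRT text: index, time range, subtitle text."""
    blocks = [
        f"{i}\n{fmt(start)} --> {fmt(end)}\n{text}"
        for i, (start, end, text) in enumerate(cues, 1)
    ]
    return "\n\n".join(blocks) + "\n"
```

Each cue takes its start time from its first word and its end time from its last word, which keeps the subtitle aligned with the audio.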
12. Data Journalism: exploration & analysis of information sources
Look4leaks.net: Wikileaks case
• Automatic translation of 251,000 cables (5 languages)
• Semantic enrichment: entities, classification
• Multifaceted search: by embassy, person, country…
Trial files: Gürtel case (corruption, more than 100,000 pages)
• OCR, fuzzy recognition and multifaceted search
Spanish State of the Nation address
• Semantic analysis and search
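Searching by facets such as embassy, person or country can be sketched over documents that already carry those annotations. The document shape (a dict mapping each facet name to a list of values) and the function names are hypothetical, and the test data uses placeholder values only.

```python
from collections import Counter


def facet_counts(docs, facet):
    """Count how many times each value of a facet occurs across docs."""
    return Counter(value for doc in docs for value in doc.get(facet, []))


def filter_docs(docs, **selected):
    """Keep the docs that contain every selected facet value,
    e.g. filter_docs(docs, country="Country A", person="Person X")."""
    return [
        doc
        for doc in docs
        if all(value in doc.get(facet, []) for facet, value in selected.items())
    ]
```

The counts let an interface show how many documents remain under each facet value, and the filter narrows the result set as the reader selects values.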
13. Transmedia
Content production for simultaneous and coordinated delivery through different channels
Personalized delivery
• Content of interest according to the user profile
• Contextual advertising
E.g.: second-screen apps
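Selecting content of interest for a user profile can be sketched as ranking items by how many of their topics overlap with the user's declared interests. The topic sets are assumed to come from a prior classification step, and `rank_for_user` is a hypothetical name. Real personalization would likely weigh many more signals.

```python
def rank_for_user(items, interests):
    """items: list of (title, set_of_topics); interests: set of topics.
    Return titles sharing at least one topic with the user's interests,
    most overlapping first, with ties broken alphabetically."""
    scored = [(len(topics & interests), title) for title, topics in items]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in ranked if score > 0]
```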
14. New ways of monetizing content
Selling content chunks:
• Chapters, sections of reference books, etc.
Selling content aggregates:
• The full story of a topic, assembled from news published over time
15. DAEDALUS, S.A.
Jose C. Gonzalez
jgonzalez@daedalus.es
http://www.daedalus.es