Bertelsmann: BeTrend – Building a Trend Aggregation Machine.pdf

  1. BeTrend – Building a Trend Aggregation Machine (March 2023)
     Marcus Koring, Bertelsmann – Senior Director Technology & Project Development
     Thorsten Liebig, derivo GmbH – Managing Director
  2. March 23, 2023 · Bertelsmann TECH & DATA | derivo
     Bertelsmann at a Glance
     • Media, Investments, Services, Education
     • Headquarters: Gütersloh
     • Group revenues: €18.7 billion
     • Operating EBITDA: €3,241 million
     • Group profit: €2,310 million
     • Employees: 145,027
  3. Structuring AI use cases along our key value chains
     • Augmented Intelligence
     • AI for Media | AI for Services
     • Content Intelligence | Audience Intelligence | Service Intelligence
     • Monetisation | Monetisation
     • Tech Radar
  4. All of our media companies try to identify trends …
     • BEG: Exam questions generation
     • G+J: Leveraging Chefkoch recipe data
     • PRH - PRH India: AI-based narrative analytics
     • PRH - PRH USA: AI book generation
     • PRH - PRH USA: Intelligent title tagging
     • RTL Group - RTLZWEI: Cast engagement
     • RTL NL: Content-based news recommen.
     • …
     AI for Media: Content Intelligence | Audience Intelligence
     • …
     • BMG: Talent & trend identification
     • G+J: Trend analysis/Topic prediction
     • PRH - PRH UK: Trend identification
     • PRH - PRH USA: Retail search data analysis
     • RTL Group - Fremantle: Find drivers of TV ratings
     • RTL Group - RTL NL: Predict trending topics
     • RTL Group - RTLZWEI: Social media sentiment analysis
     • …
     NLP · Metadata extraction · (Hyper)Personalization · Trend analysis · Pricing | Sales forecast. …
  5. Starting with a first use case
     Use case “TV format ideation”
     Provide semantically enriched social media and search data to help TV format/program ideation by:
     1) identifying TV formats/shows trending elsewhere in the world and
     2) detecting regional meta trends such as more conservative or liberal attitudes
     Trending topics are a key driver for assessing potential content popularity (“popularity prediction”). Editorial teams and creative units spend a lot of time searching for good and relevant stories. AI can help to extract data from various sources such as Twitter, Google Trends, Wikipedia, etc. and to process, combine and analyse these data points to produce a list of trending topics that reveals promising new storylines, i.e., it helps to create a better understanding of the best topics to address.
     • “As an editor of RTL BLVD, I want to know instantly what is happening in The Netherlands and in the world so I can create an item as quickly as possible.”
     • “As Program Coordinator, I want to know which trends and sentiments we see and to what extent they have an impact on scheduling TV programs so we can adjust the schedule.”
     • “As Head of Creative Unit, I want to combine various trends so I can come up with new creative concepts faster than the rest.”
     • “As Head of Content, I want to validate new creative formats.”
  7. Disambiguation
     • Which “Meghan” is trending?
     • Why is “Meghan” trending? What is the context?
     • Has the same “Meghan” trended before and why?
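The last question on this slide only has an answer once each trend word is linked to an entity. As a minimal sketch (not the BeTrend implementation), the following groups observations by a linked entity id rather than by the surface string "Meghan". The entity ids, context labels, and the `history_by_entity` helper are hypothetical placeholders invented for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical observations: (day, surface form, linked entity id, context label).
# The entity ids are placeholders, not real knowledge-base identifiers.
OBSERVATIONS = [
    (date(2022, 12, 8), "Meghan", "entity:duchess", "documentary"),
    (date(2022, 12, 1), "Meghan", "entity:duchess", "royal family"),
    (date(2022, 12, 8), "Meghan", "entity:singer", "music"),
]


def history_by_entity(observations):
    """Group trend observations by linked entity instead of by surface string."""
    history = defaultdict(list)
    for day, _surface, entity_id, context in observations:
        history[entity_id].append((day, context))
    # Sort each entity's history chronologically (tuples sort by day first).
    return {entity_id: sorted(items) for entity_id, items in history.items()}


history = history_by_entity(OBSERVATIONS)
for entity_id, items in history.items():
    print(entity_id, items)
```

Keyed this way, "has the same Meghan trended before?" becomes a lookup on one entity's history, and the stored context labels give a first hint at "why".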
  8. The Journey
     First Step
     • Took over the great initial prework from RTL NL
     Second Step
     • Started to gather first data
     • Entity linking based on wiki
     • Started the legal assessment and market research
     Third Step
     • Talked to several colleagues in the Data RTL Mediengruppe, PRH, RTL II
     • Added Elasticsearch to search the database
     Fourth Step
     • Built up a data pipeline
     • Publishing the API
     Fifth Step
     • API was way too slow
     • Switched to Graph DB
     • Rebuilding Data Pipeline
  9. Data pipeline: Collecting & Preparing Trends from Social Media to make them Graph-ready
     Data Acquisition + … → Entity Disambiguation → Trend Scoring → JSON Aggregation & Creation → Graph
     • Tools – Model & Cleaning + … + …
     • Tools – Google & TextRazor
     • Tools – Scoring
  10. From Tables to the BeTrend Graph & Applications
      Data Acquisition → Entity Disambiguation → Trend Scoring → JSON Aggregation & Creation → Graph
      Power User / Developer · End User
  11. Data Aggregation and Enrichment
      Data Acquisition → Entity Disambiguation → Trend Scoring → JSON Aggregation & Creation → Graph
      • Temporal normalization (one JSON per day)
      • Entity recognition via external NLP service
      • Trend word classification (Wikidata, Freebase, IPTC, etc.)
      • Gathering of related topics
      Used technology:
      • Azure Table Storage
      • SQL
      • Python
      • External services via REST
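The "one JSON per day" normalization can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline's actual code: the record fields (`trendword`, `country`, `seen_at`), the sample values, and the choice to assign each record to its UTC calendar day are all hypothetical.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical raw records; field names and values are assumptions.
RAW = [
    {"trendword": "Meghan", "country": "germany", "seen_at": "2022-12-01T08:15:00+00:00"},
    {"trendword": "Meghan", "country": "france", "seen_at": "2022-12-01T21:40:00+00:00"},
    # 00:05 at UTC+1 is still 23:05 on 2022-12-01 in UTC.
    {"trendword": "World Cup", "country": "denmark", "seen_at": "2022-12-02T00:05:00+01:00"},
    {"trendword": "World Cup", "country": "denmark", "seen_at": "2022-12-02T10:00:00+00:00"},
]


def normalize_by_day(records):
    """Bucket records by UTC calendar day and emit one JSON document per day."""
    buckets = defaultdict(list)
    for rec in records:
        # Convert every timestamp to UTC before taking the date (assumption).
        ts = datetime.fromisoformat(rec["seen_at"]).astimezone(timezone.utc)
        day = ts.date().isoformat()
        buckets[day].append({"trendword": rec["trendword"], "country": rec["country"]})
    return {
        day: json.dumps({"date": day, "trends": items}, sort_keys=True)
        for day, items in sorted(buckets.items())
    }


daily_docs = normalize_by_day(RAW)
for day, doc in daily_docs.items():
    print(day, doc)
```

Fixing one reference time zone up front keeps a record from landing in different daily files depending on where it was collected.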
  14. Data Import and Graph Model
      Data Acquisition · JSON Aggregation & Creation · Graph Model · Importer · Graph
      Used technology:
      • JSON Schema
      • Drawings
      • Cypher
      • APOC
      Key modelling dimensions:
      - time (when, how often, consecutive?)
      - location (country, continent)
      - context (classification schemata)
      Cypher excerpt:
          UNWIND ['denmark', 'france', 'germany', …] …
          CALL apoc.load.json('./JSON Dumps/2022-12-01.json') YIELD value
          MERGE (i:Crawl …)
          UNWIND … AS tword
          MERGE (tw:Trendword {trendword: tword.trendword})
          …
          CASE WHEN entity.confidence <= 0.5 THEN …
          FOREACH …
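The time dimension asks three things about a trend word: when it trended, how often, and whether the days were consecutive. A minimal Python sketch of those three measures follows. It works on a plain list of dates and is independent of the graph model shown on the slide; the function names and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def longest_streak(days):
    """Return the length of the longest run of consecutive calendar days."""
    best = current = 0
    previous = None
    for day in sorted(set(days)):
        if previous is not None and day - previous == timedelta(days=1):
            current += 1  # extends the current run
        else:
            current = 1  # starts a new run
        best = max(best, current)
        previous = day
    return best


def time_profile(days):
    """Summarise when, how often, and how consecutively a trend word appeared."""
    unique = sorted(set(days))
    if not unique:
        return None
    return {
        "first_seen": unique[0],
        "last_seen": unique[-1],
        "days_trending": len(unique),
        "longest_streak": longest_streak(unique),
    }


# Hypothetical sample: three consecutive days, a gap, then a repeated day.
sample = [date(2022, 12, 1), date(2022, 12, 2), date(2022, 12, 3),
          date(2022, 12, 5), date(2022, 12, 5)]
print(time_profile(sample))
```

In a graph database the same questions would more likely be answered by a query over the stored crawl dates; the sketch only makes the three measures concrete.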
  15. Graph Meta Model
  16. Demo
  17. Lessons Learned & Outlook
      Learnings:
      • Graph technology is quick to use
      • Neo4j on-premise and in the cloud
      • Graph models are flexible + very well suited to reflect the use case
      Next:
      • Professionalizing the data import
      • Improvement of Entity Recognition
      • Refinement of the graph model
      • Graph Data Science Use-Cases
      AI for Media
      • Generative AI
      • Recommendation