Sports and Semantic Tech
Paul Kelly
XML Team Solutions
Chair, SportsML Working Party (IPTC)

Spring Meeting, IPTC
Dubai, UAE / 9th March 2011
iptc.org
sportsml.org
Let's Talk About This

• Exploratory, not a didactic presentation
• Purpose
   – gauge interest among members
   – brainstorm
   – guide SWP agenda
• Explore
   –   set of problems
   –   possible solutions
   –   or do we have that backwards?
   –   business cases?




                         © 2010 IPTC (www.iptc.org)   All rights reserved   2
Why Sports?

• easy? a no-brainer?
  – Silver Oliver, BBC
     • "Silver says the BBC has started with sport, because it is simpler. The
       events and the actors taking part in those events are known in
       advance. For example, even this far ahead you know the fixture list,
       venues, teams and probably the majority of the players who are going
       to take part in the 2010 World Cup."
         – http://blogs.journalism.co.uk/editors/2010/02/24/a-history-of-linked-data-at-the-bbc/
  – relationships easy to understand
     • hierarchical
     • sport/league/event/team/player




Sports News Biz

• Business products
  –   team rosters
  –   schedules
  –   pre-event reports (text and statistical)
  –   live updates
  –   post-event reports (text and statistical)
  –   standings/tables
  –   stat reports
  –   injury reports
  –   general news
  –   wagering
  –   multimedia
  –   etc.

What are the issues?

• ID resolution or acquisition
• data availability
• what to capture?
   –   everything rdfable?
   –   permanent metadata
   –   narrative
   –   perishable metadata
• implementing/architecture
• marketing scenarios




IDs, Concepts and relationships

• IDs
   – player, team, event, league, etc.
• concepts
   – player, team, event, league, etc.
   – also tournament-stage, season-type, etc.
   – goals-scored, shots-missed, shots-on-net, etc.
• relationships
   –   isCompetitiveSportingOrganisationOf
   –   isGroupOf
   –   isMatchOf
   –   hasStat
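
A minimal sketch of how the IDs, concepts and relationships above might be written as subject-predicate-object statements. Only the relationship names come from the slide; the identifiers (`team:example-fc` and so on), the direction of each relationship and the helper `subjects_with` are assumptions made for illustration, not SportsML or IPTC definitions.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers; only the predicate names appear in the slide.
TRIPLES = [
    Triple("team:example-fc", "isCompetitiveSportingOrganisationOf", "league:example-league"),
    Triple("group:group-c", "isGroupOf", "league:example-league"),
    Triple("event:example-final", "isMatchOf", "league:example-league"),
    Triple("player:example-striker", "hasStat", "stat:goals-scored"),
]


def subjects_with(triples, predicate, obj):
    """Return every subject linked to `obj` by `predicate`, in input order."""
    return [t.subject for t in triples if t.predicate == predicate and t.obj == obj]
```

Keeping IDs (the strings) separate from concepts (what the strings denote) is what lets two providers agree on a relationship even when their identifiers differ.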



Data domains

• within sports domain
   – e.g. resolving player IDs between providers
   – player page with Wikipedia content
• within entire news domain
   – when news and sport intersect
      • doping, Beckhams, etc.
      • multi-domain events like Olympics
      • event management
• broader marketing domain
   – personal data
      • location
      • favourite team
      • favourite gin
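
Resolving player IDs between providers, the first item above, can be sketched as a lookup from provider-specific IDs to one shared ID. The provider names, the IDs and the `resolve` and `same_player` helpers below are all hypothetical; a real resolver would need a maintained mapping source.

```python
# Hypothetical provider names and IDs, for illustration only.
PROVIDER_TO_CANONICAL = {
    ("provider-a", "P1001"): "player:example-midfielder",
    ("provider-b", "mf-08"): "player:example-midfielder",
    ("provider-a", "P1002"): "player:example-keeper",
}


def resolve(provider, local_id):
    """Map a provider-specific player ID to a shared canonical ID, or None if unknown."""
    return PROVIDER_TO_CANONICAL.get((provider, local_id))


def same_player(ref_a, ref_b):
    """True when two (provider, local_id) references resolve to the same known player."""
    canonical_a = resolve(*ref_a)
    return canonical_a is not None and canonical_a == resolve(*ref_b)
```

For example, `same_player(("provider-a", "P1001"), ("provider-b", "mf-08"))` is true because both references map to the same canonical ID.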

What's Out There?

• Linked Data state of the art
   – DBpedia and Freebase
      • compare rosters for Miami Heat
   – Google Calendar
      • schedules
   – Guardian medals spreadsheet
   – sportscodes.org
      • code resolver
      • originally thought of as strictly external
      • but ties in with
          – internal metadata management
          – other apps that produce and consume metadata
   – Did I miss anything?


Ontologies
• BBC sport ontology
  – http://www.bbc.co.uk/ontologies/sport
     • The Sport Ontology is a simple lightweight ontology for publishing data
       about competitive sports events. The terms in this ontology allow data
       to be published about:
         –   The structure of sports tournaments as a series of events
         –   Agents competing in a competition
         –   The type of discipline an event involves
         –   The award associated with the competition
         –   ...etc




BBC Site

• BBC World Cup Site
  – built on top of triple-store; dynamically produced via inference
  – Jem Rayfield: "The BBC World Cup 2010 site features 700-plus
    team, group and player pages, which are powered by a
    high-performance dynamic semantic publishing (DSP) architecture.
    Previously, BBC Sport would never have considered creating this
    number of indices in the CPS, as each index would need an editor
    to keep it up to date with the latest stories, even where automation
    rules had been set up. To put this scale of task into perspective, the
    World Cup site has more index pages than the rest of the BBC
    Sport site."




BBC Site

• "This framework facilitates the publication of automated
  metadata-driven web pages that are light-touch, requiring
  minimal journalistic management, as they automatically
  aggregate and render links to relevant stories."
• "The foundation of these dynamic aggregations is a rich
  ontological domain model. The ontology describes entity
  existence, groups and relationships between the
  things/concepts that describe the World Cup. For example, "Frank
  Lampard" is part of the "England Squad" and the "England
  Squad" competes in "Group C" of the "FIFA World Cup
  2010"."
     • http://www.bbc.co.uk/blogs/bbcinternet/2010/07/bbc_world_cup_2010_dynamic_sem.html
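
The Lampard example in the quote can be sketched in plain Python to show how tagging a story with one concept could place it on several index pages. This is a stand-in for a triple store with inference, not the BBC's implementation: the predicate names (`partOf`, `competesIn`), the story ID and both helper functions are assumptions.

```python
# Statements taken from the quoted example; the predicate names are assumptions.
FACTS = {
    ("Frank Lampard", "partOf", "England Squad"),
    ("England Squad", "competesIn", "Group C"),
    ("Group C", "partOf", "FIFA World Cup 2010"),
}

# Hypothetical story tagged with a single concept.
STORY_TAGS = {"story-1": {"Frank Lampard"}}


def ancestors(concept, facts):
    """Collect every concept reachable from `concept` by following any relationship."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subject, _predicate, obj in facts:
            if subject == current and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def index_pages_for(story_id):
    """Return the concepts whose index pages could list the story."""
    pages = set()
    for tag in STORY_TAGS[story_id]:
        pages.add(tag)
        pages |= ancestors(tag, FACTS)
    return pages
```

Here a story tagged only with "Frank Lampard" also reaches the squad, group and tournament pages without an editor touching them, which is the "light-touch" property the quote describes.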

BBC Site

• John O'Donovan: "Another way to think about all this, is
  that we are not publishing pages, but publishing content as
  assets which are then organised by the metadata
  dynamically into pages"
• "We believe this is the first large scale, mass media site to
  be using concept extraction, RDF and a Triple store to
  deliver content."
   – http://www.bbc.co.uk/blogs/bbcinternet/2010/07/the_world_cup_and_a_call_to_ac.html
• entire BBC sports site will cut over to this architecture for
  the 2012 Olympics.



What to Capture?

• everything in rdf?
   – where to draw the line between flat and deep data?
      • vertical (sportsml) and horizontal (rdf)
• kinds of data
   – stable metadata
      • permanent
          – player, team, event, league
      • fixed
          – schedules
   – unpredictable permanent (meta)data
      • historical post-event results
          – scores
          – highlights
          – outcome
                » historical interest, such as the last time England won the World Cup
          – the 0-goals, 0-assists guy?
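
One way to picture the line between deep, document-style data and flat statements is to flatten a nested record into triples. The sketch below assumes a hypothetical record layout and subject-naming scheme; the field names are illustrative and are not SportsML element names.

```python
def to_triples(subject, record):
    """Flatten a nested dict into (subject, predicate, object) triples.

    Nested dicts become child subjects named `<parent>/<key>`.
    """
    triples = []
    for key, value in record.items():
        if isinstance(value, dict):
            child = f"{subject}/{key}"
            triples.append((subject, key, child))
            triples.extend(to_triples(child, value))
        else:
            triples.append((subject, key, value))
    return triples


# Hypothetical post-event record for the 1966 World Cup final.
match = {
    "date": "1966-07-30",
    "winner": {"team": "England", "score": 4},
    "loser": {"team": "West Germany", "score": 2},
}

triples = to_triples("event:1966-final", match)
```

The nested form keeps a result readable as one document; the flat form makes each fact individually addressable, at the cost of more statements to manage.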
Perishable Metadata

• perishable metadata
  – the pre-event narrative
     •   why should I follow this game?
     •   where should I watch it?
     •   who should I watch it with?
     •   more of a marketing opportunity?




Pre-event significance

• What makes a sports event significant?
   – decisive game
        • Cup Final
        • avoid relegation
   –   top teams
   –   matchup history
   –   rivalries
   –   top players
   –   streaks
        • winning
        • scoring
        • losing
   – interesting players
   – news intersection (New Orleans Saints @ Super Bowl)
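
If these factors were captured as metadata, a crude significance score could be computed from them. The slide proposes no such scoring; the factor names paraphrase the list above, and the weights and the `significance` function are invented purely for illustration.

```python
# Hypothetical weights; the factor names paraphrase the list above.
WEIGHTS = {
    "decisive_game": 3,
    "top_teams": 2,
    "matchup_history": 1,
    "rivalry": 2,
    "top_players": 1,
    "streak": 1,
    "interesting_players": 1,
    "news_intersection": 2,
}


def significance(flags):
    """Sum the weights of the factors that apply to an event.

    `flags` is a set of factor names; unknown names raise KeyError so typos surface early.
    """
    return sum(WEIGHTS[name] for name in flags)
```

A cup final between rivals, `significance({"decisive_game", "rivalry"})`, scores higher than a routine fixture with no flags set, though any real weighting would be an editorial judgement.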
Pre-Event Metadata

• These are all narratives
   – all of it would be in the prose of a match preview
• Contrast
   – structure and predictability of schedule
   – unpredictability of narrative --> essential
   – Winter Olympics narrative
      • controllable?
          – Georgian Luger
          – "Own the Podium"




Next Steps

• What should SportsML Working Party do?
  – just SportsML?
  – what about codes, concepts and ontologies?
     • map SportsML to ontologies
  – rename to Sports News and Data Management?




