Mapping, Interlinking and Exposing MusicBrainz as Linked Data

Slides from my keynote at the 1st International Workshop on Semantic Music and Media (SMAM2013)
http://iswc2013.semanticweb.org/content/smam-2013

  1. Mapping, Interlinking and Exposing MusicBrainz as Linked Data. 1st International Workshop on Semantic Music and Media (SMAM2013), Sydney, Oct 21, 2013. Peter Haase
  2. What this talk is about: A Linked Data Perspective. [Diagram; edge labels: worksOn, publishedTo, affiliation, affiliation (previous), isAbout, builtWith, participatesIn, participatesIn]
  3. EUCLID: EdUcational Curriculum for the usage of LinkedData. http://www.euclid-project.eu Course, eBook, other channels: @euclid_project, euclidproject, euclidproject
  4. EUCLID Scenario. [Architecture diagram. Layers: Data acquisition, LD Dataset, Access, Application. Other labels: Analysis & Mining Module, Visualization Module, RDFa, SPARQL Endpoint, Vocabulary Mapping, Publishing, Interlinking, Physical Wrapper, Streaming providers, Downloads, Musical Content, Cleansing, LD Wrapper, R2R Transf., Integrated Dataset, LD Wrapper, RDF/XML, Metadata, Other content]
  5. MusicBrainz
      • MusicBrainz is an open music encyclopedia that collects music metadata and makes it available to the public.
      • MusicBrainz aims to be:
          • The ultimate source of music information by allowing anyone to contribute and releasing the data under open licenses.
          • The universal lingua franca for music by providing a reliable and unambiguous form of music identification, enabling both people and machines to have meaningful conversations about music.
      • Like Wikipedia, MusicBrainz is maintained by a global community of users and we want everyone, including you, to participate and contribute.
      • MusicBrainz is operated by the MetaBrainz Foundation, dedicated to keeping MusicBrainz free and open source.
  6. Publishing Relational Databases as RDF: W3C RDB2RDF
      Task: Publish data from relational DBMS as Linked Data
      Approach: map from relational schema to semantic vocabulary with R2RML
      Publishing: two alternatives
          • Translate SPARQL into SQL on the fly
          • Batch transform data into RDF, infer, index, integrate and provide SPARQL access in a triplestore
      [Diagram labels: Data acquisition, LD Dataset, Access, Relational DBMS, R2RML Engine, Cleansing, Vocabulary Mapping, Interlinking, Integrated Data in Triplestore, Publishing, SPARQL Endpoint]
  7. Publishing MusicBrainz. MusicBrainz DB (https://wiki.musicbrainz.org/Next_Generation_Schema); R2RML; Music Ontology (http://musicontology.com). Concrete Example Mapping: Table Recording(gid, length); R2RML Mapping; Ontology concept mo:recording
  8. MusicBrainz Next Gen Schema
      artist: As pre-NGS, but further attributes
      artist_credit: Allows joint credit
      release_group: Cf. ‘album’ versus: release, medium
      • work
      • track
      • tracklist
      • recording
      https://wiki.musicbrainz.org/Next_Generation_Schema
  9. Music Ontology. OWL ontology with the following core concepts (classes) and relationships (properties): [figure] Source: http://musicontology.com
  10. R2RML Class Mapping. Mapping tables to classes is ‘easy’:
        lb:Artist a rr:TriplesMap ;
          rr:logicalTable [rr:tableName "artist"] ;
          rr:subjectMap
            [rr:class mo:MusicArtist ;
             rr:template "http://musicbrainz.org/artist/{gid}#_"] ;
          rr:predicateObjectMap
            [rr:predicate mo:musicbrainz_guid ;
             rr:objectMap [rr:column "gid" ;
                           rr:datatype xsd:string]] .
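To make the effect of the class mapping on slide 10 concrete, here is a minimal sketch in Python of what a batch R2RML run would emit for one row. It is not the engine used in the talk: the in-memory SQLite table, the placeholder `gid` value, and the function name `artist_triples` are assumptions for illustration, and the `mo:` prefix is assumed to expand to `http://purl.org/ontology/mo/` (the slide does not show its prefix declarations).

```python
import sqlite3

# Hypothetical in-memory stand-in for the MusicBrainz "artist" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (gid TEXT)")
conn.execute(
    "INSERT INTO artist VALUES ('00000000-0000-0000-0000-000000000001')"
)

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MO = "http://purl.org/ontology/mo/"  # assumed expansion of the mo: prefix
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
SUBJECT_TEMPLATE = "http://musicbrainz.org/artist/{gid}#_"  # rr:template


def artist_triples(connection):
    """Yield N-Triples lines for every row of the artist table."""
    for (gid,) in connection.execute("SELECT gid FROM artist"):
        subject = SUBJECT_TEMPLATE.format(gid=gid)
        # rr:class on the subject map yields an rdf:type triple.
        yield f"<{subject}> <{RDF_TYPE}> <{MO}MusicArtist> ."
        # rr:column plus rr:datatype yields a typed literal.
        yield f'<{subject}> <{MO}musicbrainz_guid> "{gid}"^^<{XSD_STRING}> .'


triples = list(artist_triples(conn))
for line in triples:
    print(line)
```

Each row produces two triples: one from `rr:class` and one from the predicate-object map. The sketch skips the IRI-safe encoding and literal escaping a real R2RML processor applies, which is harmless for UUID values.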
  11. R2RML Property Mapping. Mapping columns to properties can be easy:
        lb:artist_name a rr:TriplesMap ;
          rr:logicalTable [rr:sqlQuery
            """SELECT artist.gid, artist_name.name
               FROM artist
               INNER JOIN artist_name ON artist.name = artist_name.id"""] ;
          rr:subjectMap [rr:template
                           "http://musicbrainz.org/artist/{gid}#_"] ;
          rr:predicateObjectMap
            [rr:predicate foaf:name ;
             rr:objectMap [rr:column "name"]] .
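The property mapping on slide 11 uses an SQL query as its logical table, so the join runs in the database and the mapping only sees the result columns `gid` and `name`. The following Python sketch imitates that under stated assumptions: the two-table SQLite schema is a reduced, hypothetical version of the MusicBrainz tables, the row values are made up, and `escape_literal` and `name_triples` are illustrative helpers, not part of any R2RML tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical versions of the two tables named in the slide.
conn.executescript("""
    CREATE TABLE artist_name (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE artist (gid TEXT, name INTEGER REFERENCES artist_name(id));
    INSERT INTO artist_name VALUES (1, 'Example Artist');
    INSERT INTO artist VALUES ('00000000-0000-0000-0000-000000000001', 1);
""")

# Same join as the rr:sqlQuery on the slide.
LOGICAL_TABLE = """
    SELECT artist.gid, artist_name.name
    FROM artist
    INNER JOIN artist_name ON artist.name = artist_name.id
"""
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def name_triples(connection):
    """Yield one foaf:name triple per row of the logical table."""
    for gid, name in connection.execute(LOGICAL_TABLE):
        subject = f"http://musicbrainz.org/artist/{gid}#_"
        yield f'<{subject}> <{FOAF_NAME}> "{escape_literal(name)}" .'


name_triples_out = list(name_triples(conn))
for line in name_triples_out:
    print(line)
```

Because the subject template matches the one in the class mapping, these triples attach to the same artist resources.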
  12. NGS Advanced Relations. Major entities (Artist, Release Group, Track, etc.) plus URL are paired (l_artist_artist). Each pairing of instances refers to a Link. Links have types (cf. RDF properties) and attributes. http://wiki.musicbrainz.org/Advanced_Relationship
  13. R2RML Mapping Editor
      R2RML: Expose data from relational DBMS as RDF / via SPARQL Endpoint
      Problem: R2RML Mappings are hard to create
      R2RML Editing Made Easy!
          • Hides vocabulary intricacies from end-user
          • Access to metadata about relational databases
          • Preview of generated triples and SQL queries
          • Very expressive (supports most of R2RML)
      [Diagram labels: Relational Database, R2RML Mappings, R2RML Engine, SPARQL Endpoint]
      See our R2RML Mapping Editor in the ISWC Demo Session on Wednesday!
  14. Scale. MusicBrainz RDF derived via R2RML: 150M Triples
        lb:artist_member a rr:TriplesMap ;
          rr:logicalTable [rr:sqlQuery
            """SELECT a1.gid, a2.gid AS band
               FROM artist a1
                 INNER JOIN l_artist_artist ON a1.id = l_artist_artist.entity0
                 INNER JOIN link ON l_artist_artist.link = link.id
                 INNER JOIN link_type ON link_type = link_type.id
                 INNER JOIN artist a2 on l_artist_artist.entity1 = a2.id
               WHERE link_type.gid='5be4c609-9afa-4ea0-910b-12ffb71e3821'"""] ;
          rr:subjectMap [rr:template "http://musicbrainz.org/artist/{gid}#_"] ;
          rr:predicateObjectMap
            [rr:predicate mo:member_of ;
             rr:objectMap [rr:template "http://musicbrainz.org/artist/{band}#_" ;
                           rr:termType rr:IRI]] .
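The `lb:artist_member` mapping on slide 14 turns one Advanced Relationship type into a single RDF property by filtering on the link type's `gid`. Below is a hedged Python sketch of that idea against a toy SQLite schema. The four tables are cut down to the columns the query touches, all row values are invented, and the join condition is written as `link.link_type = link_type.id`, which is an assumption about what the unqualified `link_type` in the slide's SQL refers to.

```python
import sqlite3

MEMBER_OF_LINK_TYPE = "5be4c609-9afa-4ea0-910b-12ffb71e3821"  # gid from the slide
MO_MEMBER_OF = "http://purl.org/ontology/mo/member_of"  # assumed mo: expansion
ARTIST_TEMPLATE = "http://musicbrainz.org/artist/{}#_"

conn = sqlite3.connect(":memory:")
# Toy schema and rows; only the columns used by the query are modelled.
conn.executescript("""
    CREATE TABLE artist (id INTEGER PRIMARY KEY, gid TEXT);
    CREATE TABLE link_type (id INTEGER PRIMARY KEY, gid TEXT);
    CREATE TABLE link (id INTEGER PRIMARY KEY, link_type INTEGER);
    CREATE TABLE l_artist_artist (link INTEGER, entity0 INTEGER, entity1 INTEGER);
    INSERT INTO artist VALUES (1, 'member-gid'), (2, 'band-gid');
    INSERT INTO link_type VALUES
        (10, '5be4c609-9afa-4ea0-910b-12ffb71e3821'),
        (11, 'some-other-link-type');
    INSERT INTO link VALUES (100, 10), (101, 11);
    INSERT INTO l_artist_artist VALUES (100, 1, 2), (101, 2, 1);
""")

QUERY = """
    SELECT a1.gid, a2.gid AS band
    FROM artist a1
    INNER JOIN l_artist_artist ON a1.id = l_artist_artist.entity0
    INNER JOIN link ON l_artist_artist.link = link.id
    INNER JOIN link_type ON link.link_type = link_type.id
    INNER JOIN artist a2 ON l_artist_artist.entity1 = a2.id
    WHERE link_type.gid = ?
"""

# Both subject and object are built from templates, so the object is an IRI
# (rr:termType rr:IRI) rather than a literal.
member_triples = [
    f"<{ARTIST_TEMPLATE.format(gid)}> <{MO_MEMBER_OF}> <{ARTIST_TEMPLATE.format(band)}> ."
    for gid, band in conn.execute(QUERY, (MEMBER_OF_LINK_TYPE,))
]
for line in member_triples:
    print(line)
```

Only the pairing whose link has the selected type yields a triple; the second pairing in the toy data is filtered out by the `WHERE` clause.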
  15. Some Statistics: RDF Dump
        (Lead) Table     Triples      Time (s)
        area             59798        2
        artist           36868228     423
        dbpedia          172017       13
        label            201832       3
        medium           18069143     163
        recording        11400354     209
        release_group    3050818      31
        release          9764887      151
        track            75506495     794
        work             1728955      20
        Total            156822527    1809
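The figures on slide 15 can be cross-checked with a few lines of Python. The per-table numbers below are copied from the slide. The variable names, and the reading of the final pair of values as column totals, are assumptions that the arithmetic confirms.

```python
# (triples, seconds) per lead table, as listed on the slide.
stats = {
    "area": (59798, 2),
    "artist": (36868228, 423),
    "dbpedia": (172017, 13),
    "label": (201832, 3),
    "medium": (18069143, 163),
    "recording": (11400354, 209),
    "release_group": (3050818, 31),
    "release": (9764887, 151),
    "track": (75506495, 794),
    "work": (1728955, 20),
}

total_triples = sum(triples for triples, _ in stats.values())
total_seconds = sum(seconds for _, seconds in stats.values())
throughput = total_triples / total_seconds  # triples per second overall

print(total_triples, total_seconds, round(throughput))
```

The sums reproduce the slide's last two values (156822527 triples in 1809 seconds), which works out to roughly 86,690 triples per second for the whole dump.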
  16. Information Workbench: Platform for Linked Data Applications
      § Semantics- & Linked Data-based integration of private and public data sources based on data providers
          • Generic and specific providers for various data formats and sources
          • Supports established mapping frameworks (e.g. R2RML, SILK, …)
          • Named graphs for managing contexts and provenance
      § Intelligent Data Access and Analytics
          • Flexible self-service UI
          • Visualization, exploration, dashboarding and reporting
          • Semantic search
      § Collaboration and knowledge management
          • Curation & authoring
          • Collaborative workflows
      § Open standards and technologies
          • Semantic Wiki based frontend (using SMW syntax)
          • Supporting W3C standards (OWL, RDF, SPARQL, …)
          • Community Edition (Open Source) + Enterprise Edition (Commercial)
  17. Realization within the Information Workbench Architecture
      • Customized application solutions
      • Reusable UI and data integration components
      • Data storage and management platform
      • External resources to reuse data and create mashups
  18. The “MusicBrainz Explorer” Application. [Diagram labels: Music Ontology, Ontology, Data, R2RML, Data Providers, Templates, Widgets]
  19. Ontology as a “Structural Backbone”. [Diagram labels: Ontology (RDFS/OWL): mo:Track, mo:Artist, defining data structure; RDF Data Graph: Yesterday, The_Beatles, linked by rdf:type; UI templates: Template:mo:Track, Template:mo:Artist, Template: …, defining UI structure; Resource page]
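Slide 19 suggests that the `rdf:type` of a resource decides which UI template renders its resource page. A toy Python sketch of that lookup follows. It is not the Information Workbench implementation: the tuple-based graph, the `DEFAULT_TEMPLATE` fallback, and the function `template_for` are hypothetical, and the pairing of Yesterday with mo:Track and The_Beatles with mo:Artist is inferred from the diagram labels.

```python
RDF_TYPE = "rdf:type"

# Hypothetical toy data graph as (subject, predicate, object) tuples.
graph = [
    ("Yesterday", RDF_TYPE, "mo:Track"),
    ("The_Beatles", RDF_TYPE, "mo:Artist"),
]

# One UI template per ontology class, named as on the slide.
templates = {
    "mo:Track": "Template:mo:Track",
    "mo:Artist": "Template:mo:Artist",
}
DEFAULT_TEMPLATE = "Template:default"  # hypothetical fallback, not from the slide


def template_for(resource):
    """Return the template of the first typed class that has one."""
    for subject, predicate, obj in graph:
        if subject == resource and predicate == RDF_TYPE and obj in templates:
            return templates[obj]
    return DEFAULT_TEMPLATE


print(template_for("Yesterday"))
print(template_for("The_Beatles"))
```

The design point is that adding a class to the ontology and a template for it is enough to give every instance of that class a page layout, without per-resource configuration.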
  20. Information Workbench: Browsing a Music Artist
  21. Information Workbench: Visualization techniques
  22. Navigation Through the Data. Source: http://musicbrainz.fluidops.net/resource/Analytical5
  23. SPARQL visualization: Top ten The Beatles releases according to the sum of track durations in minutes. SPARQL Query:
        SELECT ?release
               ((SUM(xsd:double(?duration/60000))) AS ?avg)
        WHERE {
          <http://dbpedia.org/resource/The_Beatles>
              foaf:made ?release .
          ?release mo:record ?record .
          ?record mo:track ?track .
          ?track mo:duration ?duration . }
        GROUP BY ?release
        ORDER BY DESC(?avg)
        LIMIT 10
      Result set [figure]
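The aggregation in the query on slide 23 is a group-by sum followed by a descending sort and a limit. Note that the result variable is named `?avg` although it holds a sum. The Python sketch below reproduces the same steps on invented rows. The release names and durations are placeholders, not MusicBrainz data, and `top_releases` is an illustrative helper.

```python
from collections import defaultdict

# Hypothetical (release, track duration in milliseconds) rows, standing in
# for the ?release / ?duration bindings of the WHERE clause.
rows = [
    ("Release A", 120000),
    ("Release A", 180000),
    ("Release B", 240000),
    ("Release C", 60000),
]


def top_releases(bindings, limit=10):
    """Sum track durations per release in minutes, longest first."""
    minutes = defaultdict(float)
    for release, duration_ms in bindings:
        minutes[release] += duration_ms / 60000  # ms to minutes, as in the query
    ranked = sorted(minutes.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


top = top_releases(rows)
for release, total_minutes in top:
    print(release, total_minutes)
```

With the placeholder rows, "Release A" totals 5.0 minutes and ranks first, mirroring what `GROUP BY ?release ORDER BY DESC(?avg) LIMIT 10` does on the real data.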
  24. SPARQL visualization: Top ten The Beatles releases according to the sum of track durations in minutes. Widget:
        {{#widget: BarChart |
          query ='SELECT (COUNT(?Release) AS ?COUNT) ?label WHERE {
            <http://musicbrainz.org/artist/8538e728-ca0b-4321-b7e5-cff6565dd4c0#_> foaf:made ?Release.
            ?Release rdf:type mo:Release .
            ?Release dc:title ?label . }
          GROUP BY ?label
          ORDER BY DESC(?COUNT)
          LIMIT 20'
          | settings = 'Settings:barvertical_mb'
          | asynch = 'true'
          | input = 'label'
          | output = 'COUNT'
          | height = '300'}}
      Visualization: Bar chart
  25. Information Workbench: SPARQL visualization. Top ten The Beatles releases according to the sum of track durations in minutes. Other visualizations of the same result set: line chart, pie chart [figures]
  26. Automated Widget Suggestion
      1. [Suggested visualizations: Table, Pivot view, Bar chart, Line chart, Pie chart]
      2. Select a suggested visualization
      3. Visualization automatically built
  27. Try it out!
      R2RML Mappings
          • https://github.com/LinkedBrainz/MusicBrainz-R2RML
      MusicBrainz RDF Dump
          • http://mbsandbox.org/~barry/
      MusicBrainz Linked Data Demo system
          • http://musicbrainz.fluidops.net/
      Information Workbench
          • http://www.fluidops.com/information-workbench/
      Euclid Project
          • http://euclid-project.eu/
  28. Acknowledgements: The Euclid Project, Barry Norton, Michael Meier, Andriy Nikolov, Yves Raimond, Kurt Jacobson, Thomas Gaengler, Juan Sequeda, Simon Dixon (in no particular order)
  29. Thank you! Contact: Peter Haase, fluid Operations AG, Altrottstr. 31, Walldorf, Germany. +49 (0) 6227 358087-0, www.fluidops.com, peter.haase@fluidOps.com
