SlideShare une entreprise Scribd logo
1  sur  26
Building Metadata
Aggregation
Services for
Resource Discovery
Paul Walk
                                                     UKOLN is supported by:
p.walk@ukoln.ac.uk



     www.ukoln.ac.uk
     A centre of expertise in digital information management
aggregating
 metadata
              2
why aggregate metadata?
• to address systems/network latency - a cache
 •   supporting resource-discovery

• for ‘Web Scale concentration’
 •   ‘gaming’ Google - raising ‘visibility’ of content

 •   network effects if user facing services also developed

• to showcase resources
• to create middleman business opportunities
• as infrastructure to support 3rd-party services
• as an approach to preservation
                                                              3
patterns
• harvest from network, aggregate and re-expose
 •   discovery.ac.uk, Europeana, RepUK

• collect from offline sources and make available in
  aggregate on the network
 •   Collections Trust (UK)

• harvest without re-exposing, build services on
  top of aggregation
 •   Google et.al.

• expose as a ‘data dump’, or expose through an
  API
                                                      4
the big question facing
data providers:

do you want to provide a
data service, or just data?

                              5
current work
  in the UK
               6
• a metadata ‘ecosystem’
 • aggregation is a major component
 • preparing resources for aggregation
 • http://www.discovery.ac.uk
                                         7
• support innovation
• develop some ‘business intelligence’
• develop infrastructure component for services   8
issues with
aggregation
              9
distribution
• state management is a challenge! (deletions, changes)
• aggregation of aggregations is consequently non-trivial
  •   e.g. federated models

• linking?
  •   should records in an aggregation ever be the target of a link?
      Or, should such links point to the source?

  •   can/should we make aggregations into Google-friendly targets?

  •   if we succeed with SEO, are we undermining source
      repositories?

• ‘attribution stacking’
  data-protocol/)
                              (http://sciencecommons.org/projects/publishing/open-access-




                                                                                            10
openness and usability
• ‘open’ in danger of becoming synonymous with
  ‘permissively licensed’

• can be both ‘open’ but very difficult to use
 •   needs periodic review - right now SPARQL is barrier to wide
     adoption

 •   remember all those SOAP interfaces....

 •   a well supported API might be more open than a completely
     freely available dump of gigabytes (or more) of data in the sense
     that it might allow open engagement from more people

• we need a richer understanding of openness

                                                                         11
in other words...


           be open, usefully


                               12
character encodings....
• huge number of XML records
  from UK IRs are invalid due
  to character encoding
  issues....




     • there is a special
       place in hell for
       developers who
       ignore character
       encodings...

                                http://www.flickr.com/photos/10661825@N07/

                                                                            13
a distributed system is one in which
the failure of a computer you didn't
even know existed can render your
own computer unusable
                                   Leslie Lamport




  are we creating a new version of this with
  data....?
                                                    14
shifting landscape
• Google was previously seen as in opposition to a rich
  metadata approach...
  •   recall versus precision

  •   Google’s abandonment of OAI-PMH

• but now...
  •   Google, Microsoft & Yahoo committed to improving precision
      through harvesting of Microdata

  •   schema.org and others bridging this divide

• so, is there still a need for other ‘concentrations’ or
  can we rely on the global search engines?


                                                                   15
good
practice
           16
licensing!
• use explicit licenses
• this means requiring explicit licenses from sources
• if at all possible work with extremely open licenses
  such as CC0

• in data aggregation, especially when using a Linked Data
  approach, ‘share alike’ might be easier than ‘attribution’




                                                               17
“build for normal users,
  developers and
       machines”
        Tom Coates
        http://www.plasticbag.org/archives/2006/02/my_future_of_web_apps_slides/




                                                                                   18
developer-friendly formats
• XML has a lot going for it:
 •   very well supported with tools, libraries etc.

 •   well understood & often fits the info models we’re used to

• but it has some issues:
 •   validation is a pain and is very often ignored

 •   it’s verbose - it takes up a lot of bandwidth

• JSON has gained rapid adoption
 •   less verbose - good for simple client-side manipulation
     •   curl -D - -L -H "Accept: application/rdf+xml" "http://dx.doi.org/10.1126/science.
         1157784"

     •   curl -D - -L -H "Accept: application/json" "http://dx.doi.org/10.1126/science.1157784"

                                                                                                  19
service (anti)patterns
                                                                     end-user


• design your API to be                       end-user
                                                                        UI
                                                                                             end-user


  developer-friendly                             UI                                              UI

                                                                      Future


• be aware of what works, and                   Future
                                               3rd-party
                                                                     3rd-party
                                                                        dev
                                                                                              Future
                                                                                             3rd-party

  of what appears to work                         dev                                           dev


  but actually might not...                              AP
                                                           I
                                                                        API            AP
                                                                                             I



• share this understanding                                some aggregated data of broad
                                                         interest and potential usefulness


                                      = certainty                       UI
                                      = belief
                                      = speculation

                                                                    end-user
                             Paul Walk, An infrastructure service anti-pattern
                             http://blog.paulwalk.net/2009/12/07/an-infrastructure-service-anti-pattern/
                                                                                                         20
expect & enable
users to filter
  - give them
  feeds (RSS/
     Atom)

                  http://www.flickr.com/photos/httpwwwflickrcompeoplenadar/3349883/ (CC BY-
                  NC-ND 2.0)




                                                                                            21
workshop
tomorrow!
            22
tomorrow at 16:15
• (Thursday, 23rd June, 16:15-18:15)
• short presentations from UKOLN on LOCAH and
  RepUK, and from Edina on aggregating services

• open discussion on the way forward for metadata
  aggregation, addressing questions such as:
 •   is Linked Data the future for metadata aggregation services?

 •   do initiatives like Microdata & schema.org reduce the need for
     our investment in metadata aggregation services?

 •   does usability matter as much as ‘openness’?

• please join us, and feel free to bring your own
  questions & issues to discuss
                                                                      23
summing up
    in a
sentence....
               24
we should use aggregation

 [applying a tool]

to balance the creation of opportunity

 [building infrastructure]

with the solving of problems

 [developing & providing services]


                                         25
thank you


            26

Contenu connexe

Plus de Paul Walk

Next generation repositories
Next generation repositoriesNext generation repositories
Next generation repositoriesPaul Walk
 
What does the next generation repository look like?
What does the next generation repository look like?What does the next generation repository look like?
What does the next generation repository look like?Paul Walk
 
COAR Next Generation Repositories Working Group
COAR Next Generation Repositories Working GroupCOAR Next Generation Repositories Working Group
COAR Next Generation Repositories Working GroupPaul Walk
 
Static Site Generators: what they are and when they are useful
Static Site Generators: what they are and when they are usefulStatic Site Generators: what they are and when they are useful
Static Site Generators: what they are and when they are usefulPaul Walk
 
RIOXX: a Modern Metadata Application Profile
RIOXX: a Modern Metadata Application ProfileRIOXX: a Modern Metadata Application Profile
RIOXX: a Modern Metadata Application ProfilePaul Walk
 
Implementing RIOXX
Implementing RIOXXImplementing RIOXX
Implementing RIOXXPaul Walk
 
Exploiting the value of Dublin Core through pragmatic development
Exploiting the value of Dublin Core through pragmatic developmentExploiting the value of Dublin Core through pragmatic development
Exploiting the value of Dublin Core through pragmatic developmentPaul Walk
 
Rioxx 2 repository fringe
Rioxx 2 repository fringeRioxx 2 repository fringe
Rioxx 2 repository fringePaul Walk
 
The Strategic Developer: a new role for Higher Education?
The Strategic Developer: a new role for Higher Education?The Strategic Developer: a new role for Higher Education?
The Strategic Developer: a new role for Higher Education?Paul Walk
 
Local, technical innovation in an outsourced world
Local, technical innovation in an outsourced worldLocal, technical innovation in an outsourced world
Local, technical innovation in an outsourced worldPaul Walk
 
Working with Developers
Working with DevelopersWorking with Developers
Working with DevelopersPaul Walk
 
It's their cloud, not yours
It's their cloud, not yoursIt's their cloud, not yours
It's their cloud, not yoursPaul Walk
 
Technical Challenges in Resource Discovery
Technical Challenges in Resource DiscoveryTechnical Challenges in Resource Discovery
Technical Challenges in Resource DiscoveryPaul Walk
 
Responsive Innovation in a Local Context
Responsive Innovation in a Local ContextResponsive Innovation in a Local Context
Responsive Innovation in a Local ContextPaul Walk
 
The Changing Role of the Developer in HE
The Changing Role of the Developer in HEThe Changing Role of the Developer in HE
The Changing Role of the Developer in HEPaul Walk
 
Supporting Developers, Supporting Research
Supporting Developers, Supporting ResearchSupporting Developers, Supporting Research
Supporting Developers, Supporting ResearchPaul Walk
 
Future of LMS
Future of LMSFuture of LMS
Future of LMSPaul Walk
 
Innovation, community, sustainability
Innovation, community, sustainabilityInnovation, community, sustainability
Innovation, community, sustainabilityPaul Walk
 
Strategic development in a local HEI context
Strategic development in a local HEI contextStrategic development in a local HEI context
Strategic development in a local HEI contextPaul Walk
 
Enterprise Information Integration at LondonMet
Enterprise Information Integration at LondonMetEnterprise Information Integration at LondonMet
Enterprise Information Integration at LondonMetPaul Walk
 

Plus de Paul Walk (20)

Next generation repositories
Next generation repositoriesNext generation repositories
Next generation repositories
 
What does the next generation repository look like?
What does the next generation repository look like?What does the next generation repository look like?
What does the next generation repository look like?
 
COAR Next Generation Repositories Working Group
COAR Next Generation Repositories Working GroupCOAR Next Generation Repositories Working Group
COAR Next Generation Repositories Working Group
 
Static Site Generators: what they are and when they are useful
Static Site Generators: what they are and when they are usefulStatic Site Generators: what they are and when they are useful
Static Site Generators: what they are and when they are useful
 
RIOXX: a Modern Metadata Application Profile
RIOXX: a Modern Metadata Application ProfileRIOXX: a Modern Metadata Application Profile
RIOXX: a Modern Metadata Application Profile
 
Implementing RIOXX
Implementing RIOXXImplementing RIOXX
Implementing RIOXX
 
Exploiting the value of Dublin Core through pragmatic development
Exploiting the value of Dublin Core through pragmatic developmentExploiting the value of Dublin Core through pragmatic development
Exploiting the value of Dublin Core through pragmatic development
 
Rioxx 2 repository fringe
Rioxx 2 repository fringeRioxx 2 repository fringe
Rioxx 2 repository fringe
 
The Strategic Developer: a new role for Higher Education?
The Strategic Developer: a new role for Higher Education?The Strategic Developer: a new role for Higher Education?
The Strategic Developer: a new role for Higher Education?
 
Local, technical innovation in an outsourced world
Local, technical innovation in an outsourced worldLocal, technical innovation in an outsourced world
Local, technical innovation in an outsourced world
 
Working with Developers
Working with DevelopersWorking with Developers
Working with Developers
 
It's their cloud, not yours
It's their cloud, not yoursIt's their cloud, not yours
It's their cloud, not yours
 
Technical Challenges in Resource Discovery
Technical Challenges in Resource DiscoveryTechnical Challenges in Resource Discovery
Technical Challenges in Resource Discovery
 
Responsive Innovation in a Local Context
Responsive Innovation in a Local ContextResponsive Innovation in a Local Context
Responsive Innovation in a Local Context
 
The Changing Role of the Developer in HE
The Changing Role of the Developer in HEThe Changing Role of the Developer in HE
The Changing Role of the Developer in HE
 
Supporting Developers, Supporting Research
Supporting Developers, Supporting ResearchSupporting Developers, Supporting Research
Supporting Developers, Supporting Research
 
Future of LMS
Future of LMSFuture of LMS
Future of LMS
 
Innovation, community, sustainability
Innovation, community, sustainabilityInnovation, community, sustainability
Innovation, community, sustainability
 
Strategic development in a local HEI context
Strategic development in a local HEI contextStrategic development in a local HEI context
Strategic development in a local HEI context
 
Enterprise Information Integration at LondonMet
Enterprise Information Integration at LondonMetEnterprise Information Integration at LondonMet
Enterprise Information Integration at LondonMet
 

Dernier

Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinojohnmickonozaleda
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxPoojaSen20
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 

Dernier (20)

Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipino
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 

Metadata Aggregation Services

  • 1. Building Metadata Aggregation Services for Resource Discovery Paul Walk UKOLN is supported by: p.walk@ukoln.ac.uk www.ukoln.ac.uk A centre of expertise in digital information management
  • 3. why aggregate metadata? • to address systems/network latency - a cache • supporting resource-discovery • for ‘Web Scale concentration’ • ‘gaming’ Google - raising ‘visibility’ of content • network effects if user facing services also developed • to showcase resources • to create middleman business opportunities • as infrastructure to support 3rd-party services • as an approach to preservation 3
  • 4. patterns • harvest from network, aggregate and re-expose • discovery.ac.uk, Europeana, RepUK • collect from offline sources and make available in aggregate on the network • Collections Trust (UK) • harvest without re-exposing, build services on top of aggregation • Google et.al. • expose as a ‘data dump’, or expose through an API 4
  • 5. the big question facing data providers: do you want to provide a data service, or just data? 5
  • 6. current work in the UK 6
  • 7. • a metadata ‘ecosystem’ • aggregation is a major component • preparing resources for aggregation • http://www.discovery.ac.uk 7
  • 8. • support innovation • develop some ‘business intelligence’ • develop infrastructure component for services 8
  • 10. distribution • state management is a challenge! (deletions, changes) • aggregation of aggregations is consequently non-trivial • e.g. federated models • linking? • should records in an aggregation ever be the target of a link? Or, should such links point to the source? • can/should we make aggregations into Google-friendly targets? • if we succeed with SEO, are we undermining source repositories? • ‘attribution stacking’ data-protocol/) (http://sciencecommons.org/projects/publishing/open-access- 10
  • 11. openness and usability • ‘open’ in danger of becoming synonymous with ‘permissively licensed’ • can be both ‘open’ but very difficult to use • needs periodic review - right now SPARQL is barrier to wide adoption • remember all those SOAP interfaces.... • a well supported API might be more open than a completely freely available dump of gigabytes (or more) of data in the sense that it might allow open engagement from more people • we need a richer understanding of openness 11
  • 12. in other words... be open, usefully 12
  • 13. character encodings.... • huge number of XML records from UK IRs are invalid due to character encoding issues.... • there is a special place in hell for developers who ignore character encodings... http://www.flickr.com/photos/10661825@N07/ 13
  • 14. a distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable Leslie Lamport are we creating a new version of this with data....? 14
  • 15. shifting landscape • Google was previously seen as in opposition to a rich metadata approach... • recall versus precision • Google’s abandonment of OAI-PMH • but now... • Google, Microsoft & Yahoo committed to improving precision through harvesting of Microdata • schema.org and others bridging this divide • so, is there still a need for other ‘concentrations’ or can we rely on the global search engines? 15
  • 17. licensing! • use explicit licenses • this means requiring explicit licenses from sources • if at all possible work with extremely open licenses such as CC0 • in data aggregation, especially when using a Linked Data approach, ‘share alike’ might be easier than ‘attribution’ 17
  • 18. “build for normal users, developers and machines” Tom Coates http://www.plasticbag.org/archives/2006/02/my_future_of_web_apps_slides/ 18
  • 19. developer-friendly formats • XML has a lot going for it: • very well supported with tools, libraries etc. • well understood & often fits the info models we’re used to • but it has some issues: • validation is a pain and is very often ignored • it’s verbose - it takes up a lot of bandwidth • JSON has gained rapid adoption • less verbose - good for simple client-side manipulation • curl -D - -L -H "Accept: application/rdf+xml" "http://dx.doi.org/10.1126/science. 1157784" • curl -D - -L -H "Accept: application/json" "http://dx.doi.org/10.1126/science.1157784" 19
  • 20. service (anti)patterns end-user • design your API to be end-user UI end-user developer-friendly UI UI Future • be aware of what works, and Future 3rd-party 3rd-party dev Future 3rd-party of what appears to work dev dev but actually might not... AP I API AP I • share this understanding some aggregated data of broad interest and potential usefulness = certainty UI = belief = speculation end-user Paul Walk, An infrastructure service anti-pattern http://blog.paulwalk.net/2009/12/07/an-infrastructure-service-anti-pattern/ 20
  • 21. expect & enable users to filter - give them feeds (RSS/ Atom) http://www.flickr.com/photos/httpwwwflickrcompeoplenadar/3349883/ (CC BY- NC-ND 2.0) 21
  • 23. tomorrow at 16:15 • (Thursday, 23rd June, 16:15-18:15) • short presentations from UKOLN on LOCAH and RepUK, and from Edina on aggregating services • open discussion on the way forward for metadata aggregation, addressing questions such as: • is Linked Data the future for metadata aggregation services? • do initiatives like Microdata & schema.org reduce the need for our investment in metadata aggregation services? • does usability matter as much as ‘openness’? • please join us, and feel free to bring your own questions & issues to discuss 23
  • 24. summing up in a sentence.... 24
  • 25. we should use aggregation [applying a tool] to balance the creation of opportunity [building infrastructure] with the solving of problems [developing & providing services] 25
  • 26. thank you 26

Notes de l'éditeur

  1. \n
  2. \n
  3. \n
  4. \n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n
  22. \n
  23. \n
  24. \n
  25. \n
  26. \n