What business are we in?
Data-centric research, service
requirements and national
responses
Data Keynote, NEIC 2013
Dr Andrew Treloar
Australian National Data Service
Overview
• What business are we really in?
• Service requirements
• Infrastructure responses
• Research Data Alliance
• Conclusions
CC-BY @atreloar 2
Photo CC-BY www.flickr.com/photos/dgjones/7031731377/ 3
Photo CC-BY www.flickr.com/photos/pejmanphotos/1322835717/ 4
What Business are you in?
Theodore Levitt, Marketing Myopia,
Harvard Business Review, July–August 1960
“The railroads did not stop growing because the need for
passenger and freight transportation declined. That grew.
The railroads are in trouble today not because that need
was filled by others (cars, trucks, airplanes, and even
telephones) but because it was not filled by the railroads
themselves. They let others take customers away from
them because they assumed themselves to be in the
railroad business rather than in the transportation
business. The reason they defined their industry
incorrectly was that they were railroad oriented instead of
transportation oriented; they were product oriented
instead of customer oriented....”
Photo CC-BY www.flickr.com/photos/spookman01/4904264919/ 6
Photo CC-BY www.flickr.com/photos/jerryjohn/63351338/ 7
Photo CC-BY www.flickr.com/photos/stiefkind/6454784607/ 8
Photo CC-BY www.flickr.com/photos/torkildr/3462607995/ 9
We are all in the Data business!
• Researchers
– with some exceptions
• Research infrastructure providers
– with no exceptions
• But what about publications?
Journal Literature size in context…
LHC output from 2009-2013 = 100PB
(www.symmetrymagazine.org/article/february-2013/achievement-unlocked-100-petabytes-of-data)
Data-centric view of research data re-use
eResearch infrastructure
requirements
• Create/Capture
– automated with capture of associated
metadata
• Store
– with appropriate levels of preservation
• Describe
– information for discovery, determination of
value, access, re-use
• Identify
– indirection operator to reduce brittleness
eResearch infrastructure
requirements
• Register
– in institutional/national/discipline registries
• Discover
– via general or specialised search interfaces
• Access
– with appropriate levels of control, including
humans
• Exploit
– by re-analysis or combination
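
The "Identify" requirement above describes an identifier as an indirection operator that reduces brittleness. A minimal Python sketch of that idea follows. The `IdentifierResolver` class, the `example:` identifier scheme, and the example.org URLs are all hypothetical; they stand in for a real resolution service and are not taken from the talk.

```python
from dataclasses import dataclass, field


@dataclass
class IdentifierResolver:
    """Toy resolver (hypothetical): maps a persistent identifier to the data's current location."""
    _locations: dict[str, str] = field(default_factory=dict)

    def register(self, pid: str, url: str) -> None:
        # Record where the identified data currently lives.
        self._locations[pid] = url

    def update(self, pid: str, new_url: str) -> None:
        # The data moved: only the resolver entry changes, not the citations.
        if pid not in self._locations:
            raise KeyError(f"unknown identifier: {pid}")
        self._locations[pid] = new_url

    def resolve(self, pid: str) -> str:
        # Look up the current location at the moment of access.
        return self._locations[pid]


resolver = IdentifierResolver()
resolver.register("example:dataset-001", "https://old-store.example.org/data/001")

# A paper cites the identifier, never the raw URL.
citation = "example:dataset-001"

# Later the data store is migrated; the citation text stays untouched.
resolver.update("example:dataset-001", "https://new-store.example.org/data/001")
print(resolver.resolve(citation))
```

Because the citation holds only the identifier, a storage migration needs one update in the resolver rather than a correction in every document that refers to the data.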
Photo CC-BY http://www.flickr.com/photos/vintuitive/6855133329/ 16
I come from a land down under…
AU
• 6 States
• 2 Territories
• 2 islands
• 23M people
NZ
• 2 islands
• 4.5M people
You come from the frozen North…
Nordic Countries
• 5 Countries
• 4 Territories
• So many islands
• 26M people
And yet there are some similarities
• Australia+NZ – 27.5M people
• Nordic countries – 26M people
Australian National Data Service
• An initiative of the Australian Government being
conducted as part of the National Collaborative
Research Infrastructure Strategy ($A24M) and the
Super Science Initiative ($A48M)
• A collaboration between Monash University, the
Australian National University and CSIRO
• 30 staff, funded to mid 2015
• More researchers re-using more data more often
• Data as a first-class object
ANDS enables transformation of:
Data that are:
Unmanaged
Disconnected
Invisible
Single use
To Structured Collections that are:
Managed
Connected
Findable
Reusable
so that Australian researchers can easily publish,
discover, access and use/re-use research data.
Data-centric view of research data re-use
ANDS activities/services
• Plan
– Data management planning tools and resources (N)
• Create/Capture
– 69 Data Capture projects at 23 universities
• Store
– working closely with national Research Data Storage
Infrastructure (N)
• Describe
– 25 institutional Metadata Stores projects
– National Vocabulary Services (N)
ANDS activities/services
• Identify (N)
– DataCite DOIs
• Register (N)
– Registry Interchange Format – Collections and Services
(RIF-CS) – based on ISO 2146:2010
• Discover (N)
– Research Data Australia
ANDS activities/services
• Access
– enforced by underlying data stores
• Exploit
– 25 institutionally-focussed projects to demonstrate value of
combining data
• Advocate (N)
– Be the voice for data
– Work with Government and Research Funders to change
settings in favour of data sharing
Research Data Alliance
• The Research Data Alliance (RDA) is a new international
organization (driven now by EC, US, AU, more soon) forming to
facilitate specific, short-term efforts that accelerate the sharing and
exchange of research data
• Unofficial motto: rough consensus and exchanged data
• Working groups will run over 12-18 months to produce
– Adopted standards
– Deployed infrastructure
– Adopted policy
– Implemented best practice, etc.
• Second Plenary in Washington DC, September 16-18
Slide by Fran Berman
Research Data Alliance
Working Groups and Interest Groups
• Data Type Registries
• Data Foundation and Terminology
• Practical Policy
• PID Information Types
• Metadata Standards WG
• Community Capability Model
• Working Group on Data Citation: Making Data Citable
• Structural Biology
• Defining Urban Data Exchange for Science
• Marine Data Harmonization
• Repository Audit and Certification
• Big Data Analytics
• Metadata Standards Directory Interest Group (MSDIG)
• The Engagement Group
• Legal Interoperability
• Preservation e-Infrastructure
• UPC Code for Data
• Publishing Data
• Data in Context
• Citation of Dynamic Data
• Agricultural Data Interoperability
Slide by Fran Berman
Conclusion
• We are all in the data business
• Researchers need data services from their
infrastructure providers
• A number of services can best be provided
at national or regional level
• Research Data Alliance is working to
develop international solutions for data
interoperability – join us!
Questions?
@atreloar
ands.org.au
rd-alliance.org


Editor's notes

  1. Let me start with a quotation: “The railroads did not stop growing because the need for passenger and freight transportation declined. That grew. The railroads are in trouble today not because that need was filled by others (cars, trucks, airplanes, and even telephones) but because it was not filled by the railroads themselves. They let others take customers away from them because they assumed themselves to be in the railroad business <CLICK>.” (Bergen Railway)
  2. “…rather than in the transportation business [this is Dubai International Terminal]. The reason they defined their industry incorrectly was that they were railroad oriented instead of transportation oriented; they were product oriented instead of customer oriented....”
  3. Talk about the importance of recognising what business you are actually in, as opposed to the business you think you are in.
  4. If the only tool you have is a hammer, then everything looks like a nail (apparently there is no direct Norwegian equivalent, according to my native-speaker informant, who is hopefully my future daughter-in-law). Yes, I work for a data organisation, and so I might be biased, but let’s look hard at some e-Research infrastructure businesses.
  5. Networks exist to move what around? Data, and data derivatives (to a first approximation)
  6. Storage exists to store what? Data, and data derivatives
  7. HPC exists to generate and process what? Data. I could go on: Visualisation? Data. Calculation? Data. Etc.
  8. Of course, it’s possible to take this too far. I look at this and see a data-collection instrument ;-)
  9. Of course, researchers generate publications too, but they need the data in order to be able to do so.
  10. So, if we are all in the data business, what does that mean for researchers? How do we support what they need to do as they create, publish and reuse data? Here is one way of thinking about the functions that need to be supported (based on work by me and Dr Adrian Burton from ANDS). NOTE: This is somewhat idealised, and some of the steps are often done poorly or not at all. Publish = Store + Describe + Register + Identify.
  11. And now, let me provide a more Australian flavour to the talk
  12. Recap
  13. Store – we don’t do storage. Describe – 25 of 40 universities.
  14. Discover – quick demo if time. Advocate – new verb.
  15. Before I close, let me talk briefly about the Research Data Alliance. Nordic involvement in Organising Group? Nomination for Council?
  16. And WGs/IGs of course
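
Note 10 summarises publishing as Store + Describe + Register + Identify. The composition can be sketched in Python as below. This is an illustration only: the `Dataset` class, the `make_step` helper, and the placeholder steps are hypothetical, and running the four functions in the order the note lists them is an assumption, since the note gives a sum rather than a sequence.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Dataset:
    """Hypothetical record of which publishing functions a dataset has passed through."""
    title: str
    completed: list[str] = field(default_factory=list)


def make_step(name: str) -> Callable[[Dataset], Dataset]:
    """Build a placeholder step that only records its own name."""
    def step(dataset: Dataset) -> Dataset:
        dataset.completed.append(name)
        return dataset
    return step


# Placeholder functions; real services would do the actual work here.
store = make_step("store")
describe = make_step("describe")
register = make_step("register")
identify = make_step("identify")


def publish(dataset: Dataset) -> Dataset:
    """Publish = Store + Describe + Register + Identify (order assumed)."""
    for step in (store, describe, register, identify):
        dataset = step(dataset)
    return dataset


published = publish(Dataset(title="Example survey data"))
print(published.completed)
```

Treating each function as a separate step makes it easy to see when one is skipped, which matches the note's caveat that some steps are often done poorly or not at all.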