SEAD DataNet and Sustainability Science

1.    NSF DataNet Overview
2.    SEAD Overview
3.    SEAD Active/Social Curation
4.    SEAD Virtual Archive Repository

Robert H. McDonald
Deputy Director/Associate Dean
Data to Insight Center/IU Libraries
SC12 | Salt Lake City, UT
November 12, 2012

http://www.sead-data.net
@SEADdatanet
http://slidesha.re/TAk3ht


 2      SEAD DataNet Home
SEAD TEAMS
Michigan:    Margaret Hedstrom (PI), Marietta Van Buhler, Karen Woollams,
             George Alter (ICPSR), Bryan Beecher (ICPSR)
Indiana:     Beth Plale (Co-PI), Katy Börner, Robert H. McDonald, Robert Light,
             Kavitha Chandrasekar, Stacy Kowalczyk, Inna Kouper, Robert Ping,
             Ryan Cobine
Rensselaer:  James Myers (Co-PI), Ram Prasanna Govind Krishnan, Lindsay Todd
Illinois:    Praveen Kumar (Co-PI), Terry McLaren (NCSA), Rob Kooper (NCSA),
             Luigi Marini (NCSA)




NSF DataNet Program

Motivation:
  “… one of the major challenges of this scientific
  generation: how to develop the new methods,
  management structures and technologies to
  manage the diversity, size, and complexity of
  current and future data sets and data streams.”
Response:
  DataNet creates “a set of exemplar national and
  global data research infrastructure organizations” to
  address this challenge.



Current NSF DataNet Projects

SEAD
   • http://sead-data.net
DataONE
   • http://www.dataone.org
DataNet Federation Consortium
   • http://datafed.org
Terra Populus
   • https://www.pop.umn.edu/terra_pop




SEAD’s Approach
SEAD Partners - http://sead-data.net
• Contribute infrastructure to the DataNet vision that supports data
  access, sharing, reuse, and preservation for the long tail
• Develop a data access and preservation environment that supports the
  research, technical, and economic requirements for data management
  in the long tail
• Enable Active and Social Curation
• Utilize emerging preservation and access infrastructures




Long Tail Data Challenges
[Figure: bytes per day, on a scale running from gigabytes through terabytes and petabytes to exabytes; the long tail consists of many smaller datasets.]
CI for the Long Tail

What is the “long tail” of scientific research and
why does it matter?
    •   Diverse set of researchers, questions, data, and
        methodologies, etc.
    •   Diverse set of requirements for instrumentation, data
        collection, models, analysis, etc.
    •   Little standardization, no common denominator
    •   Most researchers and most research dollars go to
        researchers in the long tail
    •   The long tail is underserved by current CI



Long Tail Example: Sustainability Research
Many dimensions, many coordinate systems, many scales,
many data collection and analysis tools, many formats, a
long-tail of providers and users, …




SEAD 18-Month Pilot Phase
Domain Engagement:
     •    National Center for Earth-Surface Dynamics (NCED), Illinois River
          Basin Observatory
     •    Requirements, Use Cases, Prioritization of Data Types and
          Services
Active and Social Curation
     •    Pilot Active Content Repository, VIVO deployments
     •    Exemplar services for Data Ingest, Discovery, Re-use, Curation
          (Tupelo/Medici)
CI for Long-term Access (Virtual Archive)
     •    Data model, protocol design/development
     •    Pilot Federated Repository infrastructure
Education, Outreach, and Training
     •    Post-doc mentoring
     •    Web site, training materials, meetings, workshops, …
Project Oversight
     •    Management, reporting, committees
     •    Business model development



NCED Collection Access
NCED collections in SEAD-ACR
      •   20 top-level collections, 454K
          files, 2.25M objects, 1.6 TB of data
      •   NCED Repository Interface
      •   Support for hierarchy
      •   Support for collection annotation
      •   View/add NCED/domain specific
          Terms
      •   New Large Server with Virtual
          Machine ACR instances
      •   Ingest tools and procedures
          •   csv2rdf4LOD
      •   Archiving, Citation, DOI
          assignment, …
With an account, NCED users can go from a web page to previews and
downloads (without a cart), add annotations, browse, and search by
text (any fields and content), tags, etc.



SEAD notions of defined Data Phases
The phases of the data lifecycle acknowledge and accommodate the difference between
public data and data a researcher is still working on.
Research Data Phase: the data set is a research data collection, owned by an individual
and under their control.
     •   Data need not be licensed at this time because it is not ready
         for broader release
     •   Data need not have permanent IDs because it is still a work in
         progress
     •   Corresponds to first existence in Active Curation Repository
Published Phase: the owner of the research data collection determines that the data set
is ready for publication
     •   License terms set
     •   Persistent ID
     •   Made available as part of public profile in VIVO
     •   Activated by user-controlled publish event
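
The two phases can be read as a small state model. The following is a minimal sketch under stated assumptions, not SEAD's implementation: the class name `DataSet`, its fields, and the `publish` method are hypothetical, chosen only to mirror the rules listed above (no license or persistent ID is required in the research phase; both are set by a user-controlled publish event).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    RESEARCH = "research"
    PUBLISHED = "published"


@dataclass
class DataSet:
    """Hypothetical model of a data set moving through the two phases."""
    title: str
    owner: str
    phase: Phase = Phase.RESEARCH
    license_terms: Optional[str] = None  # not required in the research phase
    persistent_id: Optional[str] = None  # not required while work is in progress

    def publish(self, license_terms: str, persistent_id: str) -> None:
        """User-controlled publish event: set license and ID, then switch phase."""
        if self.phase is Phase.PUBLISHED:
            raise ValueError("data set is already published")
        if not license_terms or not persistent_id:
            raise ValueError("publishing requires license terms and a persistent ID")
        self.license_terms = license_terms
        self.persistent_id = persistent_id
        self.phase = Phase.PUBLISHED


# Usage: the identifier below is a placeholder, not a real DOI.
example = DataSet(title="Example collection", owner="researcher")
example.publish(license_terms="CC-BY", persistent_id="doi:10.0000/example")
```

Keeping the license and identifier optional until `publish` is called reflects the point that research-phase data stays under the owner's control without those commitments.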



SEAD Active/Social Curation Repository




ACR Bulk Ingest Process
[Diagram labels, in reading order]
Inputs: Data and Metadata
• Metadata → TWC: csv2rdf4lod → .ttl output file
• Configuration:
  • Headers to Standard Vocabularies
  • Content Mapping to identifiers
  • Additional Inference possible
• DROID Analysis: global ID, filepath, file
• ACR Ingest: incremental ingest, restart, verify
• Extractors/Preview: On/Off
• SEAD ACR Instance
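
The "headers to standard vocabularies" configuration step in this diagram can be illustrated with a short sketch. This is not csv2rdf4lod itself and does not reproduce its configuration format; it is a standard-library illustration under stated assumptions. The Dublin Core terms are real vocabulary terms, but the column names, the mapping, the base URI, and the function names are all hypothetical.

```python
import csv
import io

# Hypothetical mapping from CSV headers to vocabulary predicates.
HEADER_TO_PREDICATE = {
    "title": "dcterms:title",
    "creator": "dcterms:creator",
}


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes, and newlines for a Turtle string."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n"))


def csv_to_turtle(csv_text: str, id_column: str, base: str) -> str:
    """Emit one Turtle statement per mapped cell; unmapped columns are skipped."""
    lines = ["@prefix dcterms: <http://purl.org/dc/terms/> .", ""]
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Content mapping to identifiers: the ID column becomes the subject URI.
        subject = f"<{base}{row[id_column]}>"
        for header, predicate in HEADER_TO_PREDICATE.items():
            value = row.get(header)
            if value:
                lines.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return "\n".join(lines)


sample = "id,title,creator\n42,Basin survey,A. Researcher\n"
print(csv_to_turtle(sample, "id", "http://example.org/dataset/"))
```

Keeping the mapping in a table separate from the conversion code is the design point: new collections can be ingested by editing the mapping rather than the code.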


SEAD/NCED Data Social Network




NCED Data Social Network in SEAD-VIVO
      Mary Power             NCED PI and Professor University of California
      William Dietrich       NCED PI and Professor University of California
      Collin Bode            NCED Data Technician




NCED Social Network Connections Based on Data Authorship

Angelo Basic GIS Coverage Data Set




SEAD Data Set Publishing Workflow



1. NCED Data Set Ingested to ACR
   • Data content used within ACR
   • Researcher Profile Established in VIVO
2. NCED Data Set Ingested to VA
   • Data Set ready to publish
3. NCED Data Set Deposited with IR
   • DataCite minted DOI attached to finalized Data Set
4. NCED Data Set Published to VIVO
   • DOI Resolution to designated IR
Published NCED Data Set in IR (IU ScholarWorks)




SEAD Virtual Archive




Virtual Archive Features
Usability consistent with research user expectations
   • Additional metadata fields for scientific datasets
   • Ability to ingest data with data previewing
Repository tracking: tracking member Institutional Repositories
(IRs) and their stored content
   • Not just a link to the repository, but an extensive cataloging tool
       (metadata and other additional information)
   • Allows users to search for data in a particular IR or across
       all IRs
Low-cost replication: cloud-based storage for reliability
   • Proof of concept uses Amazon S3 to maintain a copy of
       files and collections. Amazon Glacier is low-cost, secure,
       and durable, and optimized for cold storage. Other
       solutions exist.
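
The repository-tracking feature above (search in one IR or across all IRs) can be sketched as a filter over a shared catalog. This is an illustrative sketch only, not the Virtual Archive's code: the `CatalogEntry` fields, the `search` function, and the sample entries are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record for content stored in a member IR."""
    repository: str            # name of the member IR holding the content
    title: str
    keywords: Tuple[str, ...]


def search(catalog: List[CatalogEntry], term: str,
           repository: Optional[str] = None) -> List[CatalogEntry]:
    """Match a term against titles and keywords.

    Searches a single IR when `repository` is given, and all IRs otherwise.
    """
    needle = term.lower()
    hits = []
    for entry in catalog:
        if repository is not None and entry.repository != repository:
            continue
        haystack = [entry.title.lower()] + [k.lower() for k in entry.keywords]
        if any(needle in text for text in haystack):
            hits.append(entry)
    return hits


# Sample entries are invented for illustration.
catalog = [
    CatalogEntry("IR-A", "River basin GIS coverage", ("gis", "hydrology")),
    CatalogEntry("IR-B", "Sediment transport runs", ("hydrology",)),
]
everywhere = search(catalog, "hydrology")            # across all IRs
only_a = search(catalog, "hydrology", "IR-A")        # within one IR
```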

Component Interactions: Virtual Archive and ACR

1. Data Set Uploaded to ACR
2. Data Set Ingested to Virtual Archive
3. Data Set Deposited with Institutional Repository
4. Data Set Published to VIVO




ACR – VA Interaction Protocol
[Sequence diagram]
Participants: Researcher (ACR UI), Curator (VA UI), Active Curation Repository, Virtual Archive
Endpoints shown: SWORD Endpoint, Query Endpoint

Messages, in time order:
1. Mark Data For Publication (and Accept Licensing Terms)
2. Curator Request for Preview
3. (SPARQL) Query Metadata
4. Return Metadata
5. Curator Preview
6. Ingest Data To VA
7. User Queries VA for DOI (Query DOI Metadata)
8. Metadata update and View
Virtual Archive Workflow
[Workflow diagram]
Main flow:
1. Accept Repository Agreement in ACR
2. Preview Data Ready to Publish
3. Upload Data to VA
4. Run Virus Checking
5. File Characterization
6. Deposit to IR
7. Mint DOI
8. Index Metadata

Additional steps shown: Version Data, IR Matchmaker, Large Dataset Policy Decision, Index Scientific Metadata

To be completed by March 2013
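
The main steps named on this slide (upload to VA, virus checking, file characterization, deposit to IR, DOI minting, metadata indexing) form an ordered pipeline. The sketch below shows one way such a pipeline could be driven; it is an assumption-laden illustration, not the Virtual Archive's implementation. The runner, the step functions, and the `context` dictionary are hypothetical, and the steps are placeholders that do no real work.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Dict], None]]


def run_workflow(steps: List[Step], context: Dict) -> List[str]:
    """Run steps in order; return completed step names, stopping at the first failure."""
    completed = []
    for name, action in steps:
        try:
            action(context)
        except Exception as error:
            # Record where and why the workflow stopped so it can be inspected.
            context["failed_step"] = name
            context["error"] = str(error)
            break
        completed.append(name)
    return completed


def no_op(context: Dict) -> None:
    """Placeholder for steps this sketch does not model."""


def virus_check(context: Dict) -> None:
    if context.get("infected"):
        raise RuntimeError("virus check failed")


def mint_doi(context: Dict) -> None:
    context["doi"] = "doi:10.0000/placeholder"  # placeholder, not a real DOI


STEPS: List[Step] = [
    ("upload data to VA", no_op),
    ("run virus checking", virus_check),
    ("file characterization", no_op),
    ("deposit to IR", no_op),
    ("mint DOI", mint_doi),
    ("index metadata", no_op),
]

clean_run = run_workflow(STEPS, {})
```

Stopping at the first failed step means a data set that fails virus checking is never deposited and never receives a DOI.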



Key Questions for SEAD Prototype

• What could SEAD capture when?
• How can SEAD provide direct value
  to data producers, users, and
  curators?
• How can web 2.0/3.0 and social
  computing lower barriers and
  reduce/realign costs?


Towards A Shared Data Future
[Figure: layered data infrastructure]
• Data Generators and Users: user functionalities, data capture & transfer, virtual research environments
• Community Support Services: data discovery & navigation, workflow generation, annotation, interpretability
• Common Data Services: persistent storage, identification, authenticity (provenance), workflow execution, data mining
Side labels: Data Curation, Trust

Source: EU HLEG Report on Data Deluge: Riding the Wave, pg 31, 2010
Data Interoperability and SEAD
•    NSF OCI: DataNet and INTEROP now DIBBs
•    EUDAT
•    Research Data Alliance
•    IETF Research Data Identifier BOF
•    NCED Data Network




Acknowledgements
SEAD is funded by the National Science Foundation
under cooperative agreement #OCI0940824


• For more on SEAD, go to http://sead-data.net
• Follow us on Twitter: @SEADdatanet

Contenu connexe

Tendances

Introduction to digital curation
Introduction to digital curationIntroduction to digital curation
Introduction to digital curationMichael Day
 
Digital Curation in Libraries: An innovative way of content preservation and...
Digital Curation in Libraries:  An innovative way of content preservation and...Digital Curation in Libraries:  An innovative way of content preservation and...
Digital Curation in Libraries: An innovative way of content preservation and...Bhojaraju Gunjal
 
Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...GarethKnight
 
University of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersUniversity of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersJez Cope
 
Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...librarianrafia
 
Challenges in setting up an RDM Support Service
Challenges in setting up an RDM Support ServiceChallenges in setting up an RDM Support Service
Challenges in setting up an RDM Support ServiceGarethKnight
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Jenn Riley
 
Dataset Citation and Identification
Dataset Citation and IdentificationDataset Citation and Identification
Dataset Citation and Identificationguest453b14
 
EMBL Australian Bioinformatics Resource AHM - Data Commons
EMBL Australian Bioinformatics Resource AHM   - Data CommonsEMBL Australian Bioinformatics Resource AHM   - Data Commons
EMBL Australian Bioinformatics Resource AHM - Data CommonsVivien Bonazzi
 
Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Vivien Bonazzi
 

Tendances (19)

OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011
OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011
OpenData Public Research, University of Toronto, Open Access Week, 25/11/2011
 
Or 2013-abrams-sharing-data-rich-research
Or 2013-abrams-sharing-data-rich-researchOr 2013-abrams-sharing-data-rich-research
Or 2013-abrams-sharing-data-rich-research
 
Digital Curation Technology: JHU Summit, October 2015
Digital Curation Technology: JHU Summit, October 2015Digital Curation Technology: JHU Summit, October 2015
Digital Curation Technology: JHU Summit, October 2015
 
Introduction to digital curation
Introduction to digital curationIntroduction to digital curation
Introduction to digital curation
 
Baker - Evolution of Data Products and Designated Audiences
Baker - Evolution of Data Products and Designated AudiencesBaker - Evolution of Data Products and Designated Audiences
Baker - Evolution of Data Products and Designated Audiences
 
Digital Curation in Libraries: An innovative way of content preservation and...
Digital Curation in Libraries:  An innovative way of content preservation and...Digital Curation in Libraries:  An innovative way of content preservation and...
Digital Curation in Libraries: An innovative way of content preservation and...
 
Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...Research Data Management: What is it and why is the Library & Archives Servic...
Research Data Management: What is it and why is the Library & Archives Servic...
 
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
 
Hawaii Pacific GIS Conference 2012: GIS in Education: K-12 and University - H...
Hawaii Pacific GIS Conference 2012: GIS in Education: K-12 and University - H...Hawaii Pacific GIS Conference 2012: GIS in Education: K-12 and University - H...
Hawaii Pacific GIS Conference 2012: GIS in Education: K-12 and University - H...
 
Whither Small Data?
Whither Small Data?Whither Small Data?
Whither Small Data?
 
University of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchersUniversity of Bath Research Data Management training for researchers
University of Bath Research Data Management training for researchers
 
NISO Forum, Denver, Sept. 24, 2012: Data Equivalence
NISO Forum, Denver, Sept. 24, 2012: Data EquivalenceNISO Forum, Denver, Sept. 24, 2012: Data Equivalence
NISO Forum, Denver, Sept. 24, 2012: Data Equivalence
 
Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...
 
Challenges in setting up an RDM Support Service
Challenges in setting up an RDM Support ServiceChallenges in setting up an RDM Support Service
Challenges in setting up an RDM Support Service
 
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
Tools and Techniques for Creating, Maintaining, and Distributing Shareable Me...
 
Dataset Citation and Identification
Dataset Citation and IdentificationDataset Citation and Identification
Dataset Citation and Identification
 
EMBL Australian Bioinformatics Resource AHM - Data Commons
EMBL Australian Bioinformatics Resource AHM   - Data CommonsEMBL Australian Bioinformatics Resource AHM   - Data Commons
EMBL Australian Bioinformatics Resource AHM - Data Commons
 
Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2
 
RDA Update
RDA UpdateRDA Update
RDA Update
 

Similaire à SEAD Datanet and Sustainability Science

SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...aceas13tern
 
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 FinalLibby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 Finala.carusi
 
Big data hadoop-no sql and graph db-final
Big data hadoop-no sql and graph db-finalBig data hadoop-no sql and graph db-final
Big data hadoop-no sql and graph db-finalramazan fırın
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing dataWorld Agroforestry (ICRAF)
 
Emerging domain agnostic functionalities on the handle-centered networks
Emerging domain agnostic functionalities on the handle-centered networksEmerging domain agnostic functionalities on the handle-centered networks
Emerging domain agnostic functionalities on the handle-centered networksNational Institute of Informatics
 
Cni research data_oxford_horstmann_jefferies
Cni research data_oxford_horstmann_jefferiesCni research data_oxford_horstmann_jefferies
Cni research data_oxford_horstmann_jefferiesBDLSS
 
Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Jian Qin
 
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Denodo
 
Graham Pryor
Graham PryorGraham Pryor
Graham PryorEduserv
 
Managing the research life cycle
Managing the research life cycleManaging the research life cycle
Managing the research life cycleSherry Lake
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeeling Cheung
 
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)
Logical Data Lakes: From Single Purpose to Multipurpose Data Lakes (APAC)Denodo
 
DT Company Overview January 2013
DT Company Overview January 2013DT Company Overview January 2013
DT Company Overview January 2013DataTactics
 
Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATResearch Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATTony Ross-Hellauer
 
Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATResearch Data Management: An Introductory Webinar from OpenAIRE and EUDAT
Research Data Management: An Introductory Webinar from OpenAIRE and EUDATOpenAIRE
 
Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu |
Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu | Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu |
Research Data Management Introduction: EUDAT/Open AIRE Webinar| www.eudat.eu | EUDAT
 
Creating a sustainable business model for a digital repository: the Dryad exp...
Creating a sustainable business model for a digital repository: the Dryad exp...Creating a sustainable business model for a digital repository: the Dryad exp...
Creating a sustainable business model for a digital repository: the Dryad exp...ASIS&T
 
Collaborate, Automate, Prepare, Prioritize: Creating Metadata for Legacy Rese...
Collaborate, Automate, Prepare, Prioritize: Creating Metadata for Legacy Rese...Collaborate, Automate, Prepare, Prioritize: Creating Metadata for Legacy Rese...
Collaborate, Automate, Prepare, Prioritize: Creating Metadata for Legacy Rese...Jennifer Liss
 
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...
dkNET Office Hours - "Are You Ready for 2023: New NIH Data Management and Sha...dkNET
 

Similaire à SEAD Datanet and Sustainability Science (20)

SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
 
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 FinalLibby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
Libby Bishop, Ethics Of Data Sharing Ncess Jun 09 Final
 
Big data hadoop-no sql and graph db-final
Big data hadoop-no sql and graph db-finalBig data hadoop-no sql and graph db-final
Big data hadoop-no sql and graph db-final
 
Research methods group accelarating impact by sharing data
Research methods group  accelarating impact by sharing dataResearch methods group  accelarating impact by sharing data
Research methods group accelarating impact by sharing data
 
Emerging domain agnostic functionalities on the handle-centered networks
Emerging domain agnostic functionalities on the handle-centered networksEmerging domain agnostic functionalities on the handle-centered networks
Emerging domain agnostic functionalities on the handle-centered networks
 
Cni research data_oxford_horstmann_jefferies
Cni research data_oxford_horstmann_jefferiesCni research data_oxford_horstmann_jefferies
Cni research data_oxford_horstmann_jefferies
 
Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...
 
Bertenthal
BertenthalBertenthal
Bertenthal
 
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
 
Graham Pryor
Graham PryorGraham Pryor
Graham Pryor
 
Managing the research life cycle
Managing the research life cycleManaging the research life cycle
Managing the research life cycle
 
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data TorrentSeagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
Seagate: Sensor Overload! Taming The Raging Manufacturing Big Data Torrent
 
SEAD Datanet and Sustainability Science

  • 1. SEAD DataNet and Sustainability Science
      Outline: 1. NSF DataNet Overview; 2. SEAD Overview; 3. SEAD Active/Social Curation; 4. SEAD Virtual Archive Repository
      Robert H. McDonald, Deputy Director/Associate Dean, Data to Insight Center/IU Libraries
      SC12 | Salt Lake City, UT | November 12, 2012
      http://www.sead-data.net | @SEADdatanet
  • 2. SEAD DataNet and Sustainability Science http://www.sead-data.net http://slidesha.re/TAk3ht @SEADdatanet 2 SEAD DataNet Home
  • 3. SEAD Teams
      Michigan: Margaret Hedstrom (PI), Marietta Van Buhler, Karen Woollams, George Alter (ICPSR), Bryan Beecher (ICPSR)
      Indiana: Beth Plale (Co-PI), Katy Börner, Robert H. McDonald, Robert Light, Kavitha Chandrasekar, Stacy Kowalczyk, Inna Kouper, Robert Ping, Ryan Cobine
      Rensselaer: James Myers (Co-PI), Ram Prasanna Govind Krishnan, Lindsay Todd
      Illinois: Praveen Kumar (Co-PI), Terry McLaren (NCSA), Rob Kooper (NCSA), Luigi Marini (NCSA)
  • 4. NSF DataNet Program
      Motivation: “… one of the major challenges of this scientific generation: how to develop the new methods, management structures and technologies to manage the diversity, size, and complexity of current and future data sets and data streams.”
      Response: DataNet creates “a set of exemplar national and global data research infrastructure organizations” to address this challenge.
  • 5. Current NSF DataNet Projects
      • SEAD: http://sead-data.net
      • DataONE: http://www.dataone.org
      • DataNet Federation Consortium: http://datafed.org
      • Terra Populus: https://www.pop.umn.edu/terra_pop
  • 6. SEAD’s Approach
      SEAD Partners - http://sead-data.net
      • Contribute infrastructure to the DataNet vision that supports data access, sharing, reuse, and preservation for the long tail
      • Develop a data access and preservation environment that supports the research, technical, and economic requirements for data management in the long tail
      • Enable Active and Social Curation
      • Utilize emerging preservation and access infrastructures
  • 7. Long Tail Data Challenges
      [Figure: scale labeled Giga Bytes, Tera Bytes, Peta Bytes, and Exa Bytes (bytes per day), annotated “Many smaller datasets…”]
  • 8. CI for the Long Tail
      What is the “long tail” of scientific research and why does it matter?
      • Diverse set of researchers, questions, data, methodologies, etc.
      • Diverse set of requirements for instrumentation, data collection, models, analysis, etc.
      • Little standardization, no common denominator
      • Most researchers work in the long tail, and most research dollars go there
      • The long tail is underserved by current cyberinfrastructure (CI)
  • 9. Long Tail Example: Sustainability Research
      Many dimensions, many coordinate systems, many scales, many data collection and analysis tools, many formats, a long tail of providers and users, …
  • 10. SEAD 18-Month Pilot Phase
      Domain Engagement:
        • National Center for Earth-Surface Dynamics (NCED), Illinois River Basin Observatory
        • Requirements, Use Cases, Prioritization of Data Types and Services
      Active and Social Curation:
        • Pilot Active Content Repository, VIVO deployments
        • Exemplar services for Data Ingest, Discovery, Re-use, Curation (Tupelo/Medici)
      CI for Long-term Access (Virtual Archive):
        • Data model, protocol design/development
        • Pilot Federated Repository infrastructure
      Education, Outreach, and Training:
        • Post-doc mentoring
        • Web site, training materials, meetings, workshops, …
      Project Oversight:
        • Management, reporting, committees
        • Business model development
  • 11. NCED Collection Access
      NCED collections in SEAD-ACR (20 top-level collections, 454K files, 2.25M objects, 1.6 TB data)
      • NCED Repository Interface
      • Support for hierarchy
      • Support for collection annotation
      • View/add NCED/domain-specific terms
      • New large server with virtual machine ACR instances
      • Ingest tools and procedures
      • csv2rdf4lod
      • Archiving, Citation, DOI assignment, …
      With an account, NCED users can go from a web page to previews and downloads (without a cart), add annotations, browse, and search by text (any fields and content), tags, etc.
  • 12. SEAD Notions of Defined Data Phases
      The phases of the data lifecycle acknowledge and accommodate the difference between public data and data a researcher is still working on.
      Research Data Phase: the data set is a research data collection, owned by an individual and under their control.
        • Data need not be licensed at this time because it is not ready for broader release
        • Data need not have permanent IDs because it is still a work in progress
        • Corresponds to first existence in the Active Curation Repository
      Published Phase: the owner of the research data collection determines that the data set is ready for publication.
        • License terms set
        • Persistent ID
        • Made available as part of public profile in VIVO
        • Activated by user-controlled publish event
  • 15. ACR Bulk Ingest Process
      Configuration:
        • Headers to Standard Vocabularies
        • Content Mapping to identifiers
        • Additional Inference possible
      [Diagram labels: Data; Metadata; TWC: csv2rdf4lod; .ttl output file; DROID Analysis; global ID, filepath, file; Extractors/Preview On/Off; ACR Ingest; Incremental ingest, restart, verify; SEAD ACR Instance]
  • 27. SEAD/NCED Data Social Network
  • 28. NCED Data Social Network in SEAD-VIVO
      • Mary Power: NCED PI and Professor, University of California
      • William Dietrich: NCED PI and Professor, University of California
      • Collin Bode: NCED Data Technician
      NCED Social Network Connections Based on Data Authorship
  • 29. Angelo Basic GIS Coverage Data Set
  • 30. SEAD Data Set Publishing Workflow
      • NCED Data Set Ingested to ACR: data content used within ACR; Data Set ready to publish
      • NCED Data Set Ingested to VA: DataCite-minted DOI attached to finalized Data Set
      • NCED Data Set Published to VIVO: Researcher Profile established in VIVO
      • NCED Data Set Deposited with IR: DOI resolution to designated IR
  • 31. Published NCED Data Set in IR (IU ScholarWorks)
  • 32. SEAD Virtual Archive
  • 33. Virtual Archive Features
      Usability consistent with research user expectations
        • Additional metadata fields for scientific datasets
        • Ability to preview data when ingesting it
      Repository tracking: tracking member Institutional Repositories (IRs) and their stored content
        • Not just a link to the repository, but an extensive cataloging tool (metadata and other additional information)
        • Allows users to search for data in a particular IR or across all IRs
      Low-cost replication: cloud-based storage for reliability
        • Proof of concept uses Amazon S3 to maintain a copy of files and collections. Amazon Glacier is low-cost, secure, and durable, and is optimized for cold storage. Other solutions exist.
  • 34. Virtual Archive Features
  • 35. Virtual Archive Features
  • 36. Component Interactions: Virtual Archive and ACR
      [Diagram boxes: Data Set Uploaded to ACR; Data Set Ingested to Virtual Archive; Data Set Published to VIVO; Data Set Deposited with Institutional Repository]
  • 37. ACR – VA Interaction Protocol
      [Sequence diagram with a time axis. Participants: Researcher (ACR UI); Curator (VA UI); Active Curation Repository; Virtual Archive. Endpoint labels: SPARQL Endpoint; SWORD; Metadata Endpoint.]
      Messages, in slide order:
        • Mark Data For Publication (and Accept Licensing Terms)
        • Curator Request for Preview
        • Query Metadata / Return Metadata
        • Curator Preview
        • Ingest Data To VA
        • User Queries VA for DOI
        • Query Metadata update and View DOI
  • 38. Virtual Archive Workflow
      [Workflow diagram; step labels as recovered from the slide]
        • Accept Repository Agreement in ACR
        • Preview Data
        • Upload Data
        • Run Virus Checking
        • File Characterization
        • Index Metadata to VA
        • Mint DOI
        • Deposit to IR
        • Ready to Publish
      To be completed by March 2013:
        • Large Dataset
        • Index Scientific Metadata
        • Version Data
        • IR Matchmaker
        • Policy Decision
  • 39. Key Questions for SEAD Prototype
      • What could SEAD capture, and when?
      • How can SEAD provide direct value to data producers, users, and curators?
      • How can Web 2.0/3.0 and social computing lower barriers and reduce/realign costs?
  • 40. Towards A Shared Data Future
      [Layered diagram, with “Data Curation” and “Trust” as cross-cutting labels]
        • Data Generators / Users: user functionalities, data capture & transfer, virtual research environments
        • Community Support Services: data discovery & navigation, workflow generation, annotation, interpretability
        • Common Data Services: persistent storage, identification, authenticity (provenance), workflow execution, data mining
      Source: EU HLEG Report on Data Deluge: Riding the Wave, p. 31, 2010
  • 41. Data Interoperability and SEAD
      • NSF OCI: DataNet and INTEROP now DIBBs
      • EUDAT
      • Research Data Alliance
      • IETF Research Data Identifier BOF
      • NCED Data Network
  • 42. Acknowledgements
      SEAD is funded by the National Science Foundation under cooperative agreement #OCI0940824.
      • For more on SEAD, go to http://sead-data.net
      • Follow us on Twitter: @SEADdatanet
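
The two data phases described on slide 12 can be read as a small state model: a collection starts in the research phase with no license or persistent ID, and an owner-triggered publish event sets both. The sketch below is only an illustration of that reading, not SEAD code; the class, field, and method names and the example license and DOI values are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    RESEARCH = "research"    # work in progress, under the owner's control
    PUBLISHED = "published"  # released by the owner


@dataclass
class DataCollection:
    """Hypothetical model of a research data collection (not a SEAD class)."""
    owner: str
    title: str
    phase: Phase = Phase.RESEARCH
    license: Optional[str] = None        # not required in the research phase
    persistent_id: Optional[str] = None  # not required in the research phase

    def publish(self, actor: str, license: str, persistent_id: str) -> None:
        """User-controlled publish event: only the owner may trigger it."""
        if actor != self.owner:
            raise PermissionError("only the owner can publish this collection")
        if self.phase is Phase.PUBLISHED:
            raise ValueError("collection is already published")
        if not license or not persistent_id:
            raise ValueError("a license and a persistent ID are required to publish")
        self.license = license
        self.persistent_id = persistent_id
        self.phase = Phase.PUBLISHED


# Placeholder values for illustration only.
collection = DataCollection(owner="researcher-a", title="Example survey data")
collection.publish("researcher-a", license="CC-BY-4.0",
                   persistent_id="doi:10.0000/example")
print(collection.phase.name)
```

Keeping the license and persistent ID optional until `publish` is called mirrors the slide's point that neither is needed while the data is still a work in progress.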

Editor's notes

  1. A collection of heterogeneous files. Users can tag and comment on the entire collection, and individually tag and comment on the objects in it. Note: extraction services and previewers are all driven by the file MIME type. Extraction services are customizable and are designed to automate the creation of derived data products from the file being uploaded. Examples follow…
  2. Lidar data saved as .png. The image extraction service does the following: creates the thumbnail and preview image; creates an image pyramid of the image (zoom/pan large images without downloading the entire image, via the SeaDragon web app); extracts all header information from the image file, including Exif, GPS, Interoperability, etc. Extracted data is viewed by clicking on the “Extracted Information” section.
  3. A data set saved as a simple ASCII text file. Users can preview the first 80 lines of the text file.
  4. Preview the contents of .csv files
  5. Simple map image. User-defined information. Image is part of multiple collections. Image is tagged.
  6. 3 images (3 clicks). Standard Medici info. Scroll down to show location and annotation. This image file also contains geolocation data, which becomes visible in “Location”. Geolocation can be extracted from the image Exif data, or authors can add a geolocation to any file in the repository. Note the creator tag and VIVO reference.
  7. TIFF support: a relatively large 71 MB file. Clicks: click Zoom to enable SeaDragon to explore the details of the file via zoom and pan with the mouse. Click the lower-right icon to enable full screen. Use the + or – key (or the mouse wheel) to zoom; click the image and drag to pan. Click the lower-right icon to return to the embedded window in Medici.
  8. Image file that contains GPS data which is extracted by Medici as part of the upload process.
  9. MPEG file uploads: the extraction service creates a Flash version of the file for preview.
  10. PDF files: the extraction service generates an image per page of the file, in this case a slide set from a presentation. Click ‘Pages’ to enable the slide set mode and click the left or right arrows to navigate the pages. 2 images: click to advance the slide.
  11. .shp files: the components of a shapefile are uploaded to Medici as a zip. Medici saves the zip blob, and the extraction service registers the contents of the .shp file with GeoServer. OpenStreetMap displays the contents of the zip. Layers are on by default but can be turned off by clicking the ‘show’ button. Opacity of layers can be varied using the opacity scale. (WIP) We plan to embed OpenStreetMap in Medici as a previewer for .shp and .kml.
  12. All layers off except Illinois Flood Zone map. Map zoomed into the Champaign region of interest.
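
Note 1 says that extraction services and previewers are driven by the file MIME type. One way to picture that is a registry keyed by MIME type, as in the sketch below. This is an assumption-laden illustration rather than Medici's actual mechanism: the registry, the decorator, and the extractor functions are hypothetical, and the extractors return descriptive strings instead of producing real thumbnails or page images.

```python
import mimetypes
from typing import Callable, Dict, List

Extractor = Callable[[str], str]

# Hypothetical registry: MIME type -> extraction steps to run on upload.
EXTRACTORS: Dict[str, List[Extractor]] = {}


def register(mime_type: str) -> Callable[[Extractor], Extractor]:
    """Decorator that attaches an extractor to a MIME type."""
    def decorator(func: Extractor) -> Extractor:
        EXTRACTORS.setdefault(mime_type, []).append(func)
        return func
    return decorator


@register("image/png")
def make_thumbnail(path: str) -> str:
    # Stand-in for real thumbnail generation.
    return f"thumbnail for {path}"


@register("image/png")
def read_image_headers(path: str) -> str:
    # Stand-in for reading Exif/GPS header fields.
    return f"header metadata for {path}"


@register("application/pdf")
def render_pages(path: str) -> str:
    # Stand-in for rendering one image per page.
    return f"page images for {path}"


def run_extractors(path: str) -> List[str]:
    """Guess the MIME type from the file name and run the matching extractors."""
    mime_type, _encoding = mimetypes.guess_type(path)
    return [step(path) for step in EXTRACTORS.get(mime_type, [])]


print(run_extractors("lidar.png"))
print(run_extractors("slides.pdf"))
```

Because the registry is plain data, adding support for a new file type means registering another function, which fits the note's description of extraction services as customizable.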