Prepared for the 2010 graduate seminar "Informetrics and e-Research" (Prof. Han Woo Park) at Yeungnam Univ., S. Korea
The Promise of Data in e-Research: Many Challenges, Multiple Solutions, Diverse Outcomes
Ann Zimmerman, Nathan Bos, Judith S. Olson, and Gary M. Olson
Presented by Kim KyoungEun (river@ynu.ac.kr), 15 March 2010
Introduction
▶ The period of the ‘data deluge’ (information explosion, information overload)
: The need to document, manage, transfer, analyze, and preserve digital data is a significant driver of the development of tools and technologies for e-research.
: It is not yet clear how the data deluge will affect research practice and outcomes.
▶ The purpose of this chapter is to analyze different approaches to data sharing in order to identify important factors that may lead to success.
▶ Despite the importance of shared access to data and of collaboration across disciplines and distance, findings from both past and present studies show that efforts to share data face considerable social, organizational, legal, scientific, and technical challenges.
▶ The most significant obstacle is the individualism of scientists.
: the ‘selfish scientist’
◈ Goble & De Roure (2007): “e-science is, inherently, me-science”
▶ A study by a special committee of the Ecological Society of America found that fields where data sharing is common are characterized by a mixture of:
: scientifically motivated needs, especially the questions that researchers want to answer, and
: socially influenced demands and incentives.
▶ Why are data sharing methods that achieve positive results in one context not effective in another?
▶ Do shared data get used, and if so, how are they used?
Data Sources and Methods
▶ Data: “scientific or technical measurements, values calculated therefrom, and observations or facts that can be represented by numbers, tables, graphs, models, text, or symbols and that are used as a basis for reasoning or further calculation”
▶ Below we briefly describe our data corpus, which includes a meta-analysis of multiple distributed collaborations and a focused view of data sharing in one discipline.
Data Sources and Methods: Science of Collaboratories
▶ The Science of Collaboratories (SOC)
: the name of a five-year project funded by the U.S. National Science Foundation (NSF) to study computer-supported distributed collaborations across many research disciplines.
: The overall goals of the SOC project were to:
(1) perform a comparative analysis of collaboratory projects,
(2) develop theory about this new organizational form, and
(3) offer practical advice to collaboratory participants and to funding agencies about how to design and construct successful collaboratories.
Data Sources and Methods: The Sharing and Reuse of Ecological Data
▶ Interviews were conducted to investigate the experiences of ecologists; data managers were also interviewed in order to obtain another perspective on the sharing and reuse of ecological data.
▶ The significant obstacles to sharing and reuse:
⒜ The data are widely dispersed, heterogeneous, and complex, which makes them difficult to locate and hard to reuse.
⒝ Social factors hinder data sharing, such as issues of ownership and a lack of reward for sharing.
Data Sharing as a Continuum
▶ This chapter draws on cases from the authors' own research and examples from studies by other scholars to show that the outcomes of data sharing approaches occur along a continuum.
: At one end of the continuum are approaches that allow researchers to work as they always have, while the labor necessary to prepare data, make them available, and support their use is carried out by others. In this case, data sharing considerations are not injected into the research process, but are managed by others after the fact.
<-> In contrast, solutions at the other end of the continuum force researchers to consider barriers to sharing, integration, and federation at the outset of data collection and to develop solutions in advance to deal with these issues. In this case, tighter links are formed between the production and the sharing of data.
Many Challenges, Multiple Solutions, Diverse Outcomes
▶ It is hard to share data. There are many reasons for this, and numerous approaches have been devised to overcome these challenges. We describe some of the issues that make data sharing hard, and we analyze methods that have been developed to address them.
Many Challenges, Multiple Solutions, Diverse Outcomes: Aggregating and Integrating Dispersed Data
▶ Bringing widely dispersed data together in a centralized database has several potential advantages. It can help to avoid duplication of effort.
: The aggregation or integration of distributed data, which can be carried out by individuals, small teams of researchers, or a group of individuals with diverse skills, is a common way to create a publicly available data resource.
-> The following case studies illustrate some prototypical strategies designed to bring dispersed data together.
Many Challenges, Multiple Solutions, Diverse Outcomes: Aggregating and Integrating Dispersed Data
1) Curating published data
※ WormBase
: maintained by the International WormBase Consortium.
: Published data are extracted and integrated into the WormBase database and made available to any user via the Internet.
: The data benefit from reuse.
: The work of WormBase curators is made possible by funding from the National Human Genome Research Institute and the British Medical Research Council.
※ FlyBase
: FlyBase, the primary source for molecular and genetic information about the Drosophila (fruit fly) genome, operates and is maintained in a fashion similar to WormBase.
※ The Ecological Society of America (ESA)
: developed a digital archive for appendices and supplements, including raw data associated with papers published in ESA journals.
: Since it relies on voluntary deposits of data, it lacks the comprehensiveness of WormBase and FlyBase.
Many Challenges, Multiple Solutions, Diverse Outcomes: Aggregating and Integrating Dispersed Data
2) Data deposition as a requirement of publication
◈ Bos (2008)
: identified economic incentives.
: The economic method that has been most successful is the requirement that authors provide proof of data contribution as a prerequisite to publication.
: ex) GenBank: GenBank is composed primarily of data associated with a publication, and it does not appear to have motivated researchers to contribute unpublished data.
▶ Why do published data make up the majority of data in many aggregated databases?
① the time and effort required of researchers to fully document unpublished data
② their concerns about being ‘scooped’ by competitors
③ fears that their data will be misused
▶ However, there is considerable demand for unpublished data.
: WormBase and FlyBase are important resources for their research communities, but their value as research tools has not motivated scientists to contribute their unpublished data to these databases.
Many Challenges, Multiple Solutions, Diverse Outcomes: Aggregating and Integrating Dispersed Data
3) Contribution has its privileges
▶ Two of the NIH (National Institutes of Health)-funded biomedical collaboratories that we studied have attempted to motivate researchers to contribute data, particularly unpublished data, by granting special privileges to those who do so.
: Consortium for Functional Glycomics (CFG)
- ‘give in order to get’ strategy
: Biomedical Informatics Research Network (BIRN)
- developed a ‘rollout’ scheme and timeline
- data are released first only to the producer, then to specified others, then to other members of the BIRN consortium, and lastly to the general public.
Many Challenges, Multiple Solutions, Diverse Outcomes: Aggregating and Integrating Dispersed Data
4) Data as a publication
▶ Peer-reviewed publications, particularly journal articles, are the centerpiece of the formal scholarly communication and reward system. Some projects and publications have sought to make more data available by treating the compilation and synthesis of published and unpublished data as publications.
: ex) the partnership between the influential journal Nature and the Alliance for Cellular Signaling's (AfCS) Signaling Gateway
▶ There are other examples of treating data compilations as publications.
Ex 1) The Ecological Society of America (ESA) developed a new form of peer-reviewed publication called Data Papers, which are compilations and syntheses of mostly unpublished datasets.
Ex 2) Cochrane Reviews: authors of Cochrane Reviews are encouraged to locate and incorporate unpublished data into the reviews.
Many Challenges, Multiple Solutions, Diverse Outcomes: Overcoming Semantic and Methodological Differences
▶ Two challenges make it difficult to integrate data:
1) Each discipline and sub-discipline has its own terminology and jargon.
2) Some fields, such as ecology, do not have widely standardized methods of data collection.
▶ The Geosciences Network (GEON)
: GEON is a collaboration between geoscientists and computer scientists. The main goal of GEON is to enable researchers to access, synthesize, and model geoscience data from a wide variety of sources.
Many Challenges, Multiple Solutions, Diverse Outcomes: Overcoming Semantic and Methodological Differences
Standardizing in advance
▶ Another type of solution to the difficulties of sharing data considers impediments in advance of data collection.
: ex) Researchers in one of the multi-institutional medical collaborations we studied spent almost a year developing standardized data collection and management protocols for aggregating data produced by the distributed collaboration.
◈ Karasti, Baker, and Halkola (2006)
: Their findings on cross-site collaboration between data managers and researchers in the U.S. Long Term Ecological Research (LTER) Network are worth noting.
: Karasti and her co-authors identify signs, such as dialog among stakeholders, that may be visible in advance of more dramatic changes in practices and attitudes related to data.
▶ Cyberinfrastructure is an important component in efforts to share large amounts of data. The cases presented here provide evidence that in some instances authority resides in a larger set of actors, such as computer scientists and data managers, and is not dictated primarily by researchers.
Discussion
▶ Visions of e-research emphasize large-scale databases that require massive storage capabilities, robust infrastructure for data management and transfer, and sophisticated tools for visualization and analysis. In this chapter, we have presented several cases to illustrate some of the factors that play a role and to show the continuum of outcomes that can result.
▶ We need to understand more about the complex factors that influence the sharing and reuse of data. Further, it is important to consider the goal when designing approaches to sharing data.

Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)cama23
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxPoojaSen20
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 

Recently uploaded (20)

YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 

Chapter 12

  • 1. Prepared for 2010 Graduate seminar Informetrics and e-research (Prof. Han Woo Park), at Yeungnam Univ. in S. Korea The Promise of Data in e-Research: Many Challenges, Multiple Solutions, Diverse Outcomes Ann Zimmerman, Nathan Bos, Judith S. Olson and Gary M. Olson Presented by Kim KyoungEun river@ynu.ac.kr 15 March 2010
  • 2. Introduction ▶ The period of ‘data deluge’ (information explosion, information overload) : The need to document, manage, transfer, analyze, and preserve digital data is a significant driver of the development of tools and technologies for e-research. : It is not yet clear how the data deluge will affect research practice and outcomes. The purpose of this chapter is to analyze different approaches to data sharing in order to identify important factors that may lead to success.
  • 3. Introduction ▶ Despite the importance of shared access to data and collaboration across disciplines and distance, findings from both past and present studies show that efforts to share data face considerable social, organizational, legal, scientific, and technical challenges. ▶ The most significant obstacle is the individualism of scientists. : the ‘selfish scientist’ ◈ Goble & De Roure (2007) – “e-science is, inherently, me-science”
  • 4.
  • 5. Scientifically motivated needs, especially the questions that researchers want to answer and
  • 6.
  • 7. Why are data sharing methods that achieve positive results in one context not effective in another case?
  • 8.
  • 9. Data Sources and Methods ▶ data : “scientific or technical measurements, values calculated there from, and observations or facts that can be represented by numbers, tables, graphs, models, text, or symbols and that are used as a basis for reasoning or further calculation” ▶ Below we briefly describe our data corpus, which includes a meta-analysis of multiple distributed collaborations and a focused view of data sharing in one discipline.
  • 10. Data Sources and Methods: Science of Collaboratories ▶ The Science of Collaboratories (SOC) : the name of a five-year project funded by the U.S. National Science Foundation (NSF) to study computer-supported distributed collaborations across many research disciplines. : The overall goals of the SOC project were to : (1) perform a comparative analysis of collaboratory projects (2) develop theory about this new organizational form (3) offer practical advice to collaboratory participants and to funding agencies about how to design and construct successful collaboratories.
  • 11. Data Sources and Methods: The Sharing and Reuse of Ecological Data ▶ Interviews to investigate the experiences of ecologists were also conducted with data managers in order to obtain another perspective on the sharing and reuse of ecological data. ▶ the significant obstacles to sharing & reuse : ⒜ The data are widely dispersed, heterogeneous, and complex, which makes them difficult to locate and hard to reuse. ⒝ Social factors hinder data sharing, such as issues of ownership and a lack of reward for sharing.
  • 12. Data Sharing as a continuum ▶ This chapter draws on cases from the authors' own research and examples from studies by other scholars to show that the outcomes of data sharing approaches occur along a continuum. : At one end of the continuum are approaches that allow researchers to work as they always have, and the labor necessary to prepare data, make them available, and support their use is conducted by others. In this case, data sharing considerations are not injected into the research process, but are managed by others after the fact. <-> In contrast, solutions at the other end of the continuum force researchers to consider barriers to sharing, integration, and federation at the outset of data collection and to develop solutions in advance to deal with these issues. In this case, tighter links are formed between the production and the sharing of data.
  • 13. Many Challenges, Multiple solutions, Diverse outcomes ▶ It is hard to share data. There are many reasons for this and numerous approaches have been devised to overcome these challenges. We describe some of the issues that make data sharing hard, and we analyze methods that have been developed to address them.
  • 14. Many Challenges, Multiple solutions, Diverse outcomes: Aggregating and Integrating Dispersed Data ▶ Bringing the widely dispersed data together in a centralized database has several potential advantages. It can help to avoid duplication of effort. : The aggregation or integration of distributed data, which can be carried out by individuals, small teams of researchers, or a group of individuals with diverse skills, is a common way to create a publicly available data resource. -> The following case studies illustrate some prototypical strategies designed to bring dispersed data together.
  • 15. Many Challenges, Multiple solutions, Diverse outcomes: Aggregating and Integrating Dispersed Data 1) Curating published data ※ WormBase : maintained by the International WormBase Consortium. : Published data are extracted and integrated into the WormBase database and made available to any user via the Internet. : the data benefit from reuse. : The work of WormBase curators is made possible by funding from the National Human Genome Research Institute and the British Medical Research Council.
  • 16. Many Challenges, Multiple solutions, Diverse outcomes: Aggregating and Integrating Dispersed Data 1) Curating published data ※ FlyBase : FlyBase, the primary source for molecular and genetic information about the Drosophila (fruit fly) genome, operates and is maintained in a fashion similar to WormBase. ※ The Ecological Society of America (ESA) : developed a digital archive for appendices and supplements, including raw data associated with papers published in ESA journals. : Since it relies on voluntary deposits of data, it lacks the comprehensiveness of WormBase and FlyBase.
  • 17. Many Challenges, Multiple solutions, Diverse outcomes: Aggregating and Integrating Dispersed Data 2) Data deposition as a requirement of publication ◈ Bos (2008) : identified economic incentives. : The economic method that has been most successful is the requirement that authors provide proof of data contribution as a prerequisite to publication. : ex) GenBank – GenBank is comprised primarily of data associated with a publication, and it does not appear to have motivated researchers to contribute unpublished data.
  • 18. Many Challenges, Multiple solutions, Diverse outcomes: Aggregating and Integrating Dispersed Data 2) Data deposition as a requirement of publication ▶ Why do published data comprise the majority of data in many aggregated databases? ① the time and effort required by researchers to fully document unpublished data. ② their concerns about being ‘scooped’ by competitors. ③ fears that their data will be misused. ▶ But there is considerable demand for unpublished data. : WormBase and FlyBase are important resources for their research communities, but their value as a research tool has not motivated scientists to contribute their unpublished data to these databases.
  • 19. Many Challenges, Multiple solutions, Diverse outcomes: Aggregating and Integrating Dispersed Data 3) Contribution has its privileges ▶ Two of the NIH (National Institutes of Health)-funded biomedical collaboratories that we studied have attempted to motivate researchers to contribute data, particularly unpublished data, by granting special privileges to those who do so. : Consortium for Functional Glycomics (CFG) - ‘give in order to get’ strategy : Biomedical Informatics Research Network (BIRN) - development of a ‘rollout’ scheme & timeline - data are released first only to the producer, then to specified others, then to other members of the BIRN consortium, and lastly to the general public.
  • 20. Many Challenges, Multiple solutions, Diverse outcomes: Aggregating and Integrating Dispersed Data 4) Data as a publication ▶ Peer-reviewed publications, particularly journal articles, are the centerpiece of the formal scholarly communication and reward system. Some projects and publications have sought to make more data available by treating the compilation and synthesis of published and unpublished data as publications. : ex) the partnership between the influential journal Nature and the Alliance for Cellular Signaling’s (AfCS) Signaling Gateway
  • 21. Many Challenges, Multiple solutions, Diverse outcomes: Aggregating and Integrating Dispersed Data 4) Data as a publication ▶ There are other examples of treating data compilations as publications. Ex 1) The Ecological Society of America (ESA) developed a new form of peer-reviewed publication called Data Papers, which are compilations and syntheses of mostly unpublished datasets. Ex 2) Cochrane Reviews – Authors of Cochrane Reviews are encouraged to locate and incorporate unpublished data into the reviews.
  • 22. Many Challenges, Multiple solutions, Diverse outcomes: Overcoming Semantic and Methodological Differences ▶ Two challenges make it difficult to integrate data. 1) each discipline and sub-discipline has its own terminology and jargon. 2) some fields, such as ecology, do not have widely standardized methods of data collection. ▶ The Geosciences Network (GEON) : GEON is a collaboration between geoscientists and computer scientists. The main goal of GEON is to enable researchers to access, synthesize, and model geoscience data from a wide variety of sources.
  • 23. Many Challenges, Multiple solutions, Diverse outcomes: Overcoming Semantic and Methodological Differences: Standardizing in advance ▶ Another type of solution to the difficulties of sharing data considers impediments in advance of data collection. : ex) researchers in one of the multi-institutional, medical collaborations we studied spent almost a year developing standardized data collection and management protocols for aggregating data produced by the distributed collaboration.
  • 24. Many Challenges, Multiple solutions, Diverse outcomes: Overcoming Semantic and Methodological Differences: Standardizing in advance ◈ Karasti, Baker, and Halkola (2006) : Findings by Karasti, Baker, and Halkola in regard to cross-site collaboration between data managers and researchers in the U.S. Long Term Ecological Research (LTER) Network are worth noting. : Karasti and her co-authors identify signs, such as dialog among stakeholders, that may be visible in advance of more dramatic changes in practices and attitudes related to data.
  • 25. Many Challenges, Multiple solutions, Diverse outcomes: Overcoming Semantic and Methodological Differences: Standardizing in advance ▶ Cyberinfrastructure is an important component in efforts to share large amounts of data. There is evidence in the cases presented here that there are some instances in which authority resides in a larger set of actors, such as computer scientists and data managers, and is not dictated primarily by researchers.
  • 26. Discussion ▶ Visions of e-research emphasize large-scale databases that require massive storage capabilities, robust infrastructure for data management and transfer, and sophisticated tools for visualization and analysis. In this chapter, we have presented several cases to illustrate some of the factors that play a role and to show the continuum of outcomes that can result. ▶ We need to understand more about the complex factors that influence the sharing and reuse of data. Further, it is important to consider the goal when designing approaches to share data.
  • 27.
  • 28. What degree of control should investigators expect to have over data that they share?
  • 29. Do the benefits outweigh the financial and human costs of sharing?
  • 30. Should all data be subject to the same sharing policies? -> Answers to these and other questions are critical to achieving the promise of data in e-research.
  • 31. Thank you for your attention! Presented by Kim KyoungEun river@ynu.ac.kr