SlideShare une entreprise Scribd logo
1  sur  50
ChemSpider – disseminating data
and enabling an abundance of
chemistry platforms
Antony Williams, Valery Tkachenko, Ken Karapetyan, Alexey
Pshenichnov, Dmitry Ivanov, Colin Batchelor, Jon Steele
and David Sharpe
ACS New Orleans April 2013
ChemSpider
• >28.5 million unique chemicals from >400
data sources
• Focus on improving data quality, enhancing
functionality, integrating and enabling
Some usage statistics
• ca. 200 visitors at any one time, ~30,000 visits per day
• Mar 4-Apr 3, 2013
– Visits = 731,656
– Unique Visitors = 527,008
• Independent servers to support other projects
Access ChemSpider
• APIs
– Programmatic access used by Mobile Apps, Funded
Consortia projects, many Academic groups
• Widgets
– UI components for embedding in other websites
• Data
– Data access, downloads, reuse, licensing
Supporting the Semantic Web
rdf.chemspider.com/CSID
ChemSpider Resources for Chemistry
From this…..
…..to this
Simplified interface
ChemSpider Audiences
Substance Pages
It is so difficult to navigate…
What’s the
structure?
What’s the
structure?
Are they in
our file?
Are they in
our file?
What’s
similar?
What’s
similar?
What’s the
target?
What’s the
target?Pharmacology
data?
Pharmacology
data?
Known
Pathways?
Known
Pathways?
Working On
Now?
Working On
Now?Connections to
disease?
Connections to
disease?
Expressed in
right cell type?
Expressed in
right cell type?
Competitors?Competitors?
IP?IP?
• 3-year knowledge management IMI project
• Integrating chemistry and biology data and delivering
using semantic web technologies
• Open source code, open data and open standards
• Academics, Pharma companies, Publishers….
ChemSpider Contributions
• The host of the chemistry services
– Supplier of “standardized” chemical data files
– Chemistry searching (structure, substructure etc)
– Provider of data in RDF format
– Curator and data quality checking
• Now building the Open PHACTS chemical
registration system
ChemSpider Contributions
• Supplier of chemistry UI components
• “Quality Police” for data checking
• Chemical Validation and Standardization Platform
• Nanopublications from RSC publications
• FP7 Initiative. PharmaSea: increasing value and flow in
the marine biodiscovery pipeline
PharmaSea
• Dereplication via ChemSpider
• Segregation of natural products datasets
• Analytical data algorithms & integration
– Mass spec searching – predicted fragmentation
– NMR feature searching – NMR prediction
– Computer-assisted structure elucidation
Integrate to instruments and software
• Integration to analytical instrumentation vendors
already in place
– Agilent, Bruker, Thermo, Waters
• Also, Cheminformatics vendors link to ChemSpider
– Accelrys, ACD/Labs, ChemAxon, iChemLabs, and…
Natural Products Updates
• Names hard, Structures
“Obvious”
• New content based on
monthly updates of the
database
• Click through to the Natural
Products Updates entry
National Chemical Database Service
Chemical Database
Service
• National Chemical Database Service
for UK Academics
• Integrating Commercial Databases
and Services
• Chemicals, analytical data,
prediction algorithms
• Development of data repository
Retrosynthetic Analysis
Publications - a summary of work
• Scientific publications are a summary of work
– Is all work reported?
– How much science is lost to pruning?
– What of value sits in notebooks and is lost?
• How much data is lost?
– How many compounds never reported?
– How many syntheses fail or succeed?
– How many characterization measurements?
Community Repository for Data
• Funding agencies encourage sharing of data
• Increasing availability of “Open Data”
• Institutional repositories no specific domain
support
• Develop a community repository for chemistry
data – private, public, embargoed
• Provides data to develop models/algorithms
Community Repository for Data
• Automated depositions of data
• DOI’ed data objects for citation purposes
• A database of reference data, but validated by
the community
• National services feeding the repository –
crystallography, mass spectrometry
• Integrate to blogging tools for chemistry
• Integrate to Electronic Lab Notebooks as feeds
Model Building with Community Data
• Community data as a basis of model building
– Consume data from available databases, community
data, new publications and build predictive
algorithms for the community
– How many algorithms are reported and lost? How
much repeat work is done in the domain of
algorithmic development?
Recognition on
Data
IC50 Measurements for 62 substituted benzoxazoles
ChemSpider Data Repository: DOI: 10.1356/CSID784.4
Integrate to electronic lab notebooks
E-Lab Notebooks
• Previous work with IDBS and
University of Cambridge
• Working on LabTrove integration
win U. Southampton
• Integration between ELNs and:
• ChemSpider
• ChemSpider Reactions
• CDS Repository
• Publish data from ELNs issue DOIs
• Data aggregated into fully indexed
ESI format for publication
Support for Chemical Reactions
• Integrating mined reaction data from patents
(Daniel Lowe)
• Will also incorporate and integrate: Methods
of Organic Synthesis, Catalysts and Catalyzed
Reactions and…
Micro-publishing Chemical Reactions
ChemSpider SyntheticPages
Retrosynthetic Analysis
Inside our Publication Archive
• How much data is in the archive, in the
publications and in the supplementary info?
– How many compounds for ChemSpider?
– How many syntheses for ChemSpider reactions?
– How many characterization measurements?
• Property Data
• Spectral Data
• Graphs and charts to be used for modeling?
What if we could capture it all?
Digitally Enhancing the RSC Archive
Start with data in publications
Recent Work
Comparison of Spectra
Data Validation and Curation Required
CVSP: Validation and Standardization
Data Validation and Curation Required
Encouraging Participation with
Rewards and RECOGNITION
Manual Curation
• Integrated commenting, curating and validation
platform across ALL eScience and publishing
platforms
• All integrated to a central RSC profile and
feeding the AltMetrics tools
Structure Review
Where we are now…
Rewards and Recognition
Congratulations! Your 1st CSSP article
has been published. Philosopher Lao
Tzu said “A journey of a thousand
miles begins with a single step”. In the
same way we hope that this will be
the first of many submissions that you
make to CSSP.
The First Step badge is
awarded when a user
submits (& has published)
their 1st
CSSP article.
Future Recognition in AltMetrics?
ChemSpider
Why is ChemSpider “different”
• Interfaces for integration
• Sharing of data – and increasingly open
• Open for community participation
– Deposition
– Annotation
– Curation
• We are clear…the world is changing
Internet Data
The Future
Commercial Software
Pre-competitive Data
Open Science
Open Data
Publishers
Educators
Open Databases
Chemical Vendors
Small organic molecules
Undefined materials
Organometallics
Nanomaterials
Polymers
Minerals
Particle bound
Links to Biologicals
Acknowledgments
• The RSC eScience and infrastructure teams
• Our data providers, depositors, collaborators
and curators
• Daniel Lowe for Reaction Data
• William Brouwer, Penn State
• Software providers – OpenEye, ChemDoodle,
ACD/Labs, GGA Software, Open Source (Jmol,
JSpecView, OpenBabel)
Thank you
Email: williamsa@rsc.org
Twitter: ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams

Contenu connexe

Tendances

L clarke faang_dcc_isag_2017_compress
L clarke faang_dcc_isag_2017_compressL clarke faang_dcc_isag_2017_compress
L clarke faang_dcc_isag_2017_compressLaura Clarke
 
From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...Catherine Canevet
 
Met soc15 roccaserra-biocrates-datasharing
Met soc15 roccaserra-biocrates-datasharingMet soc15 roccaserra-biocrates-datasharing
Met soc15 roccaserra-biocrates-datasharingPhilippe Rocca-Serra
 
ICIC 2014 Increasing the efficiency of pharmaceutical research through data i...
ICIC 2014 Increasing the efficiency of pharmaceutical research through data i...ICIC 2014 Increasing the efficiency of pharmaceutical research through data i...
ICIC 2014 Increasing the efficiency of pharmaceutical research through data i...Dr. Haxel Consult
 
Panel members v2_datajournals_repositories_repofringe3aug2015
Panel members v2_datajournals_repositories_repofringe3aug2015Panel members v2_datajournals_repositories_repofringe3aug2015
Panel members v2_datajournals_repositories_repofringe3aug2015University of Edinburgh
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows Carole Goble
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumAnita de Waard
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...Carole Goble
 
Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Valery Tkachenko
 
GWAS and DAS
GWAS and DASGWAS and DAS
GWAS and DASVerena139
 
Realising the Potential of Algal Biomass Production through Semantic Web an...
Realising the Potential of Algal Biomass Production   through Semantic Web an...Realising the Potential of Algal Biomass Production   through Semantic Web an...
Realising the Potential of Algal Biomass Production through Semantic Web an...Monika Solanki
 
Advancing the International Plant Names Index (IPNI)
Advancing the International Plant Names Index (IPNI) Advancing the International Plant Names Index (IPNI)
Advancing the International Plant Names Index (IPNI) nickyn
 

Tendances (20)

L clarke faang_dcc_isag_2017_compress
L clarke faang_dcc_isag_2017_compressL clarke faang_dcc_isag_2017_compress
L clarke faang_dcc_isag_2017_compress
 
Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...Delivering web-based access to data and algorithms to support computational t...
Delivering web-based access to data and algorithms to support computational t...
 
From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...From data to knowledge – the Ondex System for integrating Life Sciences data ...
From data to knowledge – the Ondex System for integrating Life Sciences data ...
 
Incorporating new technologies and High Throughput Screening in the design an...
Incorporating new technologies and High Throughput Screening in the design an...Incorporating new technologies and High Throughput Screening in the design an...
Incorporating new technologies and High Throughput Screening in the design an...
 
Met soc15 roccaserra-biocrates-datasharing
Met soc15 roccaserra-biocrates-datasharingMet soc15 roccaserra-biocrates-datasharing
Met soc15 roccaserra-biocrates-datasharing
 
ICIC 2014 Increasing the efficiency of pharmaceutical research through data i...
ICIC 2014 Increasing the efficiency of pharmaceutical research through data i...ICIC 2014 Increasing the efficiency of pharmaceutical research through data i...
ICIC 2014 Increasing the efficiency of pharmaceutical research through data i...
 
Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data DashboardsAccessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards
 
Using online chemistry databases to facilitate structure identification in ma...
Using online chemistry databases to facilitate structure identification in ma...Using online chemistry databases to facilitate structure identification in ma...
Using online chemistry databases to facilitate structure identification in ma...
 
Panel members v2_datajournals_repositories_repofringe3aug2015
Panel members v2_datajournals_repositories_repofringe3aug2015Panel members v2_datajournals_repositories_repofringe3aug2015
Panel members v2_datajournals_repositories_repofringe3aug2015
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
Presentation of ChemSPider at PubChem Public Meeting
Presentation of ChemSPider at PubChem Public MeetingPresentation of ChemSPider at PubChem Public Meeting
Presentation of ChemSPider at PubChem Public Meeting
 
SciVerse @ TJU
SciVerse @ TJUSciVerse @ TJU
SciVerse @ TJU
 
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne UlitmatumElsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
Elsevier‘s RDM Program: Habits of Effective Data and the Bourne Ulitmatum
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
 
Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...
 
GWAS and DAS
GWAS and DASGWAS and DAS
GWAS and DAS
 
Digitizing documents to provide a public spectroscopy database
Digitizing documents to provide a public spectroscopy databaseDigitizing documents to provide a public spectroscopy database
Digitizing documents to provide a public spectroscopy database
 
Realising the Potential of Algal Biomass Production through Semantic Web an...
Realising the Potential of Algal Biomass Production   through Semantic Web an...Realising the Potential of Algal Biomass Production   through Semantic Web an...
Realising the Potential of Algal Biomass Production through Semantic Web an...
 
Structure verification and elucidation using the ChemSpider database
Structure verification and elucidation using the ChemSpider databaseStructure verification and elucidation using the ChemSpider database
Structure verification and elucidation using the ChemSpider database
 
Advancing the International Plant Names Index (IPNI)
Advancing the International Plant Names Index (IPNI) Advancing the International Plant Names Index (IPNI)
Advancing the International Plant Names Index (IPNI)
 

Similaire à ChemSpider – disseminating data and enabling an abundance of chemistry platforms

ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryDr. Haxel Consult
 
Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineKen Karapetyan
 
Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Ken Karapetyan
 
Green Shoots: Research Data Management Pilot at Imperial College London
Green Shoots:Research Data Management Pilot at Imperial College LondonGreen Shoots:Research Data Management Pilot at Imperial College London
Green Shoots: Research Data Management Pilot at Imperial College LondonTorsten Reimer
 
The application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platformsThe application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platformsValery Tkachenko
 
2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...
2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...
2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...Ardan Patwardhan
 
Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_Ann-Marie Roche
 

Similaire à ChemSpider – disseminating data and enabling an abundance of chemistry platforms (20)

ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
 
Big data challenges associated with building a national data repository for c...
Big data challenges associated with building a national data repository for c...Big data challenges associated with building a national data repository for c...
Big data challenges associated with building a national data repository for c...
 
Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data online
 
Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data online
 
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platformsChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
 
Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...
 
Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...
 
The expansive reach of ChemSpider as a resource for the chemistry community
The expansive reach of ChemSpider as a resource for the chemistry communityThe expansive reach of ChemSpider as a resource for the chemistry community
The expansive reach of ChemSpider as a resource for the chemistry community
 
Green Shoots: Research Data Management Pilot at Imperial College London
Green Shoots:Research Data Management Pilot at Imperial College LondonGreen Shoots:Research Data Management Pilot at Imperial College London
Green Shoots: Research Data Management Pilot at Imperial College London
 
Royal Society of Chemistry projects underpinning open innovation
Royal Society of Chemistry projects underpinning open innovationRoyal Society of Chemistry projects underpinning open innovation
Royal Society of Chemistry projects underpinning open innovation
 
Utilizing Online Databases for the Purpose of Structure Identification – Appr...
Utilizing Online Databases for the Purpose of Structure Identification – Appr...Utilizing Online Databases for the Purpose of Structure Identification – Appr...
Utilizing Online Databases for the Purpose of Structure Identification – Appr...
 
A chemistry data repository to serve them all
A chemistry data repository to serve them allA chemistry data repository to serve them all
A chemistry data repository to serve them all
 
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
 
Providing support for JC Bradleys vision of open science using RSC cheminform...
Providing support for JC Bradleys vision of open science using RSC cheminform...Providing support for JC Bradleys vision of open science using RSC cheminform...
Providing support for JC Bradleys vision of open science using RSC cheminform...
 
The application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platformsThe application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platforms
 
Challenging cajoling and rewarding the community for their contributions to o...
Challenging cajoling and rewarding the community for their contributions to o...Challenging cajoling and rewarding the community for their contributions to o...
Challenging cajoling and rewarding the community for their contributions to o...
 
2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...
2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...
2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...
 
Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_
 
Activities at the Royal Society of Chemistry to gather, extract and analyze b...
Activities at the Royal Society of Chemistry to gather, extract and analyze b...Activities at the Royal Society of Chemistry to gather, extract and analyze b...
Activities at the Royal Society of Chemistry to gather, extract and analyze b...
 
ChemSpider Overview SLides August 2007
ChemSpider Overview SLides August 2007ChemSpider Overview SLides August 2007
ChemSpider Overview SLides August 2007
 

Plus de Ken Karapetyan

ChemSpider reactions – delivering a free community resource of chemical synth...
ChemSpider reactions – delivering a free community resource of chemical synth...ChemSpider reactions – delivering a free community resource of chemical synth...
ChemSpider reactions – delivering a free community resource of chemical synth...Ken Karapetyan
 
The RSC chemical validation and standardization platform, a potential path to...
The RSC chemical validation and standardization platform, a potential path to...The RSC chemical validation and standardization platform, a potential path to...
The RSC chemical validation and standardization platform, a potential path to...Ken Karapetyan
 
Digitally enabling the RSC archive
Digitally enabling the RSC archiveDigitally enabling the RSC archive
Digitally enabling the RSC archiveKen Karapetyan
 
Building support for the semantic web for chemistry at the Royal Society of C...
Building support for the semantic web for chemistry at the Royal Society of C...Building support for the semantic web for chemistry at the Royal Society of C...
Building support for the semantic web for chemistry at the Royal Society of C...Ken Karapetyan
 
Royal society of chemistry developments to support open drug discovery
Royal society of chemistry developments to support open drug discoveryRoyal society of chemistry developments to support open drug discovery
Royal society of chemistry developments to support open drug discoveryKen Karapetyan
 
Data enhancing the royal society of chemistry publication archive
Data enhancing the royal society of chemistry publication archiveData enhancing the royal society of chemistry publication archive
Data enhancing the royal society of chemistry publication archiveKen Karapetyan
 
Applying Royal Society of Chemistry cheminformatics skills to support the Pha...
Applying Royal Society of Chemistry cheminformatics skills to support the Pha...Applying Royal Society of Chemistry cheminformatics skills to support the Pha...
Applying Royal Society of Chemistry cheminformatics skills to support the Pha...Ken Karapetyan
 
How the InChI identifier is used to underpin our online chemistry databases a...
How the InChI identifier is used to underpin our online chemistry databases a...How the InChI identifier is used to underpin our online chemistry databases a...
How the InChI identifier is used to underpin our online chemistry databases a...Ken Karapetyan
 
Open innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts projectOpen innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts projectKen Karapetyan
 
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...Ken Karapetyan
 
Acs 2013 indianapolis_cvsp
Acs 2013 indianapolis_cvspAcs 2013 indianapolis_cvsp
Acs 2013 indianapolis_cvspKen Karapetyan
 

Plus de Ken Karapetyan (13)

ChemSpider reactions – delivering a free community resource of chemical synth...
ChemSpider reactions – delivering a free community resource of chemical synth...ChemSpider reactions – delivering a free community resource of chemical synth...
ChemSpider reactions – delivering a free community resource of chemical synth...
 
The RSC chemical validation and standardization platform, a potential path to...
The RSC chemical validation and standardization platform, a potential path to...The RSC chemical validation and standardization platform, a potential path to...
The RSC chemical validation and standardization platform, a potential path to...
 
Digitally enabling the RSC archive
Digitally enabling the RSC archiveDigitally enabling the RSC archive
Digitally enabling the RSC archive
 
Building support for the semantic web for chemistry at the Royal Society of C...
Building support for the semantic web for chemistry at the Royal Society of C...Building support for the semantic web for chemistry at the Royal Society of C...
Building support for the semantic web for chemistry at the Royal Society of C...
 
Royal society of chemistry developments to support open drug discovery
Royal society of chemistry developments to support open drug discoveryRoyal society of chemistry developments to support open drug discovery
Royal society of chemistry developments to support open drug discovery
 
Data enhancing the royal society of chemistry publication archive
Data enhancing the royal society of chemistry publication archiveData enhancing the royal society of chemistry publication archive
Data enhancing the royal society of chemistry publication archive
 
Applying Royal Society of Chemistry cheminformatics skills to support the Pha...
Applying Royal Society of Chemistry cheminformatics skills to support the Pha...Applying Royal Society of Chemistry cheminformatics skills to support the Pha...
Applying Royal Society of Chemistry cheminformatics skills to support the Pha...
 
How the InChI identifier is used to underpin our online chemistry databases a...
How the InChI identifier is used to underpin our online chemistry databases a...How the InChI identifier is used to underpin our online chemistry databases a...
How the InChI identifier is used to underpin our online chemistry databases a...
 
Open innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts projectOpen innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts project
 
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
Standardization and Generation of Parents for Open PHACTS Chemical Registry S...
 
SERMACS 2012
SERMACS 2012SERMACS 2012
SERMACS 2012
 
Acs 2013 indianapolis_cvsp
Acs 2013 indianapolis_cvspAcs 2013 indianapolis_cvsp
Acs 2013 indianapolis_cvsp
 
Data model
Data modelData model
Data model
 

Dernier

Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...ssuser79fe74
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLkantirani197
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Monika Rani
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxRizalinePalanog2
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsSérgio Sacani
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 

Dernier (20)

Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 

ChemSpider – disseminating data and enabling an abundance of chemistry platforms

  • 1. ChemSpider – disseminating data and enabling an abundance of chemistry platforms Antony Williams, Valery Tkachenko, Ken Karapetyan, Alexey Pshenichnov, Dmitry Ivanov, Colin Batchelor, Jon Steele and David Sharpe ACS New Orleans April 2013
  • 2. ChemSpider • >28.5 million unique chemicals from >400 data sources • Focus on improving data quality, enhancing functionality, integrating and enabling
  • 3.
  • 4. Some usage statistics • ca. 200 visitors at any one time, ~30,000 visits per day • Mar 4-Apr 3, 2013 – Visits = 731,656 – Unique Visitors = 527,008 • Independent servers to support other projects
  • 5. Access ChemSpider • APIs – Programmatic access used by Mobile Apps, Funded Consortia projects, many Academic groups • Widgets – UI components for embedding in other websites • Data – Data access, downloads, reuse, licensing
  • 6. Supporting the Semantic Web rdf.chemspider.com/CSID
  • 8.
  • 9. From this….. …..to this Simplified interface ChemSpider Audiences
  • 11.
  • 12. It is so difficult to navigate… What’s the structure? What’s the structure? Are they in our file? Are they in our file? What’s similar? What’s similar? What’s the target? What’s the target?Pharmacology data? Pharmacology data? Known Pathways? Known Pathways? Working On Now? Working On Now?Connections to disease? Connections to disease? Expressed in right cell type? Expressed in right cell type? Competitors?Competitors? IP?IP?
  • 13. • 3-year knowledge management IMI project • Integrating chemistry and biology data and delivering using semantic web technologies • Open source code, open data and open standards • Academics, Pharma companies, Publishers….
  • 14. ChemSpider Contributions • The host of the chemistry services – Supplier of “standardized” chemical data files – Chemistry searching (structure, substructure etc) – Provider of data in RDF format – Curator and data quality checking • Now building the Open PHACTS chemical registration system
  • 15. ChemSpider Contributions • Supplier of chemistry UI components • “Quality Police” for data checking • Chemical Validation and Standardization Platform • Nanopublications from RSC publications
  • 16. • FP7 Initiative. PharmaSea: increasing value and flow in the marine biodiscovery pipeline
  • 17. PharmaSea • Dereplication via ChemSpider • Segregation of natural products datasets • Analytical data algorithms & integration – Mass spec searching – predicted fragmentation – NMR feature searching – NMR prediction – Computer-assisted structure elucidation
  • 18. Integrate to instruments and software • Integration to analytical instrumentation vendors already in place – Agilent, Bruker, Thermo, Waters • Also, Cheminformatics vendors link to ChemSpider – Accelrys, ACD/Labs, ChemAxon, iChemLabs, and…
  • 19. Natural Products Updates • Names hard, Structures “Obvious” • New content based on monthly updates of the database • Click through to the Natural Products Updates entry
  • 21. Chemical Database Service • National Chemical Database Service for UK Academics • Integrating Commercial Databases and Services • Chemicals, analytical data, prediction algorithms • Development of data repository
  • 23. Publications - a summary of work • Scientific publications are a summary of work – Is all work reported? – How much science is lost to pruning? – What of value sits in notebooks and is lost? • How much data is lost? – How many compounds never reported? – How many syntheses fail or succeed? – How many characterization measurements?
  • 24. Community Repository for Data • Funding agencies encourage sharing of data • Increasing availability of “Open Data” • Institutional repositories no specific domain support • Develop a community repository for chemistry data – private, public, embargoed • Provides data to develop models/algorithms
  • 25. Community Repository for Data • Automated depositions of data • DOI’ed data objects for citation purposes • A database of reference data, but validated by the community • National services feeding the repository – crystallography, mass spectrometry • Integrate to blogging tools for chemistry • Integrate to Electronic Lab Notebooks as feeds
  • 26. Model Building with Community Data • Community data as a basis of model building – Consume data from available databases, community data, new publications and build predictive algorithms for the community – How many algorithms are reported and lost? How much repeat work is done in the domain of algorithmic development?
  • 27. Recognition on Data IC50 Measurements for 62 substituted benzoxazoles ChemSpider Data Repository: DOI: 10.1356/CSID784.4
  • 28. Integrate to electronic lab notebooks
  • 29. E-Lab Notebooks • Previous work with IDBS and University of Cambridge • Working on LabTrove integration win U. Southampton • Integration between ELNs and: • ChemSpider • ChemSpider Reactions • CDS Repository • Publish data from ELNs issue DOIs • Data aggregated into fully indexed ESI format for publication
  • 30. Support for Chemical Reactions • Integrating mined reaction data from patents (Daniel Lowe) • Will also incorporate and integrate: Methods of Organic Synthesis, Catalysts and Catalyzed Reactions and…
  • 34. Inside our Publication Archive • How much data is in the archive, in the publications and in the supplementary info? – How many compounds for ChemSpider? – How many syntheses for ChemSpider reactions? – How many characterization measurements? • Property Data • Spectral Data • Graphs and charts to be used for modeling?
  • 35. What if we could capture it all? Digitally Enhancing the RSC Archive
  • 36. Start with data in publications
  • 39. Data Validation and Curation Required
  • 40. CVSP: Validation and Standardization
  • 41. Data Validation and Curation Required Encouraging Participation with Rewards and RECOGNITION
  • 42. Manual Curation • Integrated commenting, curating and validation platform across ALL eScience and publishing platforms • All integrated to a central RSC profile and feeding the AltMetrics tools
  • 44. Where we are now…
  • 45. Rewards and Recognition Congratulations! Your 1st CSSP article has been published. Philosopher Lao Tzu said “A journey of a thousand miles begins with a single step”. In the same way we hope that this will be the first of many submissions that you make to CSSP. The First Step badge is awarded when a user submits (& has published) their 1st CSSP article.
  • 46. Future Recognition in AltMetrics? ChemSpider
  • 47. Why is ChemSpider “different” • Interfaces for integration • Sharing of data – and increasingly open • Open for community participation – Deposition – Annotation – Curation • We are clear…the world is changing
  • 48. Internet Data The Future Commercial Software Pre-competitive Data Open Science Open Data Publishers Educators Open Databases Chemical Vendors Small organic molecules Undefined materials Organometallics Nanomaterials Polymers Minerals Particle bound Links to Biologicals
  • 49. Acknowledgments • The RSC eScience and infrastructure teams • Our data providers, depositors, collaborators and curators • Daniel Lowe for Reaction Data • William Brouwer, Penn State • Software providers – OpenEye, ChemDoodle, ACD/Labs, GGA Software, Open Source (Jmol, JSpecView, OpenBabel)
  • 50. Thank you Email: williamsa@rsc.org Twitter: ChemConnector Personal Blog: www.chemconnector.com SLIDES: www.slideshare.net/AntonyWilliams