#openminted_eu
Stelios Piperidis
Athena Research & Innovation Centre
OPEN SCIENCE FAIR
ATHENS, 6 SEPTEMBER 2017
The global research community generates ~2.5 million new scholarly articles
per year (English only)
… one paper published every 12 seconds
… 70,000 papers published on a single protein, the tumor suppressor p53
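As a quick check of the rate quoted above, the interval between papers follows from dividing the seconds in a year by the yearly article count. This is a minimal sketch assuming a 365-day year; the constant and function names are illustrative only.

```python
# Seconds in a non-leap year (assumption: 365 days).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
# Yearly article count quoted on the slide.
ARTICLES_PER_YEAR = 2_500_000


def seconds_per_article(articles_per_year: int) -> float:
    """Average number of seconds between two published articles."""
    return SECONDS_PER_YEAR / articles_per_year


interval = seconds_per_article(ARTICLES_PER_YEAR)
print(f"One paper roughly every {interval:.1f} seconds")
```

The result, about 12.6 seconds, is consistent with the rounded "every 12 seconds" on the slide.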
● 1.8 billion websites & 3.46 billion internet users, on 25 September 2016.
● 24 million wireless sensors and actuators worldwide (553% up, between
2011 and 2016)
● 16 zettabytes of useful data (16 Trillion GB) by 2020
● YouTube reports that 24 hours of video are uploaded every minute, making the
site a hugely significant data aggregator.
● Every second, on average, around 6,000 tweets are tweeted on Twitter,
which corresponds to over 350,000 tweets sent per minute, >500 million
tweets per day and around 200 billion tweets per year.
● 74,200,000 pages existed on Facebook, with 7 million apps and websites
integrated with Facebook on 30/5/2016
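The tweet and data-volume figures above can be cross-checked with simple unit arithmetic. The sketch below assumes a 365-day year and decimal (SI) units, where 1 zettabyte is 10^21 bytes and 1 gigabyte is 10^9 bytes; the variable names are illustrative.

```python
# Tweet rate quoted on the slide.
TWEETS_PER_SECOND = 6_000

per_minute = TWEETS_PER_SECOND * 60   # 360,000
per_day = per_minute * 60 * 24        # 518,400,000
per_year = per_day * 365              # 189,216,000,000

# Decimal (SI) storage units, an assumption of this sketch.
ZETTABYTE = 10**21
GIGABYTE = 10**9
gb_in_16_zb = 16 * ZETTABYTE // GIGABYTE  # 16,000,000,000,000

print(per_minute, per_day, per_year, gb_in_16_zb)
```

These values match the slide's rounded claims: over 350,000 tweets per minute, more than 500 million per day, around 200 billion per year, and 16 trillion GB in 16 zettabytes.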
process textual sources, organise and classify in various dimensions, extract
main (indexical) information items,
identify and extract entities and relations between entities, facilitate the
transformation of unstructured textual sources into structured data
enable the multidimensional analysis of structured data to extract meaningful
insights and improve the ability to predict
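To make the step from unstructured text to structured data concrete, here is a toy, rule-based sketch of entity and relation extraction. It is not how OpenMinTeD components work: the lexicon, the verb list, and the `Relation` type are all hypothetical, and real systems typically rely on trained models and curated resources.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon mapping surface forms to entity types.
ENTITY_LEXICON = {"p53": "PROTEIN", "MDM2": "PROTEIN", "apoptosis": "PROCESS"}
# Hypothetical list of verbs treated as relation predicates.
RELATION_VERBS = ("activates", "inhibits", "induces")


@dataclass(frozen=True)
class Relation:
    subject: str
    predicate: str
    obj: str


def extract_entities(text):
    """Return (surface form, type) pairs for lexicon terms found in the text."""
    found = []
    for term, label in ENTITY_LEXICON.items():
        if re.search(rf"\b{re.escape(term)}\b", text):
            found.append((term, label))
    return found


def extract_relations(text):
    """Return Relation records for 'entity verb entity' patterns."""
    terms = "|".join(re.escape(t) for t in ENTITY_LEXICON)
    verbs = "|".join(RELATION_VERBS)
    pattern = re.compile(rf"\b({terms})\s+({verbs})\s+({terms})\b")
    return [Relation(*m.groups()) for m in pattern.finditer(text)]


text = "In stressed cells p53 induces apoptosis, while MDM2 inhibits p53."
entities = extract_entities(text)
relations = extract_relations(text)
print(entities)
print(relations)
```

The output is a list of typed entities and subject-predicate-object records, which is the kind of structured data that later analysis steps can query.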
Text Types
Newswire
Scientific Literature
Tweets/blogs
Patents
Clinical/medical records
Textbooks, monographs
Online forums
….
Languages
English
French
German
Spanish
Portuguese
Italian
Polish
….
Tasks
Translation
Information Extraction
Semantic Search
Question Answering
Sentiment Analysis
Summarization
Knowledge Discovery
….
Domains
Finance/Business
Health
Biology
Social Sciences
Humanities
….
Establish an open and sustainable Text and Data Mining
(TDM) platform and infrastructure where researchers
can discover, collaboratively create, share and re-use
knowledge from a wide range of text-based scientific
and scholarly sources.
OpenMinTeD sets out to create an
open, service-oriented e-Infrastructure
for Text and Data Mining (TDM) of
scientific and scholarly content.
[Diagram labels: Content/Corpora; Services/tools; Annotated corpora; and CORE content; web services; "ancillary" resources]
Scientific pubs =
Research data
ANNOTATED
DATASET
DERIVED
KNOWLEDGE
1st layer
2nd layer
at the level of
licensing conditions
SCIENTIFIC
DATA
PROCESSING TOOLS/SERVICES
production Level
IaaS Cloud
Open Source
OpenStack compliant
software stack
Pithos+
Object
storage
Cyclades
Virtual Machines
Management
Builds on
GRNET DataCenters
3 locations (Athens, Crete,
Epirus)
1000+ servers,
16PB Raw Storage,
x10G Interconnects
Member of EGI Federated Cloud
• Model for describing content, language/knowledge
resources, tools/services (aka components)
• OMTD-SHARE schema
• Allows search and browse of publications
• Maps local schemata to OMTD-SHARE schema
• Provides access to full text
• In cooperation with respective projects (OpenAIRE, CORE)
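Mapping a local schema to a shared one can be sketched as a field-by-field translation. The field names on both sides below are hypothetical and are not the actual OMTD-SHARE elements; the sketch only illustrates the idea of keeping mapped and unmapped fields apart.

```python
# Hypothetical mapping from local metadata keys to shared-schema keys.
# These names are assumptions, not real OMTD-SHARE element names.
FIELD_MAP = {
    "dc:title": "title",
    "dc:creator": "authors",
    "dc:rights": "licence",
    "fulltext_url": "fullTextLocation",
}


def map_record(local_record, field_map=FIELD_MAP):
    """Translate a local record; return (mapped, unmapped) dictionaries."""
    mapped, unmapped = {}, {}
    for key, value in local_record.items():
        if key in field_map:
            mapped[field_map[key]] = value
        else:
            # Keep unknown fields so nothing is silently dropped.
            unmapped[key] = value
    return mapped, unmapped


record = {"dc:title": "A paper", "dc:creator": ["A. Author"], "local_id": 7}
mapped, unmapped = map_record(record)
print(mapped)
print(unmapped)
```

Returning the unmapped fields separately makes gaps in the mapping visible instead of hiding them.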
• Provides access to external sources for metadata of TDM
related resources:
• Maven for tools (e.g. GATE, UIMA, uimaFIT, etc.) or machine
learning models.
• LR repositories (e.g. META-SHARE) for metadata of language
resources.
• Docker for dockerized tools and services.
• Provides content to OpenMinTeD
• Search on publication sources (OpenAIRE, CORE)
• Builds corpora of publications
• Stores archives of content
• Different storage backends
• Pithos+
• Local filesystem
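Supporting different storage backends usually means hiding them behind one interface. The sketch below is an assumption about how that could look, not the OMTD Store Service code: `StorageBackend`, `archive_corpus`, and both backend classes are hypothetical, and the in-memory class is only a stand-in for an object store (it does not call any Pithos+ API).

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Minimal interface shared by all backends (hypothetical)."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class LocalFilesystemBackend(StorageBackend):
    """Stores each object as a file under a root directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, key, data):
        (self.root / key).write_bytes(data)

    def get(self, key):
        return (self.root / key).read_bytes()


class InMemoryObjectBackend(StorageBackend):
    """Stand-in for an object store; keeps objects in a dictionary."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


def archive_corpus(backend, documents):
    """Store every document in the given backend; return the sorted keys."""
    for key, data in documents.items():
        backend.put(key, data)
    return sorted(documents)


with tempfile.TemporaryDirectory() as tmp:
    for backend in (LocalFilesystemBackend(tmp), InMemoryObjectBackend()):
        keys = archive_corpus(backend, {"paper-1.txt": b"full text"})
        print(type(backend).__name__, keys, backend.get("paper-1.txt"))
```

Because `archive_corpus` depends only on the interface, adding another backend does not change the calling code.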
Slide by Petr Knoth, CORE/OU
• Create/modify workflows of TDM components
• Execute workflows with user supplied content
• Provide friendly UI
• Used in biomedical research
• Cooperation with 4 international projects
• Ingest TDM tool descriptions from the registry
• Start/stop/monitor workflows
• Integrate with OMTD Store Service to supply content and store
results
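A workflow of TDM components can be pictured as a chain of processing steps applied to each user-supplied document. The following is a minimal sketch under that assumption; the `Workflow` class and the two text-normalisation components are hypothetical and far simpler than real TDM tools.

```python
from typing import Callable, Iterable, List

# A component is modelled here as a function from text to text (assumption).
Component = Callable[[str], str]


def collapse_whitespace(text: str) -> str:
    """Replace runs of whitespace with single spaces and trim the ends."""
    return " ".join(text.split())


def lowercase(text: str) -> str:
    """Convert the text to lower case."""
    return text.lower()


class Workflow:
    """Ordered chain of components that can be extended and executed."""

    def __init__(self, name: str, components: Iterable[Component] = ()):
        self.name = name
        self.components: List[Component] = list(components)

    def add(self, component: Component) -> "Workflow":
        # Returning self allows chained calls when building a workflow.
        self.components.append(component)
        return self

    def run(self, documents: Iterable[str]) -> List[str]:
        results = []
        for doc in documents:
            for component in self.components:
                doc = component(doc)
            results.append(doc)
        return results


workflow = Workflow("normalise").add(collapse_whitespace).add(lowercase)
output = workflow.run(["  Text   and DATA Mining "])
print(output)
```

Creating or modifying a workflow then amounts to editing the component list, and execution is the same loop regardless of which components are plugged in.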
Rather novice users who want to find services (end to end) that fill their needs in an
off-the-shelf type of situation.
Understand basic usage of NLP and TDM services, but not the details. They know
how to connect components, which content they must work on to get the required
results. They need to develop end to end applications.
Agnostic to the internal specifics of TDM, but they need to integrate TDM services
into daily workflows and operate them there.
Publishers and repository managers (research libraries).
Expert language technology oriented people, who are using
specific technologies and frameworks to develop and enhance
their services.
Developers who are not NLP experts, creating TDM modules based on
off-the-shelf libraries and tools (e.g. Python, Jupyter). Not
familiar with NLP frameworks and terminology but eager to
publish their services.
Feature extraction
Data citation
Research analytics
Curation of
databases and
lexica in
Chembolomics &
neuroinformatics
Extracting
information from
tables for food
safety alerts
Data citation
From the very beginning…
Requirements, content, barriers, expected outcomes.
… to the very end
Create applications, validate and evaluate the results.
Unless possible
conflict with NC
twitter.com/openminted_eu
facebook.com/openminted
bit.do/openmintedlinkedin
vimeo.com/openminted
bit.do/openmintedplus
spip@ilsp.gr

OSFair2017 training | From Open Access to Open Science: making sense of scientific content
