Introduction to Databases
and Ontologies
Sadia Madhuri Rasha
B160305043
26/05/23, Computer Science and Engineering Department, Jagannath University
Outline
• Introduction
• NIF - the Neuroscience Information Framework
• Federating neuroscience-relevant databases
• Structured information: data, databases
• Information frameworks
• Ontologies
• Databases
• Conclusions
NIF - Neuroscience Information Framework
https://neuinfo.org/
• 2000 databases
• 170 federated databases
NIF - Neuroscience Information Framework
https://en.wikipedia.org/wiki/Neuroscience_Information_Framework
• The Neuroscience Information Framework (NIF) is a dynamic repository of global, web-based
neuroscience resources, including experimental, clinical, and translational neuroscience
databases, knowledge bases, atlases, and genetic/genomic resources. It also provides many
authoritative links throughout the neuroscience portal of Wikipedia.
• Data formats: HTML, XML, JSON, CSV
• NIF is an initiative of the NIH Blueprint for Neuroscience Research, which was established
in 2004 by the National Institutes of Health.
• Development of the NIF started in 2008, when the University of California, San Diego
School of Medicine obtained an NIH contract to create and maintain "a dynamic inventory of
web-based neurosciences data, resources, and tools that scientists and students can access
via any computer connected to the Internet."
NIF - SciCrunch
https://scicrunch.org/
• The main goal of establishing NIF is to build a culture of data sharing.
• This creates a strong community of researchers: “One mind for research”.
• SciCrunch: the NIF platform has been adapted to create SciCrunch, which hosts narrower,
community-based portals built on a common data platform.
• Cost-effective: a data portal can be set up in a few hours.
• Allows researchers to organize and store their data.
• Connects communities through data and tools.
• Shared curation, shared knowledge.
Data Federation
https://www.tibco.com/reference-center/what-is-a-data-federation
• Data federation is a software process that allows multiple databases to appear and
function as one. It creates a virtual database that takes data from a range of sources and
converts them all to a common model.
• This provides a single source of data for front-end applications. Data federation is part
of the data virtualization framework.
• Data virtualization grew alongside data federation but added extra features,
applications, and functions.
• Examples: Teradata with QueryGrid, IBM PureData System with Fluid Query, and SAP HANA
with Smart Data Services are some of the products offering data federation.
Why data federation?
• Data federation eliminates the need to create yet another database or data warehouse and
saves the cost of creating a permanent, physical relational database.
• It manages the integration of individual databases from different sources and makes them
accessible under a uniform data model through virtualization rather than by physical
storage.
• It enables querying data from different sources in a single virtual format. As such, data
federation has fewer points of potential failure and provides better data integration.
• A key advantage of the federation approach is that it allows real-time information access.
Federated Database
https://en.wikipedia.org/wiki/Federated_database_system
• A federated database system is a type of meta-database management system that
transparently maps multiple autonomous database systems into a single federated database.
The constituent databases remain autonomous; they are interconnected via a computer network
and can be geographically decentralized.
• Data sources can be structured (relational databases, XML documents, etc.) and/or
unstructured (Excel spreadsheets, medical records, etc.).
• Because different database management systems employ different query languages, a
federated database system can apply ‘wrappers’ to the subqueries to translate them into the
appropriate query language.
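The wrapper idea described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how NIF or any commercial product implements federation: the two sources, the class names (`SqlWrapper`, `DocWrapper`, `FederatedDatabase`) and the `find_by_region` query are all hypothetical. Each wrapper translates one common query into the native access method of its source and maps the result to a shared record shape.

```python
import sqlite3

# Source 1 (hypothetical): an autonomous relational database.
def make_sql_source():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE neurons (name TEXT, region TEXT)")
    conn.executemany(
        "INSERT INTO neurons VALUES (?, ?)",
        [("Purkinje cell", "cerebellum"), ("pyramidal cell", "hippocampus")],
    )
    return conn

# Source 2 (hypothetical): a document-style store with different field names.
DOC_SOURCE = [
    {"cell": "granule cell", "brain_area": "cerebellum"},
    {"cell": "basket cell", "brain_area": "cerebellum"},
]

class SqlWrapper:
    """Translates the common query into SQL for the relational source."""
    def __init__(self, conn):
        self.conn = conn

    def find_by_region(self, region):
        rows = self.conn.execute(
            "SELECT name, region FROM neurons WHERE region = ?", (region,)
        )
        return [{"name": name, "region": reg} for name, reg in rows]

class DocWrapper:
    """Translates the common query into a filter over dict records."""
    def __init__(self, docs):
        self.docs = docs

    def find_by_region(self, region):
        return [
            {"name": d["cell"], "region": d["brain_area"]}
            for d in self.docs
            if d["brain_area"] == region
        ]

class FederatedDatabase:
    """Presents all wrapped sources as one database with a common model."""
    def __init__(self, wrappers):
        self.wrappers = wrappers

    def find_by_region(self, region):
        results = []
        for wrapper in self.wrappers:  # one subquery per constituent source
            results.extend(wrapper.find_by_region(region))
        return sorted(results, key=lambda rec: rec["name"])

federation = FederatedDatabase([SqlWrapper(make_sql_source()), DocWrapper(DOC_SOURCE)])
cerebellar = federation.find_by_region("cerebellum")
print([rec["name"] for rec in cerebellar])
```

The caller never sees SQL or the document field names; adding a third source would only require another wrapper with the same `find_by_region` method.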
Data Federation vs Data Warehouse
• Data federation software creates a single virtual repository that does not contain the
data itself, only its metadata (information about the actual data and its location).
Transformation, cleansing, and possibly even enrichment of the data happen in a
decentralized manner.
• A data warehouse, or enterprise data warehouse (EDW), is the major alternative to data
federation. It creates a centralized repository that pulls data from multiple sources for
analysis and integrates the data physically.
• Data federation does not need to move or copy data and leaves the source data unchanged,
whereas an EDW moves data into a single repository and updates it there.
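The contrast between the two approaches can be made concrete with a small Python sketch. The source names, records, and the `DataWarehouse` / `DataFederation` classes below are hypothetical and heavily simplified: the warehouse copies records at load time, while the federation keeps only a catalog of sources and reads them when queried.

```python
import copy

# Hypothetical source systems, each owning its own records.
sources = {
    "lab_a": [{"subject": "s1", "score": 10}],
    "lab_b": [{"subject": "s2", "score": 7}],
}

class DataWarehouse:
    """Physically copies source data into one central store."""
    def __init__(self):
        self.store = []

    def load(self, sources):
        self.store = [copy.deepcopy(rec) for recs in sources.values() for rec in recs]

    def query(self):
        return list(self.store)

class DataFederation:
    """Keeps only metadata (which sources exist) and reads them at query time."""
    def __init__(self, sources):
        self.catalog = sources  # a reference to the sources, not a copy

    def query(self):
        return [rec for recs in self.catalog.values() for rec in recs]

warehouse = DataWarehouse()
warehouse.load(sources)
federation = DataFederation(sources)

# A source gains a record after the warehouse load.
sources["lab_b"].append({"subject": "s3", "score": 9})

print(len(warehouse.query()))   # 2: the copy is stale until the next load
print(len(federation.query()))  # 3: the live sources are read at query time
```

The stale count on the warehouse side is the price of physical integration; the federation side illustrates the real-time access mentioned earlier, at the cost of depending on every source being reachable at query time.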
Structured vs Unstructured Data
• The two main categories of data are structured and unstructured data.
• Structured data is highly specific and is stored in a predefined format, whereas
unstructured data is a collection of many varied types of data stored in their native
formats. Structured data therefore uses schema-on-write, and unstructured data uses
schema-on-read.
• Structured data consists of clearly defined, standardized data types with patterns that
make them easily searchable, while unstructured data is more difficult to collect, process,
and analyze.
• Common examples of structured data include customer relationship management (CRM)
records, invoicing systems, product databases, and contact lists. Unstructured data
includes content such as documents, videos, audio files, social media posts, and emails.
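A minimal Python sketch of the schema-on-write versus schema-on-read distinction follows. The contact schema, the function names, and the sample payloads are assumptions made for illustration; real systems enforce schemas in the database engine or the query layer rather than in application code like this.

```python
import json

SCHEMA = {"name": str, "email": str}  # hypothetical contact-list schema

def write_structured(table, record):
    """Schema-on-write: validate against the schema before storing."""
    if set(record) != set(SCHEMA):
        raise ValueError(f"fields do not match schema: {sorted(record)}")
    for field_name, expected in SCHEMA.items():
        if not isinstance(record[field_name], expected):
            raise TypeError(f"{field_name} must be {expected.__name__}")
    table.append(record)

def write_raw(store, blob):
    """Schema-on-read: store the payload in its native form, unchecked."""
    store.append(blob)

def read_emails(store):
    """Impose structure only when reading; skip payloads that do not fit."""
    emails = []
    for blob in store:
        try:
            doc = json.loads(blob)
        except json.JSONDecodeError:
            continue
        if isinstance(doc, dict) and isinstance(doc.get("email"), str):
            emails.append(doc["email"])
    return emails

contacts = []
write_structured(contacts, {"name": "Ada", "email": "ada@example.org"})

raw_store = []
write_raw(raw_store, '{"name": "Ada", "email": "ada@example.org"}')
write_raw(raw_store, "free-text note with no fields")
print(read_emails(raw_store))
```

Writing is cheap and never fails in the raw store, but every reader has to cope with payloads that do not fit; the structured table rejects bad records up front and is trivially searchable afterwards.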
Requirements for Effective Data Sharing
• Discoverability
  - Data can be found
• Accessibility
  - Data can be accessed
• Assessability
  - The reliability of the data can be determined
• Understandability
  - Data can be understood
• Usability
  - Data are in a usable form
• Population
  - Comprehensive coverage of domain
  - Up-to-date information
• Trusted source
  - Provenance of data
  - Quality of curation
Introduction to Ontologies
https://www.bioontology.org/
• "Ontology" comes from two Greek words: on, which means "being," and logia, which means
"study." Ontology is therefore the study of being and existence. The word itself is a
modern coinage: its Latin form, ontologia, first appeared in the early 17th century,
although the questions it names go back to ancient Greek philosophy.
Ontology
• An ontology is a formal representation of knowledge that consists of a set of classes,
which represent the concepts defining a field, and the relationships among these classes.
• In simple terms, an ontology is a data structure that specifies, for a given application
area:
  - Entities
  - Properties of entities
  - Relationships among the entities
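Read literally, the "data structure" view above can be sketched in Python as a set of entities, each with properties, plus a set of relationship triples. The `Entity` and `Ontology` classes, the relationship names (`is_a`, `located_in`), and the sample entries are hypothetical; real ontologies are normally written in a dedicated language such as OWL rather than in application code.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    properties: dict = field(default_factory=dict)

@dataclass
class Ontology:
    entities: dict = field(default_factory=dict)     # name -> Entity
    relationships: set = field(default_factory=set)  # (subject, predicate, object)

    def add_entity(self, name, **properties):
        self.entities[name] = Entity(name, dict(properties))

    def relate(self, subject, predicate, obj):
        # Relationships may only connect entities the ontology already defines.
        for name in (subject, obj):
            if name not in self.entities:
                raise KeyError(f"unknown entity: {name}")
        self.relationships.add((subject, predicate, obj))

    def related(self, subject, predicate):
        return sorted(
            o for s, p, o in self.relationships if s == subject and p == predicate
        )

onto = Ontology()
onto.add_entity("Neuron", definition="electrically excitable cell")
onto.add_entity("Purkinje cell", neurotransmitter="GABA")
onto.add_entity("Cerebellum")
onto.relate("Purkinje cell", "is_a", "Neuron")
onto.relate("Purkinje cell", "located_in", "Cerebellum")
print(onto.related("Purkinje cell", "located_in"))  # ['Cerebellum']
```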
NIF Semantic Framework - NIFSTD
• The NIF Standard Ontology (NIFSTD) is a core component of the Neuroscience Information
Framework (NIF) project (http://neuinfo.org), a semantically enhanced portal for accessing
and integrating neuroscience data, tools, and information.
• NIFSTD includes a set of modular ontologies that provide a comprehensive collection of
terminologies to describe neuroscience data and resources.
[Figure: ontology of diseases]
Ontology (Cont.)
• Typically, an ontology can be seen as a 5-tuple whose components are concepts,
relationships, functions, instances, and axioms.
Why develop an ontology?
• It shares a common understanding of information in a particular domain, which:
  - allows organizations to make better sense of their data;
  - provides explicit meaning of the entities for both people and software.
• It defines and arranges the classes in a taxonomic (subclass–superclass) hierarchy, then
defines the roles and values for these classes.
• It increases the quality of entity analysis, as well as the use, reuse, and
maintainability of information systems.
• It explains the properties of each class, describing the features and attributes of the
class.
• It facilitates domain knowledge sharing, with a common vocabulary across independent
software applications.
How to develop an ontology?
https://protege.stanford.edu/publications/ontology_development/ontology101.pdf
Step 1: Determine the domain and scope of the ontology. Start by answering several basic
questions:
• What is the domain that the ontology will cover?
• For what are we going to use the ontology?
• For what types of questions should the information in the ontology provide answers?
• Who will use and maintain the ontology?
How to develop an ontology?
https://protege.stanford.edu/publications/ontology_development/ontology101.pdf
• Step 2: Consider reusing existing ontologies.
  - Refine and extend existing sources for the particular domain and task.
  - Reuse may be a requirement if our system needs to interact with other applications
that have already committed to particular ontologies or controlled vocabularies.
• Step 3: Enumerate important terms in the ontology.
  - List all the terms we would like to make statements about or explain to a user.
  - List the properties that those terms have.
  - Example: important wine-related terms include wine, grape, winery, location, a wine’s
color, body, flavor, and sugar content, as well as subtypes of wine such as white wine, and
so on.
How to develop an ontology?
https://protege.stanford.edu/publications/ontology_development/ontology101.pdf
Step 4: Define the classes and the class hierarchy.
• Start by defining classes from the list created in Step 3.
• Classes in the ontology: select the terms that describe objects having independent
existence, rather than terms that describe these objects.
• Organize the classes into a hierarchical taxonomy by asking whether being an instance of
one class necessarily makes an object an instance of some other class.
• If a class A is a superclass of class B, then every instance of B is also an instance of A.
• Possible approaches to creating a class hierarchy:
  - A top-down development process starts with the definition of the most general concepts
in the domain.
  - A bottom-up development process starts with the definition of the most specific classes.
  - A combination development process combines the top-down and bottom-up approaches.
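The superclass rule in Step 4 (every instance of B is also an instance of A when A is a superclass of B) can be checked mechanically. The sketch below is a hypothetical fragment of a wine taxonomy stored as plain dictionaries; the class names, the individual `bottle_42`, and the helper functions are illustrative assumptions rather than part of the cited guide.

```python
# Hypothetical fragment of a wine taxonomy: class -> direct superclass.
SUPERCLASS = {
    "Wine": None,
    "WhiteWine": "Wine",
    "RedWine": "Wine",
    "Chardonnay": "WhiteWine",
}

# Individuals are asserted under their most specific class.
INSTANCE_OF = {"bottle_42": "Chardonnay"}

def ancestors(cls):
    """Return cls followed by all of its superclasses, most specific first."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = SUPERCLASS[cls]
    return chain

def is_instance(individual, cls):
    """True if the individual belongs to cls directly or through a subclass."""
    return cls in ancestors(INSTANCE_OF[individual])

print(ancestors("Chardonnay"))               # ['Chardonnay', 'WhiteWine', 'Wine']
print(is_instance("bottle_42", "Wine"))      # True
print(is_instance("bottle_42", "RedWine"))   # False
```

A top-down process would fill `SUPERCLASS` starting from `Wine`; a bottom-up process would start from `Chardonnay` and add the more general classes afterwards. The resulting table is the same either way.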
How to develop an ontology?
https://protege.stanford.edu/publications/ontology_development/ontology101.pdf
Step 5: Define the properties of classes (slots).
• Most of the terms remaining from Step 3 are likely to be properties of these classes.
• In the wine example, the remaining terms are a wine’s color, body, flavor, and sugar
content, and the location of a winery.
• For each property in the list, determine which class it describes.
• In the example, the Wine class will have the following slots: color, body, flavor, and
sugar.
• All subclasses of a class inherit the slots of that class.
How to develop an ontology?
https://protege.stanford.edu/publications/ontology_development/ontology101.pdf
Step 6: Define the facets of the slots.
• A slot can have different facets describing the value type, allowed values, the number of
values (cardinality), and other features of the values the slot can take.
  - E.g., the value of a name slot is a string.
• Slot cardinality defines how many values a slot can have. Some systems distinguish only
between single cardinality and multiple cardinality.
• Slot value type: a value-type facet describes what types of values can fill the slot.
How to develop an ontology?
https://protege.stanford.edu/publications/ontology_development/ontology101.pdf
Step 7: Create instances of classes in the hierarchy.
Defining an individual instance of a class requires:
• choosing a class;
• creating an individual instance of that class;
• filling in the slot values.
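Steps 5 to 7 can be tied together in one short Python sketch: slots are attached to classes, each slot carries two facets (value type and cardinality), subclasses inherit slots, and creating an instance means choosing a class and filling in slot values that satisfy the facets. The `Slot` class, the helper functions, and the slot values are hypothetical and only loosely follow the wine example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Slot:
    """A slot with two facets: value type and cardinality."""
    name: str
    value_type: type
    multiple: bool = False  # False = single cardinality

SUPERCLASS = {"Wine": None, "WhiteWine": "Wine"}

# Slots declared directly on each class (Step 5).
SLOTS = {
    "Wine": [
        Slot("color", str),
        Slot("body", str),
        Slot("flavor", str, multiple=True),
        Slot("sugar", str),
    ],
    "WhiteWine": [],
}

def slots_of(cls):
    """Collect the slots of cls, including those inherited from superclasses."""
    found = {}
    while cls is not None:
        for slot in SLOTS[cls]:
            found.setdefault(slot.name, slot)
        cls = SUPERCLASS[cls]
    return found

def create_instance(cls, **values):
    """Step 7: choose a class, then fill in slot values that respect the facets."""
    slots = slots_of(cls)
    for name, value in values.items():
        if name not in slots:
            raise KeyError(f"{cls} has no slot named {name!r}")
        slot = slots[name]
        if slot.multiple:
            if not isinstance(value, list):
                raise TypeError(f"{name} takes a list of values")
            items = value
        else:
            items = [value]
        if not all(isinstance(item, slot.value_type) for item in items):
            raise TypeError(f"{name} values must be {slot.value_type.__name__}")
    return {"class": cls, **values}

bottle = create_instance(
    "WhiteWine", color="white", body="light", flavor=["delicate"], sugar="dry"
)
print(sorted(slots_of("WhiteWine")))  # ['body', 'color', 'flavor', 'sugar']
```

`WhiteWine` declares no slots of its own, so everything printed on the last line is inherited from `Wine`.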
Applications of Ontologies
• Machine learning and deep learning: supervised and unsupervised machine learning
algorithms applied to well-known, well-understood data can rapidly detect failures and put
corrective actions into motion before severe damage occurs.
• Business intelligence (BI): any enterprise information architecture intended to enable
horizontal communication between disparate data sources, with related and/or potentially
different domains (e.g., banking and insurance), must identify an ontology for rapidly
merging and extracting the Key Data Elements (KDE) necessary for answering essential
competency questions.
Applications of Ontologies
• Ontology in AI: the ontology sets the scene for the knowledge graph to capture the data
in a domain by specifying the structure of the knowledge in that domain, including
taxonomies, topic maps, logical models, and vocabularies. Example AI application areas:
  - Personalized shopping
  - AI-powered assistants
  - Fraud prevention
  - Administrative tasks automated to aid educators
  - Creating smart content
  - Voice assistants
  - Personalized learning
  - Autonomous vehicles
  - Natural language processing
  - Cognitive AI
• Ontology in research: ontology helps researchers recognize how certain they can be about
the nature and existence of the objects they are researching.
Applications of Ontologies
• Ontology in biomedicine: ontologies are used in the biomedical and health sciences in
areas ranging from gene function, as in the Gene Ontology (GO), to healthcare informatics,
as in the International Classification of Diseases (ICD).
• Ontology in education: an ontology can be used to guide students to understand the
organization of their own learning and to self-assess their own progress.
• Ontology in neuroscience: ontologies are broadly used in the Neuroscience Information
Framework (NIF) project. NIFSTD and the Web Ontology Language (OWL) are two important tools
for analyzing neuroscience data.
Databases vs Ontologies
1. The fundamental focus of an ontology is to specify and share meaning. The fundamental
focus of a database schema is to describe data and storage.
2. A relational database schema has a single purpose: to structure a set of instances for
efficient storage and querying. An ontology can also be used to structure a set of
instances in a database, but it has a broader range of purposes, including better
communication, interoperability, search, and software engineering; it acts as a
communication bridge between a human and a machine.
3. When an instance is created in an ontology, the respective mappings must also be created
based on rules, which is not necessarily the case for databases. These rule-based mappings
are what allow an ontology to support reasoning that a database schema alone does not.
4. Ontologies use the open-world assumption (OWA) for knowledge representation, while
databases use the closed-world assumption (CWA).
Databases vs Ontologies
5. Database systems apply normalization to remove redundant data from tables and reduce
complexity. Normal forms are a set of rules that guide the transformation of entities and
relationships into the physical layout of the tables; ontologies do not use them.
6. A database uses an ER diagram to describe the syntax; this technique is used for
abstract and conceptual data representation. In an ontology, however, the syntax is
expressed in logic; the most common is description logic, which corresponds to OWL DL, a
language for creating ontologies.
7. Ontologies can be created by reusing existing ontologies, while databases are typically
created from scratch: all tables and their content have to be designed anew.
8. There are nevertheless certain similarities between ontologies and databases: a database
can be converted to an ontology and vice versa, using an approach called Ontology Inverse
Engineering.
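The difference between the two world assumptions named in point 4 shows up as soon as a question is asked about a fact that nobody has recorded. The Python sketch below uses a hypothetical fact set and two hypothetical query functions; it illustrates the semantics only and is not how an OWL reasoner or a database engine is implemented.

```python
from enum import Enum

class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"

# Positive facts known to both systems (hypothetical).
FACTS = {("Purkinje cell", "located_in", "cerebellum")}

# Statements explicitly asserted to be false; only the open-world reader needs them.
NEGATED = {("Purkinje cell", "located_in", "spinal cord")}

def ask_closed_world(fact):
    """Database style: anything that is not stored is taken to be false."""
    return Answer.TRUE if fact in FACTS else Answer.FALSE

def ask_open_world(fact):
    """Ontology style: a missing statement means 'not known', not 'false'."""
    if fact in FACTS:
        return Answer.TRUE
    if fact in NEGATED:
        return Answer.FALSE
    return Answer.UNKNOWN

query = ("Purkinje cell", "located_in", "hippocampus")
print(ask_closed_world(query).value)  # false
print(ask_open_world(query).value)    # unknown
```

Under the closed-world reading an incomplete table silently produces "false"; under the open-world reading the same gap stays visible as "unknown" until someone asserts the statement or its negation.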
Purposes of Development of the Databases
1. Data independence;
2. Efficient data access;
3. Data integrity and security;
4. Data administration;
5. Concurrent access and crash recovery;
6. Reduced application development time.
Conclusions
• What is NIF and what are its purposes?
• What is data federation?
• Data federation vs data warehouse
• What is an ontology?
• What is the difference between a database and an ontology?
• What are the advantages of an ontology over a database?
• What is the main goal of an ontology?
• Why are ontologies important?
• What are the applications of ontologies?
• Web Ontology Language (OWL)
Thank You

Contenu connexe

Similaire à Neuroinformatics Databases Ontologies Federated Database.pptx

Bsim0004 Assignment1 Copy Part1
Bsim0004 Assignment1 Copy Part1Bsim0004 Assignment1 Copy Part1
Bsim0004 Assignment1 Copy Part1
Svensson Leung
 

Similaire à Neuroinformatics Databases Ontologies Federated Database.pptx (20)

AIS 3 - EDITED.pdf
AIS 3 - EDITED.pdfAIS 3 - EDITED.pdf
AIS 3 - EDITED.pdf
 
Data Integration in Multi-sources Information Systems
Data Integration in Multi-sources Information SystemsData Integration in Multi-sources Information Systems
Data Integration in Multi-sources Information Systems
 
Exploration of a Data Landscape using a Collaborative Linked Data Framework.
Exploration of a Data Landscape using a Collaborative Linked Data Framework.Exploration of a Data Landscape using a Collaborative Linked Data Framework.
Exploration of a Data Landscape using a Collaborative Linked Data Framework.
 
Using Taxonomies to Create People Directories and Author Networks
Using Taxonomies to Create People Directories and Author Networks Using Taxonomies to Create People Directories and Author Networks
Using Taxonomies to Create People Directories and Author Networks
 
Linked Open Data_mlanet13
Linked Open Data_mlanet13Linked Open Data_mlanet13
Linked Open Data_mlanet13
 
An Improved Annotation Based Summary Generation For Unstructured Data
An Improved Annotation Based Summary Generation For Unstructured DataAn Improved Annotation Based Summary Generation For Unstructured Data
An Improved Annotation Based Summary Generation For Unstructured Data
 
Llinked open data training for EU institutions
Llinked open data training for EU institutionsLlinked open data training for EU institutions
Llinked open data training for EU institutions
 
Open Insights Harvard DBMI - Personal Health Train - Kees van Bochove - The Hyve
Open Insights Harvard DBMI - Personal Health Train - Kees van Bochove - The HyveOpen Insights Harvard DBMI - Personal Health Train - Kees van Bochove - The Hyve
Open Insights Harvard DBMI - Personal Health Train - Kees van Bochove - The Hyve
 
Research data life cycle
Research data life cycleResearch data life cycle
Research data life cycle
 
MC0088 Internal Assignment (SMU)
MC0088 Internal Assignment (SMU)MC0088 Internal Assignment (SMU)
MC0088 Internal Assignment (SMU)
 
Bsim0004 Assignment1 Copy Part1
Bsim0004 Assignment1 Copy Part1Bsim0004 Assignment1 Copy Part1
Bsim0004 Assignment1 Copy Part1
 
Research Data Management and Sharing for the Social Sciences and Humanities
Research Data Management and Sharing for the Social Sciences and HumanitiesResearch Data Management and Sharing for the Social Sciences and Humanities
Research Data Management and Sharing for the Social Sciences and Humanities
 
Paper id 252014139
Paper id 252014139Paper id 252014139
Paper id 252014139
 
Linked Open Data Principles, Technologies and Examples
Linked Open Data Principles, Technologies and ExamplesLinked Open Data Principles, Technologies and Examples
Linked Open Data Principles, Technologies and Examples
 
Reference Model for an Open Archival Information Systems (OAIS): Overview and...
Reference Model for an Open Archival Information Systems (OAIS): Overview and...Reference Model for an Open Archival Information Systems (OAIS): Overview and...
Reference Model for an Open Archival Information Systems (OAIS): Overview and...
 
Database systems Handbook 4th dbms by Muhammad Sharif.pdf
Database systems Handbook 4th  dbms by Muhammad Sharif.pdfDatabase systems Handbook 4th  dbms by Muhammad Sharif.pdf
Database systems Handbook 4th dbms by Muhammad Sharif.pdf
 
Database systems Handbook 4th dbms by Muhammad Sharif.pdf
Database systems Handbook 4th  dbms by Muhammad Sharif.pdfDatabase systems Handbook 4th  dbms by Muhammad Sharif.pdf
Database systems Handbook 4th dbms by Muhammad Sharif.pdf
 
Database systems Handbook 4th dbms by Muhammad Sharif.pdf
Database systems Handbook 4th  dbms by Muhammad Sharif.pdfDatabase systems Handbook 4th  dbms by Muhammad Sharif.pdf
Database systems Handbook 4th dbms by Muhammad Sharif.pdf
 
Martone grethe
Martone gretheMartone grethe
Martone grethe
 
chapter 1-Overview of Information Retrieval.ppt
chapter 1-Overview of Information Retrieval.pptchapter 1-Overview of Information Retrieval.ppt
chapter 1-Overview of Information Retrieval.ppt
 

Dernier

Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
negromaestrong
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
kauryashika82
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
heathfieldcps1
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
SanaAli374401
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
MateoGardella
 

Dernier (20)

Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptxSeal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
 
Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.Gardella_Mateo_IntellectualProperty.pdf.
Gardella_Mateo_IntellectualProperty.pdf.
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 

Neuroinformatics Databases Ontologies Federated Database.pptx

  • 1. Introduction to Databases and Ontologies Sadia Madhuri Rasha B160305043 26/05/23 Computer Science and Engineering Department, Jagannath University 1
  • 2.  Introduction  NIF - the Neuroscience Information Framework  Federating neuroscience-relevant databases  Structured information: data, databases  Information frameworks  Ontologies  Databases  Conclusions Outline 2
  • 3. NIF - Neuroscience Information Framework 3 https://neuinfo.org/  2000 databases  170 federating databases
  • 4.  The Neuroscience Information Framework is a dynamic repository of Web-based neuroscience, global neuroscience web resources, including experimental, clinical, and translational neuroscience databases, knowledge bases, atlases, and genetic/genomic resources and provides many authoritative links throughout the neuroscience portal of Wikipedia. Data format: html, xml, json, csv  Established in 2004 by the National Institutes of Health.  Development of the NIF started in 2008, when the University of California, San Diego School of Medicine obtained an NIH contract to create and maintain "a dynamic inventory of web-based neurosciences data, resources, and tools that scientists and students can access via any computer connected to the Internet. NIF - Neuroscience Information Framework 4 https://en.wikipedia.org/wiki/Neuroscience_Information_Framework
  • 5. NIF- SciCrunch 5 https://scicrunch.org/  The main goal of establishing NIF is to build a culture of data sharing.  This creates a strong community of researchers as “One mind for research”.  SciCrunch- NIF has been adapted to create SciCrunch to create more narrow community based portals based on common data platform.  Cost effective. Data portal can be set up in a few hours.  Allows researchers organize and store their data.  Connects communities through data and tools.  Shared curation shared knowledge
  • 6. Data Federation 6 https://www.tibco.com/reference-center/what-is-a-data-federation  Data Federation: is a software process that allows multiple databases to appear and function as one. This creates virtual database that takes data from a range of sources and converts them all to a common model. This provides a single source of data for front-end applications. A data federation is part of the data virtualization framework. Data virtualization grew with data federation but sprouted extra features, applications, and functions Examples: Teradata with Query Grid, IBM Pure Data Systems with Fluid Query, and SAP HANA with Smart Data Services are some of the products offering data federation.
  • 8.  Data federation eliminates the need to create yet another database or data warehouse and save the cost of creating a permanent, physical relational database.  It manages the integration of individual databases from different sources with a central data store and makes it accessible under a uniform data model through virtualization rather than by physical storage.  It enables querying data from different sources into a single virtual format. As such, data federation has fewer points of potential failure and provides better data integration. A key advantage of the federation approach is that it allows for real-time information access Why data federation? 8
  • 9. Federated Database: system is a type of meta-database management system, which transparently maps multiple autonomous database systems into a single federated database. The constituent databases are fully integrated and interconnected via a computer network and can be geographically decentralized. Data sources could be both structured (relational databases, XML documents, etc) and/or unstructured (Excel spread sheets, medical records, etc). Because various database management systems employ different query languages, federated database systems can apply ‘wrappers’ to the subqueries to translate them into the appropriate query language. Federated Database 9 https://en.wikipedia.org/wiki/Federated_database_system
  • 10. Data Federation VS Data Warehouse 10 Data Federation VS Data Warehouse: Data federation software creates a single repository that doesn’t contain the data itself, rather its metadata (information about the actual data/its location). It involves transformation, cleansing, and possibly even enrichment of data in a decentralized manner while data warehouse or enterprise data warehouse (EDW) is a major alternative to a data federation. It creates a centralized repository that pulls data from multiple sources for analysis and integrates data physically. DF does not need to move or copy data and it prevents data from being changed. While an EDW moves data into a single repository and updates data.
  • 11. Structured vs Unstructured Data 11 Two main categories of data is: structured and unstructured data type. Structured data is highly specific and is stored in a predefined format, where unstructured data is a compilation of many varied types of data that are stored in their native formats. This means that structured data takes advantage of schema-on-write and unstructured data employs schema-on-read.  Structured data consists of clearly defined standardized data types with patterns that make them easily searchable while Unstructured data is more difficult to collect, process, and analyze. Common examples of structured data include customer relationship management (CRM), invoicing systems, product databases, and contact lists. Unstructured data includes various content such as documents, videos, audio files, posts on social media, and emails.
  • 12. Requirements for Effective Data Sharing 12  Discoverability  Data can be found  Accessibility  Data can be accessed  Assessability  The reliability of the data can be determined  Understandability  Data can be understood  Usability  Data are in a usable form  Population Comprehensive coverage of domain Up to date information  Trusted source Provenance of data Quality of curation
  • 13. Introduction to Ontologies 13 https://www.bioontology.org/  Ontology comes from two Greek words: on, which means "being," and logia, which means "study." So ontology is the study of being alive and existing. The term is generally credited to the great Ionian mathematician, scientist, and religious mystic Pythagoras
  • 14. Ontology: An ontology is a formal representation of knowledge that consists of a set of classes representing the concepts that define a field, together with the relationships among these classes. In simple terms, an ontology is a data structure that specifies, for a given application area, the entities, the properties of those entities, and the relationships among the entities.
  • 15. NIF Semantic Framework - NIFSTD: The NIF Standard Ontology (NIFSTD) is a core component of the Neuroscience Information Framework (NIF) project (http://neuinfo.org), a semantically enhanced portal for accessing and integrating neuroscience data, tools, and information. NIFSTD includes a set of modular ontologies that provide a comprehensive collection of terminologies for describing neuroscience data and resources. Fig: Ontology of diseases.
  • 16. Ontology (Cont.): Typically, an ontology can be seen as a 5-tuple whose components are concepts, relationships, functions, instances, and axioms.
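The 5-tuple view maps naturally onto a small data structure. The sketch below is one possible encoding, not a standard one: the `Ontology` class, the two axioms, and the sample wine and winery names are all hypothetical. Functions are modeled as relations in which each input has exactly one output, and axioms as checks that must hold for the whole ontology.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class Ontology:
    concepts: Set[str]                         # classes of the domain
    relationships: Set[Tuple[str, str, str]]   # (concept, relation, concept)
    functions: Dict[str, Dict[str, str]]       # relation with one output per input
    instances: Dict[str, str]                  # instance name -> concept
    axioms: List[Callable[["Ontology"], bool]]  # statements that must always hold

    def violated_axioms(self) -> List[int]:
        """Return the positions of the axioms that do not currently hold."""
        return [i for i, axiom in enumerate(self.axioms) if not axiom(self)]


def instances_are_typed(o: Ontology) -> bool:
    # Every instance must belong to a declared concept.
    return all(concept in o.concepts for concept in o.instances.values())


def relations_use_known_concepts(o: Ontology) -> bool:
    # Both ends of every relationship must be declared concepts.
    return all(s in o.concepts and t in o.concepts for s, _, t in o.relationships)


wine_ontology = Ontology(
    concepts={"Wine", "Winery"},
    relationships={("Wine", "producedBy", "Winery")},
    functions={"producer": {"example_wine": "example_winery"}},
    instances={"example_wine": "Wine", "example_winery": "Winery"},
    axioms=[instances_are_typed, relations_use_known_concepts],
)

problems_before = wine_ontology.violated_axioms()

# Adding an instance of an undeclared concept breaks the first axiom.
wine_ontology.instances["example_grape"] = "Grape"
problems_after = wine_ontology.violated_axioms()
```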
  • 17. Why develop an ontology? An ontology establishes a shared understanding of the information in a particular domain. It lets organizations make better sense of their data. It gives entities an explicit meaning shared by people and software. It arranges the classes in a taxonomic (subclass–superclass) hierarchy and defines the roles and values for those classes. It improves the quality of entity analysis and increases the use, reuse, and maintainability of information systems. It describes the properties of each class, that is, the features and attributes of the class. It facilitates domain knowledge sharing through a common vocabulary across independent software applications.
  • 18. How to develop an ontology? (https://protege.stanford.edu/publications/ontology_development/ontology101.pdf) Step 1: Determine the domain and scope of the ontology. Answering the following questions helps to fix them: What is the domain that the ontology will cover? What are we going to use the ontology for? What types of questions should the information in the ontology answer? Who will use and maintain the ontology?
  • 19. How to develop an ontology? (https://protege.stanford.edu/publications/ontology_development/ontology101.pdf) Step 2: Consider reusing existing ontologies. Refine and extend existing sources for the particular domain and task. Reuse may be a requirement if our system needs to interact with other applications that have already committed to particular ontologies or controlled vocabularies. Step 3: Enumerate important terms in the ontology. List all the terms we want to make statements about or explain to a user, and list the properties those terms have. Example: important wine-related terms include wine, grape, winery, location, and a wine's color, body, flavor, and sugar content, as well as subtypes of wine such as white wine.
  • 20. How to develop an ontology? (https://protege.stanford.edu/publications/ontology_development/ontology101.pdf) Step 4: Define the classes and the class hierarchy. Start by defining classes from the list created in Step 3. The classes in the ontology are the terms that describe objects having independent existence, rather than the terms that describe these objects. Organize the classes into a hierarchical taxonomy by asking whether an object that is an instance of one class is necessarily also an instance of some other class. If a class A is a superclass of class B, then every instance of B is also an instance of A. Possible approaches to creating a class hierarchy: A top-down development process starts with the definition of the most general concepts in the domain. A bottom-up development process starts with the definition of the most specific classes. A combination development process combines the top-down and bottom-up approaches.
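The superclass rule in Step 4 can be checked mechanically. Here is a minimal sketch, assuming the hierarchy is stored as a child-to-parent mapping. The class names follow the wine example, but the `Drink` root and the helper names are hypothetical additions for illustration.

```python
from typing import Dict, List

# Child class -> direct superclass. "Drink" is a made-up root class.
SUPERCLASS_OF: Dict[str, str] = {
    "WhiteWine": "Wine",
    "RedWine": "Wine",
    "Wine": "Drink",
}


def superclasses(cls: str) -> List[str]:
    """Walk up the hierarchy and collect every superclass, nearest first."""
    chain = []
    while cls in SUPERCLASS_OF:
        cls = SUPERCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_instance_of(direct_class: str, target: str) -> bool:
    """An instance of B is also an instance of every superclass A of B."""
    return direct_class == target or target in superclasses(direct_class)
```

For example, an object created as a `WhiteWine` counts as a `Wine` and as a `Drink`, but a plain `Wine` does not count as a `WhiteWine`.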
  • 21. How to develop an ontology? (https://protege.stanford.edu/publications/ontology_development/ontology101.pdf) Step 5: Define the properties of classes (slots). The terms from Step 3 that did not become classes are likely to be properties of these classes. In the wine example, the remaining terms are a wine's color, body, flavor, and sugar content, and the location of a winery. For each property in the list, determine which class it describes. In the example, the Wine class will have the following slots: color, body, flavor, and sugar. All subclasses of a class inherit the slots of that class.
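Slot inheritance can be sketched as a walk up the class hierarchy. This is an illustration under simple assumptions (single inheritance and slots stored as plain lists of names). The dictionaries and the `all_slots` helper are hypothetical, while the slot names come from the wine example.

```python
from typing import Dict, List, Optional

# Child class -> direct superclass (single inheritance assumed).
SUPERCLASS_OF: Dict[str, str] = {"WhiteWine": "Wine"}

# Slots declared directly on each class.
OWN_SLOTS: Dict[str, List[str]] = {
    "Wine": ["color", "body", "flavor", "sugar"],
    "WhiteWine": [],
    "Winery": ["location"],
}


def all_slots(cls: str) -> List[str]:
    """Return the slots of a class, including those inherited from superclasses."""
    slots: List[str] = []
    current: Optional[str] = cls
    while current is not None:
        # Put inherited slots first so the most general ones lead the list.
        slots = OWN_SLOTS.get(current, []) + slots
        current = SUPERCLASS_OF.get(current)
    return slots
```

`WhiteWine` declares no slots of its own, yet `all_slots("WhiteWine")` still reports the four slots it inherits from `Wine`.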
  • 22. How to develop an ontology? (https://protege.stanford.edu/publications/ontology_development/ontology101.pdf) Step 6: Define the facets of the slots. A slot can have different facets describing the value type, the allowed values, the number of values (cardinality), and other features of the values the slot can take. For example, the value of a name slot is a string. Slot cardinality defines how many values a slot can have; some systems distinguish only between single cardinality and multiple cardinality. A value-type facet describes what types of values can fill in the slot.
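Facets are easiest to understand as validation rules applied to the slot values of an instance. The sketch below assumes three facets per slot (value type, cardinality bounds, and an optional set of allowed values). The `Facet` class, the `WINE_FACETS` table, and the sample instances are hypothetical and not taken from any ontology editor.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Facet:
    value_type: type                          # value-type facet
    min_count: int = 0                        # cardinality: lower bound
    max_count: Optional[int] = 1              # cardinality: None means unbounded
    allowed: Optional[Tuple[Any, ...]] = None  # allowed-values facet


WINE_FACETS: Dict[str, Facet] = {
    "name": Facet(str, min_count=1, max_count=1),
    "color": Facet(str, max_count=1, allowed=("red", "white", "rose")),
    "grape": Facet(str, min_count=1, max_count=None),
}


def violations(slots: Dict[str, List[Any]], facets: Dict[str, Facet]) -> List[str]:
    """Check the slot values of one instance against the facets of its class."""
    problems = []
    for slot, facet in facets.items():
        values = slots.get(slot, [])
        if len(values) < facet.min_count:
            problems.append(f"{slot}: too few values")
        if facet.max_count is not None and len(values) > facet.max_count:
            problems.append(f"{slot}: too many values")
        for value in values:
            if not isinstance(value, facet.value_type):
                problems.append(f"{slot}: wrong value type")
            elif facet.allowed is not None and value not in facet.allowed:
                problems.append(f"{slot}: value not allowed")
    return problems


# Creating an instance: choose a class (Wine) and fill in its slot values.
valid_wine = {"name": ["Example White"], "color": ["white"], "grape": ["Riesling"]}
invalid_wine = {"name": ["A", "B"], "color": ["blue"], "grape": []}

valid_report = violations(valid_wine, WINE_FACETS)
invalid_report = violations(invalid_wine, WINE_FACETS)
```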
  • 23. How to develop an ontology? (https://protege.stanford.edu/publications/ontology_development/ontology101.pdf) Step 7: Create instances of classes in the hierarchy. Defining an individual instance of a class requires (1) choosing a class, (2) creating an individual instance of that class, and (3) filling in the slot values.
  • 24. Applications of Ontologies: Machine Learning & Deep Learning: Supervised and unsupervised machine learning algorithms applied to well-known and well-understood data can rapidly detect failures and set corrective actions in motion before severe damage occurs. Business Intelligence (BI): Any enterprise information architecture intended to enable horizontal communication between disparate data sources in related or potentially different domains (e.g., banking and insurance) must identify an ontology for rapidly merging and extracting the Key Data Elements (KDE) necessary for answering essential competency questions.
  • 25. Applications of Ontologies: Ontology in AI: The ontology sets the scene for the knowledge graph to capture the data in a domain by specifying the structure of the knowledge in that domain, including taxonomies, topic maps, logical models, and vocabularies. Application areas: personalized shopping, AI-powered assistants, fraud prevention, administrative tasks automated to aid educators, creating smart content, voice assistants, personalized learning, autonomous vehicles, natural language processing, and cognitive AI. Ontology in research: Ontology helps researchers recognize how certain they can be about the nature and existence of the objects they are researching.
  • 26. Applications of Ontologies: Ontology in Biomedicine: Ontologies are used in the biomedical and health sciences in areas ranging from gene function, as in the Gene Ontology (GO), to healthcare informatics, as in the International Classification of Diseases (ICD).
  • 27. Applications of Ontologies: Ontology in education: The ontology can be used to guide students to understand the organization of their own learning and to self-assess their own progress. Ontology in neuroscience: Ontology is broadly used in the Neuroscience Information Framework (NIF) project. NIFSTD and the Web Ontology Language (OWL) are two important tools for analyzing neuroscience data.
  • 28. Databases vs Ontologies: 1. The fundamental focus of an ontology is to specify and share meaning. The fundamental focus of a database schema is to describe data and storage. 2. A relational database schema has a single purpose: to structure a set of instances for efficient storage and querying. An ontology can also be used to structure a set of instances in a database, but it has a broader range of purposes, including better communication, interoperability, search, and software engineering; it acts as a communication bridge between a human and a machine. 3. When an instance is created in an ontology, the respective mappings must also be created based on rules, which is not necessarily the case for databases. This is why an ontology can support inference that a plain database does not. 4. Ontologies use the open-world assumption (OWA) of knowledge representation, while databases use the closed-world assumption (CWA).
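The difference between the two assumptions shows up when a query asks about a fact that was never stored. The sketch below is a minimal illustration: the fact base and the function names are hypothetical. Under the CWA an absent fact is false; under the OWA it is simply unknown.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# A tiny fact base of (subject, predicate, object) statements.
KNOWN_FACTS: Set[Triple] = {("neuron", "partOf", "nervous_system")}


def ask_closed_world(fact: Triple) -> bool:
    """CWA (databases): whatever is not recorded is taken to be false."""
    return fact in KNOWN_FACTS


def ask_open_world(fact: Triple) -> Optional[bool]:
    """OWA (ontologies): whatever is not recorded is unknown, not false.

    Returns True for a recorded fact and None (unknown) otherwise. A real
    reasoner could also return False when a fact contradicts the axioms;
    that case is left out of this sketch.
    """
    return True if fact in KNOWN_FACTS else None


unrecorded = ("glia", "partOf", "nervous_system")
closed_answer = ask_closed_world(unrecorded)  # the database says "no"
open_answer = ask_open_world(unrecorded)      # the ontology says "unknown"
```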
  • 29. Databases vs Ontologies: 5. Database systems apply normalization to remove redundant data from the tables and to reduce complexity. Normal forms are a set of rules that guide the correct transformation of entities and relationships into the physical layout of the tables; ontologies do not use them. 6. A database uses an ER diagram to describe its structure; this technique is used for abstract and conceptual data representation. In an ontology, however, the structure is expressed in logic, most commonly a description logic, which corresponds to OWL DL, a language for creating ontologies. 7. Ontologies can be created by reusing existing ontologies, while databases are created from scratch: all the tables and their content have to be designed anew. 8. There are nevertheless certain similarities between ontologies and databases. We can convert a database to an ontology and vice versa using ontology reverse engineering.
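One simple way to picture a database-to-ontology conversion is a direct mapping: each table becomes a class, each row an instance, and each column a property. The sketch below follows that idea under stated assumptions and is not the specific reverse engineering approach named on the slide. The function `table_to_triples`, the `wine` table, and the `table/key` naming of instances are hypothetical.

```python
import sqlite3
from typing import Any, List, Tuple

Triple = Tuple[str, str, Any]


def table_to_triples(conn: sqlite3.Connection, table: str, key: str) -> List[Triple]:
    """Map one table to triples: table -> class, row -> instance, column -> property.

    The table name is interpolated into the SQL text, so it must come from
    trusted code and never from user input.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples: List[Triple] = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{table}/{record[key]}"
        triples.append((subject, "rdf:type", table))
        for column, value in record.items():
            # The key is already part of the subject; NULLs carry no statement.
            if column != key and value is not None:
                triples.append((subject, column, value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wine (id INTEGER PRIMARY KEY, color TEXT, body TEXT)")
conn.execute("INSERT INTO wine VALUES (1, 'white', 'light')")
conn.execute("INSERT INTO wine VALUES (2, 'red', NULL)")

wine_triples = table_to_triples(conn, "wine", key="id")
```

The NULL in the second row produces no triple at all, which fits the open-world reading of an ontology: a missing value is unknown rather than false.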
  • 30. Purposes of Development of the Databases: 1. Data independence; 2. Efficient data access; 3. Data integrity and security; 4. Data administration; 5. Concurrent access and crash recovery; 6. Reduced application development time.
  • 31. Conclusions: What is NIF and what are its purposes? What is data federation? Data federation vs data warehouse. What is an ontology? What is the difference between a database and an ontology? What are the advantages of an ontology over a database? What is the main goal of an ontology? Why is ontology important? What are the applications of ontology? Web Ontology Language (OWL).