From Allotrope to reference master data management

V.2.2
Eric Little, PhD
Chief Data Officer
OSTHUS
eric.little@osthus.com
From Allotrope to Reference Master Data Management:
How semantic metadata in .adf can be extended across the enterprise

LIMS
Studies
Registration
The Silo Situation: Expensive, Ineffective and Error Prone
ISO3: DEU, FRA, …
Country: Germany, France, …
ISO3-num: 276, 250, …
?
?
• Applications use different names for the same things.
• Data exchange is expensive and limited (mapping knowledge in interfaces).

Situation with Semantic Reference Master Data Management
LIMS
Study Management
Product Registration
Data Governance
Semantic Reference Master Data System
“France”@en
“FRA”
“250”
…
“EU”
“European Union”@en
registered
“AAFYZ-1217”
products locations
Value of Semantics:
• standardized naming conventions for your core entities
• standardized meta models (vendor agnostic)
• reuse of public ontologies (see e.g., BioPortal)
• well defined hierarchies
• synonyms & mappings
• qualified relationships
• flexibility of graph models
• rules and inference
• data validation
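
The "synonyms & mappings" point can be illustrated with a minimal Python sketch. It assumes a toy in-memory registry rather than any real reference master data product: the class names (`Concept`, `ReferenceMaster`) and the identifier scheme (`country/FRA`) are hypothetical, while the labels come from the slides above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One reference entity plus every label that applications use for it."""
    concept_id: str          # hypothetical identifier scheme, not from the slides
    labels: frozenset


class ReferenceMaster:
    """Toy reference master: resolves any known label to a single concept."""

    def __init__(self):
        self._by_label = {}

    def register(self, concept_id, *labels):
        concept = Concept(concept_id, frozenset(labels))
        for label in labels:
            key = label.casefold()
            if key in self._by_label:
                raise ValueError(f"label {label!r} is already mapped")
            self._by_label[key] = concept
        return concept

    def resolve(self, label):
        """Return the concept for a label, or None if the label is unknown."""
        return self._by_label.get(label.casefold())

    def same_thing(self, label_a, label_b):
        """True when both labels are known and name the same concept."""
        a, b = self.resolve(label_a), self.resolve(label_b)
        return a is not None and a is b


master = ReferenceMaster()
master.register("country/FRA", "France", "FRA", "250")
master.register("country/DEU", "Germany", "DEU", "276")

print(master.same_thing("FRA", "250"))      # two applications, two names
print(master.same_thing("France", "276"))   # different countries
```

Each application keeps its own label, and the mapping knowledge lives in one place instead of in every point-to-point interface.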

Documents are processed for term/concept extraction
Extracted concepts are checked for accuracy
A Gold Standard Doc is created by a human – fully accurate reading
Documents are re-run based on human/machine corrections
Machine Learning improves performance over time
How Text Extraction Basically Works (highly simplified version)
Documents
Text Analytics Engine
• Strains
• Persons
• Organizations
• Seasons
• Locations
• Etc…
Gold Standard Document
Extracted Entities
Human In The Loop
Feedback Loop for Learning/Improvement
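
The loop above (extract, check against a gold standard, correct, re-run) can be sketched as follows. This is a deliberately simple stand-in: it uses a dictionary lookup instead of machine learning, and the lexicon, the sample sentence, and the gold annotations are all made up for illustration.

```python
import re

# Hypothetical term lists; a real text analytics engine would use trained models.
LEXICON = {
    "Persons": {"eric little"},
    "Organizations": {"osthus"},
    "Locations": {"germany", "france"},
}


def extract_entities(text):
    """Return the set of (entity_type, term) pairs found in the text."""
    lowered = text.lower()
    found = set()
    for entity_type, terms in LEXICON.items():
        for term in terms:
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                found.add((entity_type, term))
    return found


def score(extracted, gold):
    """Precision and recall of the extraction against the gold standard."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


doc = "Eric Little of OSTHUS presented in Germany last spring."

# Gold standard: what a human reader says the document contains.
gold = {
    ("Persons", "eric little"),
    ("Organizations", "osthus"),
    ("Locations", "germany"),
    ("Seasons", "spring"),
}

before = score(extract_entities(doc), gold)

# Human-in-the-loop correction: the missed term is fed back to the engine.
LEXICON.setdefault("Seasons", set()).add("spring")

after = score(extract_entities(doc), gold)
print(before, after)
```

Recall rises after the correction is fed back, which is the improvement the feedback loop is meant to produce.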

Extracted entities from the text source are stored in a DB or File Store
They are mapped to other data
• Legacy RDBs
• Semantic Models (shown here)
• Other data sources
The semantic model adds context to the extracted information
• A term can now be related to other objects from other sources
Linking to Semantics (Knowledge Graph)
Semantic Model
Documents
Text Analytics Engine
• Strains
• Persons
• Organizations
• Seasons
• Locations
• Etc…
Extracted Entities
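
How a semantic model adds context to an extracted term can be sketched with a tiny triple set in Python. The node labels reuse values shown earlier in the deck ("France", "FRA", "European Union", "AAFYZ-1217"); the predicate names and the direction of each edge are assumptions made for this illustration, not part of the original model.

```python
# Each triple is (subject, predicate, object). Predicate names are hypothetical.
TRIPLES = {
    ("AAFYZ-1217", "registeredIn", "France"),
    ("France", "partOf", "European Union"),
    ("France", "hasIso3", "FRA"),
}


def related(term, triples=TRIPLES):
    """Return every (predicate, other_node) pair that touches the term."""
    outgoing = {(p, o) for s, p, o in triples if s == term}
    incoming = {("inverse:" + p, s) for s, p, o in triples if o == term}
    return outgoing | incoming


# An extracted location is no longer an isolated string: it now links
# to a product registration and to a parent region.
for predicate, node in sorted(related("France")):
    print(predicate, node)
```

A production system would typically hold such a graph in an RDF store and query it with SPARQL; plain tuples are used here only to keep the sketch dependency-free.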

A Semantic Framework can connect the entire enterprise using common semantics
The Semantic Hub should only focus on metadata (not instance level data)
Benefits: Common Terms, Models, Queries, Rules and Results (End-to-End)
Integrating Data Across the Enterprise
Lab Instruments, Clinical Trials, Regulatory Affairs, Production, eArchiving

Allotrope Structure 2017
Astrix Technology Group
BSSN Software
Elemental Machines
Erasmus MC
Fraunhofer IPA
The HDF Group
LabAnswer
LabWare
Mettler Toledo
NIST
SciBite
Stanford University
University of Illinois at Chicago
University of Southampton

The Allotrope Framework

Allotrope Data Format (ADF)
HDF5: Platform Independent File Format
• Specifically designed to store and organize large amounts of scientific data.
Data Description (Semantic Graph Model)
• Descriptive metadata about method, instrument, sample, process, result, etc.
• Provenance, audit trail
• Data Cube, Data Package
Data Cubes (Universal Data Container)
• Analytical data represented by one- or multidimensional arrays of homogeneous data structures.
Data Package (Virtual File System)
• Analytical data represented by arbitrary formats, incl. native instrument formats, images, PDF, video, etc.
APIs (Java & .NET class libraries)
Chromatogram 2D HDF
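
The three-part layout described above (data description, data cubes, data package) can be pictured with a small Python sketch. This is not the Allotrope API, which the slide describes as Java and .NET class libraries: `ContainerSketch` and all of its members are hypothetical, and the chromatogram values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerSketch:
    """Hypothetical stand-in for one file holding all three layers."""
    description: set = field(default_factory=set)      # (subject, predicate, object)
    data_cubes: dict = field(default_factory=dict)     # name -> rows of numbers
    data_package: dict = field(default_factory=dict)   # virtual path -> raw bytes

    def add_cube(self, name, rows):
        """Store a rectangular numeric array and describe it in the metadata."""
        if len({len(row) for row in rows}) > 1:
            raise ValueError("data cube rows must all have the same length")
        self.data_cubes[name] = [[float(v) for v in row] for row in rows]
        self.description.add((name, "type", "DataCube"))

    def add_file(self, path, payload):
        """Store an arbitrary file (native format, image, PDF) as raw bytes."""
        self.data_package[path] = bytes(payload)
        self.description.add((path, "type", "File"))


run = ContainerSketch()
# Invented (time, absorbance) pairs standing in for a 2D chromatogram.
run.add_cube("chromatogram", [[0.0, 0.01], [0.1, 0.35], [0.2, 0.12]])
run.add_file("/raw/instrument.bin", b"\x00\x01")

print(sorted(run.description))
```

The point of the sketch is that the metadata layer describes the other two layers, so a reader can discover what the file contains without parsing the payloads.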

Example Use Case
HPLC – UV
Mobile Phase Selection

Ontology for HPLC Example
result
device
material
process

What mobile phase is required?
MeCN/MeOH 40/60
Expected answer:
• specified percentage of components, e.g. 25% A, 75% B
• specified composition of components, e.g. A = 0.5 mol/L Acetonitrile, B = Methanol
• specified qualities of chemical compounds
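
A shorthand answer such as "MeCN/MeOH 40/60" only becomes machine-usable once it is broken into named components with percentages. A minimal Python sketch of that step, assuming the shorthand is always "names separated by slashes, a space, then percentages separated by slashes" (the function name `parse_mobile_phase` is hypothetical):

```python
def parse_mobile_phase(spec):
    """Turn 'MeCN/MeOH 40/60' into {'MeCN': 40.0, 'MeOH': 60.0}."""
    names_part, ratios_part = spec.split()
    names = names_part.split("/")
    ratios = [float(r) for r in ratios_part.split("/")]
    if len(names) != len(ratios):
        raise ValueError("each component needs exactly one percentage")
    if abs(sum(ratios) - 100.0) > 1e-9:
        raise ValueError("percentages must add up to 100")
    return dict(zip(names, ratios))


print(parse_mobile_phase("MeCN/MeOH 40/60"))
```

The parsed result covers only the percentage of each component; the composition and quality of each component would need further structured fields.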

What mobile phase is required?
specification of mobile phase
composition of mobile phase
device
experiment

Models to Capture Plans, Workflows (Processes), Entities & Results

Applying Allotrope to eArchiving

Using ADF for eArchiving in ZONTAL

Applying Allotrope to eDecision

Manual State of Batch Comparison
Final Step
Manual Report
LIMS
Purity Summary – Crude to Drug Prod
Batch Comparison Table
Early/Late Impurities
Batch to Batch Comparison
Submission/Sample #
Embedded in ELN
Analyst
ELN
Manual Communication
SME
• Significant amounts of manual effort
• Disconnected data sources
• Locally stored information
• Lack of traceability
• Data is difficult to interpret or manipulate
Instrument Data
Inst. File
DB
% Purity Full Length Prod
% Indiv Impurities
• No Automation
• Limited Batch Comparisons can be produced
• Limited Distribution

Integration for Batch Comparison
Final Step
Manual Report
LIMS
Purity Summary – Crude to Drug Prod
Batch Comparison Table
Early/Late Impurities
Batch to Batch Comparison
Submission/Sample #
Embedded in ELN
Analyst
ELN
Manual Communication
SME
• Shows data integration capabilities from LIMS + ELN data
• Utilizes important metadata
• Metadata is a key component of ADF files (Data Description)
Instrument Data
Inst. File
DB
% Purity Full Length Prod
% Indiv Impurities
• Can be expanded to include all Batch Comparison steps
• Provides Integration + Automation over time

Moving to “product genealogy”
• ZONTAL integrates data across the enterprise
• Reporting and visibility utilizes the entire Data Lake
• Instrument data is captured via the Allotrope Framework
• Expanded to include all scientific data feeding into ELN, LIMS, etc.
Enterprise-Wide User Community

Benefits of Data Lifecycle Management
Cost Saving Measures:
• Scientists spend more time doing science – not computer science
• Data can be generated and found easily – saves time/money
• Conceptual information is more easily shared/understood upstream and downstream (with traceability)
• Faster project decisions can be made (with more complete data)
• Managing data/projects across multiple locations/labs is easier
• Integration provides a more complete picture
Innovation:
• Leading your organization to better leverage the value of Data Science
• Adopting new technologies fosters new ideas and breakthroughs
• 86% of CEOs surveyed said “technological advances will transform business the most over the next 5 years” (PWC, Jan 2014)