Maryann E. Martone, Ph.D.
University of California, San Diego

“A grand challenge in neuroscience is to elucidate brain function in relation to its multiple layers of organization that operate at different spatial and temporal scales. Central to this effort is tackling ‘neural choreography’ -- the integrated functioning of neurons into brain circuits. Neural choreography cannot be understood via a purely reductionist approach. Rather, it entails the convergent use of analytical and synthetic tools to gather, analyze and mine information from each level of analysis, and capture the emergence of new layers of function (or dysfunction) as we move from studying genes and proteins, to cells, circuits, thought, and behavior.... However, the neuroscience community is not yet fully engaged in exploiting the rich array of data currently available, nor is it adequately poised to capitalize on the forthcoming data explosion.”
Akil et al., Science, Feb 11, 2011

•  In that same issue of Science
–  Asked peer reviewers from last year about the availability and use of data
•  About half of those polled store their data only in their laboratories, not an ideal long-term solution.
•  Many bemoaned the lack of common metadata and archives as a main impediment to using and storing data, and most of the respondents have no funding to support archiving
•  And even where accessible, much data in many fields is too poorly organized to enable it to be efficiently used.

“...it is a growing challenge to ensure that data produced during the course of reported research are appropriately described, standardized, archived, and available to all.”
Lead Science editorial, 2011

Neuroscience is unlikely to be served by a few large databases like the genomics and proteomics community
Whole brain data (20 µm microscopic MRI)
Mosaic LM images (1 GB+)
Conventional LM images
Individual cell morphologies
EM volumes & reconstructions
Solved molecular structures
No single technology serves these all equally well.
Multiple data types; multiple scales; multiple databases
http://neuinfo.org

•  Current web is designed to share documents
–  Documents are unstructured data
•  Much of the content of digital resources is part of the “hidden web”
•  Wikipedia: The Deep Web (also called Deepnet, the invisible Web, DarkNet, Undernet or the hidden Web) refers to World Wide Web content that is not part of the Surface Web, which is indexed by standard search engines.

•  NIF has developed a production technology platform for researchers to:
–  Discover
–  Share
–  Analyze
–  Integrate
neuroscience-relevant information
•  Since 2008, NIF has assembled the largest searchable catalog of neuroscience data and resources on the web
•  Cost-effective and innovative strategy for managing data assets
“This unique data depository serves as a model for other Web sites to provide research data.” - Choice Reviews Online
NIF is poised to capitalize on the new tools and emphasis on big data and open science
http://neuinfo.org

June 10, 2013, dkCOIN Investigator's Retreat
•  A portal for finding and using neuroscience resources
–  A consistent framework for describing resources
–  Provides simultaneous search of multiple types of information, organized by category
–  Supported by an expansive ontology for neuroscience
–  Utilizes advanced technologies to search the “hidden web”
UCSD, Yale, Cal Tech, George Mason, Washington Univ

Literature
Database Federation
Registry
• NIF Registry: A catalog of neuroscience-relevant resources
• > 6000 currently listed
• > 2200 databases
• And we are finding more every day
“Of relevance to neuroscience” is very broad

• NIF curators
• Nomination by the community
• Semi-automated text mining pipelines
–  NIF Registry
–  Requires no special skills
–  Site map available for local hosting
• NIF Data Federation
• DISCO interop
• Requires some programming skill
Low barrier to entry

•  Extended over time
–  Parent resource
–  Supporting agency
–  Grant numbers
–  Accessibility
–  Related to
–  Organism
–  Disease or condition
–  Last updated
First catalog: SFN Neuroscience Database Gateway; NIF 0.5; NIF 1.0+
Timeline: ~2003; 2006; 2008
Simple metadata model: Name, description, type, URL, other names, keywords, unique identifier
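
The simple metadata model, together with the fields added over time, can be pictured as a small record type. The sketch below is illustrative only: the class name `RegistryEntry`, the field names, and the example values are assumptions made for this example and are not the NIF Registry's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RegistryEntry:
    """Hypothetical record mirroring the fields named on the slide."""
    # Core fields of the simple metadata model
    name: str
    description: str
    resource_type: str
    url: str
    unique_identifier: str
    other_names: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    # Fields added as the model was extended over time
    parent_resource: Optional[str] = None
    supporting_agency: Optional[str] = None
    grant_numbers: List[str] = field(default_factory=list)
    accessibility: Optional[str] = None
    related_to: List[str] = field(default_factory=list)
    organism: Optional[str] = None
    disease_or_condition: Optional[str] = None
    last_updated: Optional[str] = None

    def missing_core_fields(self) -> List[str]:
        """List the core fields that were left empty."""
        core = ["name", "description", "resource_type", "url", "unique_identifier"]
        return [attr for attr in core if not getattr(self, attr)]


# Placeholder values, invented for illustration
entry = RegistryEntry(
    name="Example Database",
    description="A placeholder resource used only for illustration.",
    resource_type="database",
    url="http://example.org",
    unique_identifier="example-0001",
    keywords=["neuroscience"],
)
```

Keeping the extended fields optional reflects the low barrier to entry: a resource can be listed with only the core fields and enriched later.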
  
•  NIF Registry is hosted on the Semantic MediaWiki platform Neurolex
–  Community can add, review, edit without special privileges
–  Searchable by Google
–  Integrated with NIF ontologies
–  Graph structure
Semantic wiki: A wiki with semantics; pages are linked through relationships
NIF is creating the linked data graph of resources

–  NIF employs an automated link checker
–  Last analysis: 478/6100 invalid URLs (~8%)
–  199 could not be located at another university or location; out of service (~3%)
–  Bigger issue: Many resources are no longer updated or maintained
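
A link check of this kind can be sketched with the Python standard library. The slides do not describe how NIF's checker works, so the function names `probe`, `find_invalid` and `percent` are hypothetical, and the HEAD-request strategy is an assumption (some servers reject HEAD, so a real checker would likely fall back to GET). The last helper simply reproduces the rounded percentages quoted above.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Tuple


def probe(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request succeeds with a status below 400."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # HTTP errors, DNS failures, timeouts and malformed URLs all count as invalid
        return False


def find_invalid(urls: Iterable[str],
                 is_alive: Callable[[str], bool] = probe) -> Tuple[List[str], float]:
    """Return the URLs that failed the check and their share of the total."""
    url_list = list(urls)
    invalid = [u for u in url_list if not is_alive(u)]
    share = len(invalid) / len(url_list) if url_list else 0.0
    return invalid, share


def percent(part: int, whole: int) -> int:
    """Round part/whole to the nearest whole percent."""
    return round(100 * part / whole)
```

With the figures from the slide, `percent(478, 6100)` gives 8 and `percent(199, 6100)` gives 3. Passing `is_alive` as a parameter keeps the counting logic testable without network access.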
  
[Chart: Resources added and Last updated, by year, 1996 to 2014]
Keeping content up to date

Connectome
Tractography
Epigenetics
• New tags come into existence
• New resource types come into existence, e.g., Mobile apps
• Resources add new types of content
• Change name
• Change scope
• > 7000 updates to the registry last year
It’s a challenge to keep the registry up to date; sitemaps, curation, ontologies, community review

• The NIF Registry has created a linked data graph of web-accessible resources
• Maintained on a community wiki platform
• Provides data on the fluidity of the resource landscape
–  New resources continue to be created and found
–  Relatively few disappear altogether
–  Many more grow stale, although their value may still be significant
–  Maintaining up to date curation requires frequent updating
NIF Registry provides insight into the state of digital resources on the web

• The NIF data federation performs deep search over the content of over 200 databases
• New databases are added at a rate of 25-40 per year
• Latest update: Open Source Brain; ingest completed in 2 hours
• Databases chosen on a variety of criteria:
• Early: testing different types of resources
• Thematic areas
• Volunteers
NIF provides access to the largest aggregation of neuroscience-relevant information on the web

•  NIF was one of the first projects to attempt data integration in the neurosciences on a large scale
•  NIF is supported by a contract that specified the number of resources to be added per year
–  Designed to be populated rapidly; set up process for progressive refinement
–  No budget was allocated to retrofit existing resources; had to work with them in their current state
–  We designed a system that required little to no cooperation or work from providers
–  Supports many formats: relational, XML, RDF

Current
Planned
DISCO Dashboard Functions
•  Ingest Script Manager
•  Public Script Repository
•  Data & Event Tracker
•  Versioning System
•  Curator Tool
•  Data Transformer Manager
Luis Marenco, Rixin Wang, Perry Miller, Gordon Shepherd
Yale University

[Chart: Number of Federated Databases and Number of Federated Records (Millions), x-axis 6-12 through 5-17]
NIF searches the largest collation of neuroscience-relevant data on the web
DISCO

Results categorized by data type and level of nervous system
Hippocampus OR “Cornu Ammonis” OR “Ammon’s horn”
Query expansion: Synonyms and related concepts
Boolean queries
Data sources categorized by “data type” and level of nervous system
Common views across multiple sources
Tutorials for using full resource when getting there from NIF
Link back to record in original source
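
The query expansion shown on this slide can be sketched as a lookup that turns one search term into a Boolean OR over its synonyms. This is a minimal illustration, not NIF's implementation: the `SYNONYMS` table and the function names are hypothetical, and only the hippocampus example comes from the slide.

```python
from typing import Dict, List

# Hypothetical synonym table; the single entry is the example from the slide
SYNONYMS: Dict[str, List[str]] = {
    "hippocampus": ["Cornu Ammonis", "Ammon's horn"],
}


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def expand_query(term: str, synonyms: Dict[str, List[str]] = SYNONYMS) -> str:
    """Expand a term into an OR query over the term and its known synonyms."""
    variants = [term] + synonyms.get(term.lower(), [])
    return " OR ".join(quote(v) for v in variants)


print(expand_query("Hippocampus"))
# Hippocampus OR "Cornu Ammonis" OR "Ammon's horn"
```

A term with no entry in the table is returned unchanged, so expansion never narrows a search.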
  
Connects to
Synapsed with
Synapsed by
Input region
innervates
Axon innervates
Projects to
Cellular contact
Subcellular contact
Source site
Target site
Each resource implements a different, though related model; systems are complex and difficult to learn, in many cases

• NIF Connectivity: 7 databases containing connectivity primary data or claims from literature on connectivity between brain regions
• Brain Architecture Management System (rodent)
• Temporal lobe.com (rodent)
• Connectome Wiki (human)
• Brain Maps (various)
• CoCoMac (primate cortex)
• UCLA Multimodal database (Human fMRI)
• Avian Brain Connectivity Database (Bird)
• Total: 1800 unique brain terms (excluding Avian)
• Number of exact terms used in > 1 database: 42
• Number of synonym matches: 99
• Number of 1st order partonomy matches: 385
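
The difference between exact and synonym matching across databases can be illustrated with a short count of shared terms. The sketch below uses invented toy data and a hypothetical function name; it does not reproduce the analysis behind the numbers above, and partonomy matching is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


def terms_in_multiple_databases(
    databases: Dict[str, Iterable[str]],
    preferred_label: Optional[Dict[str, str]] = None,
) -> Set[str]:
    """Return normalized terms that occur in more than one database.

    Without a synonym table only exact (case-insensitive) matches count;
    with one, synonyms are folded onto a single preferred label first.
    """
    preferred_label = preferred_label or {}
    sources = defaultdict(set)
    for db_name, terms in databases.items():
        for term in terms:
            key = term.strip().lower()
            key = preferred_label.get(key, key)
            sources[key].add(db_name)
    return {term for term, dbs in sources.items() if len(dbs) > 1}


# Toy data, invented for illustration
databases = {
    "db_a": ["Striatum", "Hippocampus"],
    "db_b": ["striatum", "Ammon's horn"],
    "db_c": ["Olfactory bulb"],
}
synonyms = {"ammon's horn": "hippocampus", "cornu ammonis": "hippocampus"}

exact = terms_in_multiple_databases(databases)
with_synonyms = terms_in_multiple_databases(databases, synonyms)
```

In this toy example only the striatum is shared under exact matching, while the synonym table also links the two hippocampus entries, which is the kind of gain the synonym count on the slide reflects.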
  
– You (and the machine) have to be able to find it
•  Accessible through the web
•  Annotations
– You have to be able to access and use it
•  Data type specified and in a usable form
– You have to know what the data mean
•  Some semantics: “1”
•  Context: Experimental metadata
•  Provenance: Where did the data come from?
Reporting neuroscience data within a consistent framework helps enormously

Knowledge in space and spatial relationships (the “where”)
Knowledge in words, terminologies and logical relationships (the “what”)
•  NIF covers multiple structural scales and domains of relevance to neuroscience
•  Aggregate of community ontologies with some extensions for neuroscience, e.g., Gene Ontology, ChEBI, Protein Ontology
NIFSTD
Organism; NS Function; Molecule; Investigation; Subcellular structure; Macromolecule; Gene; Molecule Descriptors; Techniques; Reagent; Protocols; Cell; Resource; Instrument; Dysfunction; Quality; Anatomical Structure
NIF capitalizes on the growing set of community ontologies available in biomedical science

Purkinje Cell; Axon Terminal; Axon; Dendritic Tree; Dendritic Spine; Dendrite; Cell body; Cerebellar cortex
There is little obvious connection between data sets taken at different scales using different microscopies without an explicit representation of the biological objects that the data represent
Brain (has a) Cerebellum (has a) Purkinje Cell Layer (has a) Purkinje cell (is a) neuron

•  Ontology: an explicit, formal representation of concepts and relationships among them within a particular domain that expresses human knowledge in a machine readable form
–  Branch of philosophy: a theory of what is
–  e.g., Gene ontologies
•  Provide universals for navigating across different data sources
–  Semantic “index”
•  Provide the basis for concept-based queries to probe and mine data
–  Perform reasoning
–  Link data through relationships not just one-to-one mappings
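
Linking data through relationships can be sketched with a handful of subject-predicate-object triples and a transitive walk over them. The example below is a toy, not NIFSTD: the triples encode only the Brain to Purkinje cell chain from the earlier slide, and the predicate names `has_part` and `is_a` and all function names are assumptions chosen for this illustration.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Toy knowledge base built from the has-a / is-a chain on the earlier slide
TRIPLES: List[Triple] = [
    ("Brain", "has_part", "Cerebellum"),
    ("Cerebellum", "has_part", "Purkinje cell layer"),
    ("Purkinje cell layer", "has_part", "Purkinje cell"),
    ("Purkinje cell", "is_a", "Neuron"),
]


def objects(subject: str, predicate: str, triples: List[Triple] = TRIPLES) -> List[str]:
    """Return every object linked to the subject by the given predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def all_parts(whole: str, triples: List[Triple] = TRIPLES) -> List[str]:
    """Follow has_part links transitively and collect every part found."""
    found: List[str] = []
    stack = [whole]
    while stack:
        current = stack.pop()
        for part in objects(current, "has_part", triples):
            if part not in found:
                found.append(part)
                stack.append(part)
    return found


def parts_of_kind(whole: str, kind: str, triples: List[Triple] = TRIPLES) -> List[str]:
    """Return the parts of `whole` that are declared to be a `kind`."""
    return [p for p in all_parts(whole, triples) if kind in objects(p, "is_a", triples)]


print(parts_of_kind("Brain", "Neuron"))
# ['Purkinje cell']
```

No single triple states that the brain contains neurons; the answer follows from chaining relationships, which is the simple form of reasoning a one-to-one mapping cannot provide.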
  
“Search computing”
What genes are upregulated by drugs of abuse in the adult mouse?
Morphine
Increased expression
Adult Mouse
Some concepts, e.g., age category, are quantitative but still must be interpreted in a global query system

http://neurolex.org
Stephen Larson
• Provide a simple interface for defining the concepts required
• Light weight semantics
• Good teaching tool for learning about semantic integration and the benefits of a consistent semantic framework
• Community based:
• Anyone can contribute their terms, concepts, things
• Anyone can edit
• Anyone can link
• Accessible: searched by Google
• Growing into a significant knowledge base for neuroscience
Demo D03
200,000 edits
150 contributors

•  NIF can be used to survey the data landscape
•  Analysis of NIF shows multiple databases with similar scope and content
•  Many contain partially overlapping data
•  Data “flows” from one resource to the next
–  Data is reinterpreted, reanalyzed or added to
•  Is duplication good or bad?
Databases come in many shapes and sizes

•  Primary data:
–  Data available for reanalysis, e.g., microarray data sets from GEO; brain images from XNAT; microscopic images (CCDB/CIL)
•  Secondary data
–  Data features extracted through data processing and sometimes normalization, e.g., brain structure volumes (IBVD), gene expression levels (Allen Brain Atlas); brain connectivity statements (BAMS)
•  Tertiary data
–  Claims and assertions about the meaning of data
•  E.g., gene upregulation/downregulation, brain activation as a function of task
•  Registries:
–  Metadata
–  Pointers to data sets or materials stored elsewhere
•  Data aggregators
–  Aggregate data of the same type from multiple sources, e.g., Cell Image Library, SUMSdb, Brede
•  Single source
–  Data acquired within a single context, e.g., Allen Brain Atlas
Researchers are producing a variety of information artifacts using a multitude of technologies

NIF Analytics: The Neuroscience Landscape
NIF is in a unique position to answer questions about the neuroscience landscape
Where are the data?
[Figure: data sources plotted against brain regions; labeled regions: Striatum, Hypothalamus, Olfactory bulb, Cerebral cortex, Brain; axes: Brain region, Data source]
Vadim Astakhov, Kepler Workflow Engine

Diseases of nervous system
Adding more semantics
The combination of ontologies, diverse data and analytics lets us look at the current landscape in interesting ways
Neurodegenerative
Seizure disorders
Neoplastic disease of nervous system
NIH Reporter
NIF data federated sources

•  Gemma: Gene ID + Gene Symbol
•  DRG: Gene name + Probe ID
•  Gemma presented results relative to baseline chronic morphine; DRG with respect to saline, so direction of change is opposite in the 2 databases
• Analysis:
• 1370 statements from Gemma regarding gene expression as a function of chronic morphine
• 617 were consistent with DRG; over half of the claims of the paper were not confirmed in this analysis
• Results for 1 gene were opposite in DRG and Gemma
• 45 did not have enough information provided in the paper to make a judgment
Relatively simple standards would make life easier
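
Because the two databases report change against different baselines, directions have to be re-expressed against a common baseline before they can be compared. The sketch below shows that step under stated assumptions: the function names and the "up"/"down" encoding are hypothetical, and the real analysis also had to reconcile the gene identifiers, which is not shown here.

```python
FLIP = {"up": "down", "down": "up"}


def relative_to_saline(direction: str, baseline: str) -> str:
    """Re-express a direction of change against a saline baseline."""
    if baseline == "saline":
        return direction
    if baseline == "chronic morphine":
        # Swapping which condition is the baseline reverses the direction
        return FLIP[direction]
    raise ValueError(f"unknown baseline: {baseline!r}")


def is_consistent(gemma_direction: str, drg_direction: str) -> bool:
    """Compare a Gemma statement (morphine baseline) with a DRG one (saline baseline)."""
    return relative_to_saline(gemma_direction, "chronic morphine") == drg_direction


# Figures from the slide: 617 of 1370 statements were consistent
consistent_share = 617 / 1370
```

Here `consistent_share` is about 0.45, which matches the statement that over half of the claims were not confirmed. Recording the baseline alongside each result is one of the relatively simple standards that would make this normalization unnecessary.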
  
NIF favors a hybrid, tiered, federated system
•  Domain knowledge
–  Ontologies
•  Claims, models and observations
–  Virtuoso RDF triples
–  Model repositories
•  Data
–  Data federation
–  Spatial data
–  Workflows
•  Narrative
–  Full text access
Neuron; Brain part; Disease; Organism; Gene; Technique; People
Caudate projects to SNpc
Grm1 is upregulated in chronic cocaine
Betz cells degenerate in ALS
NIF provides the tentacles that connect the pieces: a new type of entity for 21st century science

•  2006-2008: A survey of what was out there
•  2008-2009: Strategy for resource discovery
–  NIF Registry vs NIF data federation
–  Ingestion of data contained within different technology platforms, e.g., XML vs relational vs RDF
–  Effective search across semantically diverse sources
•  NIFSTD ontologies
•  2009-2011: Strategy for data integration
–  Unified views across common sources
–  Mapping of content to NIF vocabularies
•  2011-present: Data analytics
–  Uniform external data references
•  2012-present: SciCrunch: unified biomedical resource services
NIF provides a strategy and set of tools applicable to all domains grappling with multiple sources of diverse data (i.e., pretty much everything)

•  Search semantics
•  Ranking
•  Resources supported by NIH Blueprint Institutes are more thoroughly covered
•  Data types, e.g., Brain activation foci

SciCrunch
NIF
MONARCH
Community Services
dkCOIN
Shared Resources
Undiagnosed Disease Program
Phenotype RCN
3D Virtual Cell
National Institute on Aging
One Mind for Research
BIRN
International Neuroinformatics Coordinating Facility
Model Organism Databases
Community Outreach
DELSA
(not just a data catalog)

• 3dVC: Focus on models and simulation
• Gene Ontology: Focus on bioinformatics tools
• National Institute on Aging: Aging-related data sets
• Monarch: Phenotype-Genotype; deep semantic data integration
• One Mind for Research: Biospecimen repositories
• NeuroGateway: Computational resources
• FORCE11: Tools for next-gen publishing and e-scholarship
SciCrunch
SciCrunch is actively supporting multiple communities; multiple communities are enriching and improving SciCrunch

Community database: beginning
Community database: End
“How do I share my data/tool?”
“There is no database for my data”
Institutional repositories
Cloud
INCF: Global infrastructure
Government
Education
Industry
Tool repositories
NIF is designed to leverage existing investments in resources and infrastructure

•  No one can be stopped from doing what they need to do
•  Every resource is resource limited: few have enough time, money, staff or expertise required to do everything they would like
–  If the market can support 11 MRI databases, fine
–  Some consolidation, coordination is warranted though
•  Big, broad and messy beats small, narrow and neat
–  Without trying to integrate a lot of data, we will not know what needs to be done
–  A lot can be done with messy data; neatness helps though
–  Progressive refinement; addition of complexity through layers
•  Be flexible and opportunistic
–  A single optimal technology/container for all types of scientific data and information does not exist; technology is changing
•  Think globally; act locally:
–  No source, not even NIF, is THE source; we are all a source

•  Several powerful trends should change the way we think about our data: One → Many
–  Many data
•  Generation of data is getting easier → shared data
•  Data space is getting richer: more -omes every day
•  But...compared to the biological space, still sparse
–  Many eyes
•  Wisdom of crowds
•  More than one way to interpret data
–  Many algorithms
•  Not a single way to analyze data
–  Many analytics
•  “Signatures” in data may not be directly related to the question for which they were acquired but tell us something really interesting
Are you exposing or burying your work?

Jeff Grethe, UCSD, Co Investigator, Interim PI
Amarnath Gupta, UCSD, Co Investigator
Anita Bandrowski, NIF Project Leader
Gordon Shepherd, Yale University
Perry Miller
Luis Marenco
Rixin Wang
David Van Essen, Washington University
Erin Reid
Paul Sternberg, Cal Tech
Arun Rangarajan
Hans Michael Muller
Yuling Li
Giorgio Ascoli, George Mason University
Sridevi Polavarum
Fahim Imam
Larry Lui
Andrea Arnaud Stagg
Jonathan Cachat
Jennifer Lawrence
Svetlana Sulima
Davis Banks
Vadim Astakhov
Xufei Qian
Chris Condit
Mark Ellisman
Stephen Larson
Willie Wong
Tim Clark, Harvard University
Paolo Ciccarese
Karen Skinner, NIH, Program Officer (retired)
Jonathan Pollock, NIH, Program Officer
And my colleagues in Monarch, dkNet, 3DVC, Force 11

Contenu connexe

Tendances

The Path to Enlightened Solutions for Biodiversity's Dark Data
The Path to Enlightened Solutions for Biodiversity's Dark DataThe Path to Enlightened Solutions for Biodiversity's Dark Data
The Path to Enlightened Solutions for Biodiversity's Dark Datavbrant
 
Heidorn The Path to Enlightened Solutions for Biodiversity's Dark DataViBRANT...
Heidorn The Path to Enlightened Solutions for Biodiversity's Dark DataViBRANT...Heidorn The Path to Enlightened Solutions for Biodiversity's Dark DataViBRANT...
Heidorn The Path to Enlightened Solutions for Biodiversity's Dark DataViBRANT...Bryan Heidorn
 
How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?How Portable Are the Metadata Standards for Scientific Data?
How Portable Are the Metadata Standards for Scientific Data?Jian Qin
 
How do we know what we don't know?  Exploring the data and knowledge space th...
How do we know what we don't know?  Exploring the data and knowledge space th...How do we know what we don't know?  Exploring the data and knowledge space th...
How do we know what we don't know?  Exploring the data and knowledge space th...Maryann Martone
 
Next Steps for IMLS's National Digital Platform
Next Steps for IMLS's National Digital PlatformNext Steps for IMLS's National Digital Platform
Next Steps for IMLS's National Digital PlatformTrevor Owens
 
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Semantics for Bioinformatics: What, Why and How of Search, Integration and An...
Semantics for Bioinformatics: What, Why and How of Search, Integration and An...Amit Sheth
 
Museum Data Exchange
Museum Data ExchangeMuseum Data Exchange
Museum Data ExchangeOCLC Research
 
EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EiTESAL eHealth Conference 14&15 May 2017
EiTESAL eHealth Conference 14&15 May 2017 EITESANGO
 
The Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsThe Seven Deadly Sins of Bioinformatics
The Seven Deadly Sins of BioinformaticsDuncan Hull
 
2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 TutorialAlexander Pico
 
Exploring a world of networked information built from free-text metadata
Exploring a world of networked information built from free-text metadataExploring a world of networked information built from free-text metadata
Exploring a world of networked information built from free-text metadataShenghui Wang
 
Poster RDAP13: Data information literacy multiple paths to a single goal
Poster RDAP13: Data information literacy multiple paths to a single goalPoster RDAP13: Data information literacy multiple paths to a single goal
Poster RDAP13: Data information literacy multiple paths to a single goalASIS&T
 
Towards collaboration at scale: Libraries, the social and the technical
Towards collaboration at scale:  Libraries, the social and the technicalTowards collaboration at scale:  Libraries, the social and the technical
Towards collaboration at scale: Libraries, the social and the technicallisld
 
g-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking Functionalityg-Social - Enhancing e-Science Tools with Social Networking Functionality
g-Social - Enhancing e-Science Tools with Social Networking FunctionalityNicholas Loulloudes
 
ALA 2010 -- Jane Burke
ALA 2010 -- Jane BurkeALA 2010 -- Jane Burke
ALA 2010 -- Jane Burkebisg
 
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...
Semantic Web for 360-degree Health: State-of-the-Art & Vision for Better Inte...Amit Sheth
 
Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentation
Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI PresentationOpen Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentation
Open Context and Publishing to the Web of Data: Eric Kansa's LAWDI Presentationekansa
 

JET Technology Labs White Paper for Virtualized Security and Encryption Techn...
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 

Neurosciences Information Framework (NIF): An example of community Cyberinfrastructure for the Neurosciences

  • 1. Maryann E. Martone, Ph.D. University of California, San Diego
  • 2. “A grand challenge in neuroscience is to elucidate brain function in relation to its multiple layers of organization that operate at different spatial and temporal scales. Central to this effort is tackling “neural choreography” -- the integrated functioning of neurons into brain circuits -- Neural choreography cannot be understood via a purely reductionist approach. Rather, it entails the convergent use of analytical and synthetic tools to gather, analyze and mine information from each level of analysis, and capture the emergence of new layers of function (or dysfunction) as we move from studying genes and proteins, to cells, circuits, thought, and behavior.... However, the neuroscience community is not yet fully engaged in exploiting the rich array of data currently available, nor is it adequately poised to capitalize on the forthcoming data explosion.” Akil et al., Science, Feb 11, 2011
  • 3. • In that same issue of Science – Asked peer reviewers from last year about the availability and use of data • About half of those polled store their data only in their laboratories, not an ideal long-term solution. • Many bemoaned the lack of common metadata and archives as a main impediment to using and storing data, and most of the respondents have no funding to support archiving • And even where accessible, much data in many fields is too poorly organized to enable it to be efficiently used. “...it is a growing challenge to ensure that data produced during the course of reported research are appropriately described, standardized, archived, and available to all.” Lead Science editorial, 2011
  • 4. Neuroscience is unlikely to be served by a few large databases like the genomics and proteomics community. Whole brain data (20 µm microscopic MRI); mosaic LM images (1 GB+); conventional LM images; individual cell morphologies; EM volumes & reconstructions; solved molecular structures. No single technology serves these all equally well. Multiple data types; multiple scales; multiple databases
  • 6. • Current web is designed to share documents – Documents are unstructured data • Much of the content of digital resources is part of the “hidden web” • Wikipedia: The Deep Web (also called Deepnet, the invisible Web, DarkNet, Undernet or the hidden Web) refers to World Wide Web content that is not part of the Surface Web, which is indexed by standard search engines.
  • 7. • NIF has developed a production technology platform for researchers to: – Discover – Share – Analyze – Integrate neuroscience-relevant information • Since 2008, NIF has assembled the largest searchable catalog of neuroscience data and resources on the web • Cost-effective and innovative strategy for managing data assets “This unique data depository serves as a model for other Web sites to provide research data.” - Choice Reviews Online. NIF is poised to capitalize on the new tools and emphasis on big data and open science
  • 8. http://neuinfo.org June 10, 2013, dkCOIN Investigator's Retreat • A portal for finding and using neuroscience resources • A consistent framework for describing resources • Provides simultaneous search of multiple types of information, organized by category • Supported by an expansive ontology for neuroscience • Utilizes advanced technologies to search the “hidden web” UCSD, Yale, Cal Tech, George Mason, Washington Univ. Literature; Database Federation; Registry
  • 9. • NIF Registry: A catalog of neuroscience-relevant resources • > 6000 currently listed • > 2200 databases • And we are finding more every day. “Of relevance to neuroscience” is very broad
  • 10. • NIF curators • Nomination by the community • Semi-automated text mining pipelines • NIF Registry – Requires no special skills – Site map available for local hosting • NIF Data Federation – DISCO interop – Requires some programming skill. Low barrier to entry
  • 11. Simple metadata model: name, description, type, URL, other names, keywords, unique identifier • Extended over time – Parent resource – Supporting agency – Grant numbers – Accessibility – Related to – Organism – Disease or condition – Last updated. Timeline: first catalog, SFN Neuroscience Database Gateway (~2003); NIF 0.5 (2006); NIF 1.0+ (2008)
  • 12. • NIF Registry is hosted on the Semantic MediaWiki platform (Neurolex) – Community can add, review, edit without special privileges – Searchable by Google – Integrated with NIF ontologies – Graph structure. Semantic wiki: A wiki with semantics; pages are linked through relationships
  • 13. NIF is creating the linked data graph of resources
  • 14. – NIF employs an automated link checker – Last analysis: 478/6100 invalid URLs (~8%) – 199 could not be located at another university or location and are considered out of service (~3%) – Bigger issue: Many resources are no longer updated or maintained [Chart: resources added and resources last updated, by year, 1996 to 2014]
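The rounded percentages on the link-checker slide can be reproduced from the raw counts it quotes. This is only a minimal arithmetic sketch; the variable and function names are illustrative and are not taken from NIF's link checker.

```python
# Counts quoted on the slide for the NIF Registry link-check analysis.
total_resources = 6100
invalid_urls = 478
out_of_service = 199


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


invalid_pct = percent(invalid_urls, total_resources)
out_of_service_pct = percent(out_of_service, total_resources)

print(f"invalid URLs: {invalid_pct}%")
print(f"out of service: {out_of_service_pct}%")
```

Both values round to the whole-number figures shown on the slide (~8% and ~3%).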
  • 15. Keeping content up to date. Connectome; Tractography; Epigenetics • New tags come into existence • New resource types come into existence, e.g., Mobile apps • Resources add new types of content • Change name • Change scope • > 7000 updates to the registry last year. It's a challenge to keep the registry up to date; sitemaps, curation, ontologies, community review
  • 16. • The NIF Registry has created a linked data graph of web-accessible resources • Maintained on a community wiki platform • Provides data on the fluidity of the resource landscape – New resources continue to be created and found – Relatively few disappear altogether – Many more grow stale, although their value may still be significant – Maintaining up to date curation requires frequent updating. NIF Registry provides insight into the state of digital resources on the web
  • 17. • The NIF data federation performs deep search over the content of over 200 databases • New databases are added at a rate of 25-40 per year • Latest update: Open Source Brain; ingest completed in 2 hours • Databases chosen on a variety of criteria: • Early: testing different types of resources • Thematic areas • Volunteers. NIF provides access to the largest aggregation of neuroscience-relevant information on the web
  • 18. • NIF was one of the first projects to attempt data integration in the neurosciences on a large scale • NIF is supported by a contract that specified the number of resources to be added per year – Designed to be populated rapidly; set up process for progressive refinement – No budget was allocated to retrofit existing resources; had to work with them in their current state – We designed a system that required little to no cooperation or work from providers – Supports many formats: relational, XML, RDF
  • 19. DISCO Dashboard Functions (current and planned) • Ingest Script Manager • Public Script Repository • Data & Event Tracker • Versioning System • Curator Tool • Data Transformer Manager. Luis Marenco, Rixin Wang, Perry Miller, Gordon Shepherd, Yale University
  • 20. [Chart: number of federated databases and number of federated records (millions) over time; series labeled DISCO] NIF searches the largest collation of neuroscience-relevant data on the web
  • 21. Results categorized by data type and level of nervous system
  • 22. Hippocampus OR “Cornu Ammonis” OR “Ammon's horn”. Query expansion: synonyms and related concepts. Boolean queries. Data sources categorized by “data type” and level of nervous system. Common views across multiple sources. Tutorials for using full resource when getting there from NIF. Link back to record in original source
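The synonym-based query expansion on this slide can be sketched as follows. This is an illustration only: the `SYNONYMS` table and the `expand_query` function are hypothetical stand-ins for a lookup against NIF's ontologies, and do not reflect NIF's actual search implementation.

```python
# Hypothetical synonym table; in NIF the synonyms would come from its ontologies.
SYNONYMS = {
    "hippocampus": ["Cornu Ammonis", "Ammon's horn"],
}


def quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are matched as phrases."""
    return f'"{term}"' if " " in term else term


def expand_query(term: str) -> str:
    """Join a term and its known synonyms into a single Boolean OR query."""
    terms = [term] + SYNONYMS.get(term.lower(), [])
    return " OR ".join(quote(t) for t in terms)


print(expand_query("Hippocampus"))
```

A term with no entry in the table is returned unchanged, so the expansion step is safe to apply to every query.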
  • 23. Connects to; Synapsed with; Synapsed by; Input region; Innervates; Axon innervates; Projects to; Cellular contact; Subcellular contact; Source site; Target site. Each resource implements a different, though related model; systems are complex and difficult to learn, in many cases
  • 24. • NIF Connectivity: 7 databases containing connectivity primary data or claims from literature on connectivity between brain regions • Brain Architecture Management System (rodent) • Temporal lobe.com (rodent) • Connectome Wiki (human) • Brain Maps (various) • CoCoMac (primate cortex) • UCLA Multimodal database (Human fMRI) • Avian Brain Connectivity Database (Bird) • Total: 1800 unique brain terms (excluding Avian) • Number of exact terms used in > 1 database: 42 • Number of synonym matches: 99 • Number of 1st order partonomy matches: 385
  • 25. – You (and the machine) have to be able to find it • Accessible through the web • Annotations – You have to be able to access and use it • Data type specified and in a usable form – You have to know what the data mean • Some semantics: “1” • Context: Experimental metadata • Provenance: Where did the data come from? Reporting neuroscience data within a consistent framework helps enormously
  • 26. Knowledge in space and spatial relationships (the “where”). Knowledge in words, terminologies and logical relationships (the “what”)
  • 27. • NIF covers multiple structural scales and domains of relevance to neuroscience • Aggregate of community ontologies with some extensions for neuroscience, e.g., Gene Ontology, ChEBI, Protein Ontology. NIFSTD: Organism; NS Function; Molecule; Investigation; Subcellular structure; Macromolecule; Gene; Molecule Descriptors; Techniques; Reagent; Protocols; Cell; Resource; Instrument; Dysfunction; Quality; Anatomical Structure. NIF capitalizes on the growing set of community ontologies available in biomedical science
  • 28. Purkinje Cell; Axon Terminal; Axon; Dendritic Tree; Dendritic Spine; Dendrite; Cell body; Cerebellar cortex. There is little obvious connection between data sets taken at different scales using different microscopies without an explicit representation of the biological objects that the data represent
  • 29. Brain has a Cerebellum; Cerebellum has a Purkinje Cell Layer; Purkinje Cell Layer has a Purkinje cell; Purkinje cell is a neuron • Ontology: an explicit, formal representation of concepts and relationships among them within a particular domain that expresses human knowledge in a machine readable form – Branch of philosophy: a theory of what is – e.g., Gene ontologies • Provide universals for navigating across different data sources – Semantic “index” • Provide the basis for concept-based queries to probe and mine data – Perform reasoning – Link data through relationships not just one-to-one mappings
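The idea of linking data through relationships rather than one-to-one mappings can be illustrated with the small Brain-to-Purkinje-cell chain from this slide. The sketch below is an assumption-laden toy: the pairing of each "has a" / "is a" label with its two entities is one reading of the slide diagram, the `TRIPLES` list and `containers` function are hypothetical, and the traversal assumes each structure has at most one containing structure.

```python
# Toy relationship triples (subject, relation, object), read off the slide diagram.
TRIPLES = [
    ("Brain", "has a", "Cerebellum"),
    ("Cerebellum", "has a", "Purkinje Cell Layer"),
    ("Purkinje Cell Layer", "has a", "Purkinje cell"),
    ("Purkinje cell", "is a", "neuron"),
]


def containers(entity: str) -> list[str]:
    """Follow 'has a' edges upward and list every structure that contains entity.

    Assumes at most one container per structure, which holds for this toy data.
    """
    found = []
    current = entity
    while True:
        parents = [s for s, rel, o in TRIPLES if rel == "has a" and o == current]
        if not parents:
            return found
        current = parents[0]
        found.append(current)


print(containers("Purkinje cell"))
```

With relationships stored this way, a query for data about the cerebellum could also retrieve records annotated only with "Purkinje cell", which a flat keyword match would miss.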
  • 30. “Search computing”: What genes are upregulated by drugs of abuse in the adult mouse? Morphine; Increased expression; Adult Mouse. Some concepts, e.g., age category, are quantitative but still must be interpreted in a global query system
  • 33. http://neurolex.org Stephen Larson • Provide a simple interface for defining the concepts required • Light weight semantics • Good teaching tool for learning about semantic integration and the benefits of a consistent semantic framework • Community based: • Anyone can contribute their terms, concepts, things • Anyone can edit • Anyone can link • Accessible: searched by Google • Growing into a significant knowledge base for neuroscience. Demo D03; 200,000 edits; 150 contributors
  • 34. • NIF can be used to survey the data landscape • Analysis of NIF shows multiple databases with similar scope and content • Many contain partially overlapping data • Data “flows” from one resource to the next – Data is reinterpreted, reanalyzed or added to • Is duplication good or bad?
  • 35. Databases come in many shapes and sizes • Primary data: – Data available for reanalysis, e.g., microarray data sets from GEO; brain images from XNAT; microscopic images (CCDB/CIL) • Secondary data – Data features extracted through data processing and sometimes normalization, e.g., brain structure volumes (IBVD), gene expression levels (Allen Brain Atlas); brain connectivity statements (BAMS) • Tertiary data – Claims and assertions about the meaning of data • E.g., gene upregulation/downregulation, brain activation as a function of task • Registries: – Metadata – Pointers to data sets or materials stored elsewhere • Data aggregators – Aggregate data of the same type from multiple sources, e.g., Cell Image Library, SUMSdb, Brede • Single source – Data acquired within a single context, e.g., Allen Brain Atlas. Researchers are producing a variety of information artifacts using a multitude of technologies
  • 36. NIF Analytics: The Neuroscience Landscape. NIF is in a unique position to answer questions about the neuroscience landscape. Where are the data? Striatum; Hypothalamus; Olfactory bulb; Cerebral cortex; Brain. Axes: brain region, data source. Vadim Astakhov, Kepler Workflow Engine
  • 37. Diseases of nervous system. Adding more semantics: The combination of ontologies, diverse data and analytics lets us look at the current landscape in interesting ways. Neurodegenerative; Seizure disorders; Neoplastic disease of nervous system. NIH Reporter; NIF data federated sources
  • 38. • Gemma: Gene ID + Gene Symbol • DRG: Gene name + Probe ID • Gemma presented results relative to baseline chronic morphine; DRG with respect to saline, so direction of change is opposite in the 2 databases • Analysis: • 1370 statements from Gemma regarding gene expression as a function of chronic morphine • 617 were consistent with DRG; over half of the claims of the paper were not confirmed in this analysis • Results for 1 gene were opposite in DRG and Gemma • 45 did not have enough information provided in the paper to make a judgment. Relatively simple standards would make life easier
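The "over half not confirmed" statement follows directly from the counts on this slide, and the baseline mismatch between the two databases means a direction must be flipped before two statements can be compared. The sketch below shows both steps under stated assumptions: the `agrees` function and the "up"/"down" encoding are hypothetical and are not the procedure used in the NIF analysis.

```python
# Counts quoted on the slide.
gemma_statements = 1370
consistent_with_drg = 617

confirmed = consistent_with_drg / gemma_statements
not_confirmed = 1 - confirmed
print(f"consistent with DRG: {confirmed:.1%}")
print(f"not confirmed: {not_confirmed:.1%}")

# Gemma reports change relative to chronic morphine, DRG relative to saline,
# so the same biological effect appears with opposite direction in the two.
OPPOSITE = {"up": "down", "down": "up"}


def agrees(gemma_direction: str, drg_direction: str) -> bool:
    """Compare two statements after flipping Gemma's direction to DRG's baseline."""
    return OPPOSITE[gemma_direction] == drg_direction


print(agrees("up", "down"))
```

Recording the baseline condition alongside each expression statement would make this flip explicit rather than something each reader has to rediscover.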
  • 39. NIF favors a hybrid, tiered, federated system • Domain knowledge – Ontologies • Claims, models and observations – Virtuoso RDF triples – Model repositories • Data – Data federation – Spatial data – Workflows • Narrative – Full text access. Neuron; Brain part; Disease; Organism; Gene; Technique; People. Caudate projects to SNpc; Grm1 is upregulated in chronic cocaine; Betz cells degenerate in ALS. NIF provides the tentacles that connect the pieces: a new type of entity for 21st century science
  • 40. • 2006-2008: A survey of what was out there • 2008-2009: Strategy for resource discovery – NIF Registry vs NIF data federation – Ingestion of data contained within different technology platforms, e.g., XML vs relational vs RDF – Effective search across semantically diverse sources • NIFSTD ontologies • 2009-2011: Strategy for data integration – Unified views across common sources – Mapping of content to NIF vocabularies • 2011-present: Data analytics – Uniform external data references • 2012-present: SciCrunch: unified biomedical resource services. NIF provides a strategy and set of tools applicable to all domains grappling with multiple sources of diverse data (i.e., pretty much everything)
  • 41. • Search semantics • Ranking • Resources supported by NIH Blueprint Institutes are more thoroughly covered • Data types, e.g., Brain activation foci
  • 42. SciCrunch; NIF; MONARCH; Community Services; dkCOIN; Shared Resources; Undiagnosed Disease Program; Phenotype RCN; 3D Virtual Cell; National Institute on Aging; One Mind for Research; BIRN; International Neuroinformatics Coordinating Facility; Model Organism Databases; Community Outreach; DELSA. (Not just a data catalog)
  • 43. • 3dVC: Focus on models and simulation • Gene Ontology: Focus on bioinformatics tools • National Institute on Aging: Aging-related data sets • Monarch: Phenotype-Genotype; deep semantic data integration • One Mind for Research: Biospecimen repositories • NeuroGateway: Computational resources • FORCE11: Tools for next-gen publishing and e-scholarship. SciCrunch is actively supporting multiple communities; multiple communities are enriching and improving SciCrunch
  • 44. Community database: beginning; Community database: end. “How do I share my data/tool?” “There is no database for my data”. Institutional repositories; Cloud; INCF: Global infrastructure; Government; Education; Industry; Tool repositories. NIF is designed to leverage existing investments in resources and infrastructure
  • 45. • No one can be stopped from doing what they need to do • Every resource is resource limited: few have enough time, money, staff or expertise required to do everything they would like – If the market can support 11 MRI databases, fine – Some consolidation, coordination is warranted though • Big, broad and messy beats small, narrow and neat – Without trying to integrate a lot of data, we will not know what needs to be done – A lot can be done with messy data; neatness helps though – Progressive refinement; addition of complexity through layers • Be flexible and opportunistic – A single optimal technology/container for all types of scientific data and information does not exist; technology is changing • Think globally; act locally: – No source, not even NIF, is THE source; we are all a source
  • 46. • Several powerful trends should change the way we think about our data: One → Many – Many data • Generation of data is getting easier → shared data • Data space is getting richer: more -omes every day • But...compared to the biological space, still sparse – Many eyes • Wisdom of crowds • More than one way to interpret data – Many algorithms • Not a single way to analyze data – Many analytics • “Signatures” in data may not be directly related to the question for which they were acquired but tell us something really interesting. Are you exposing or burying your work?
  • 47. Jeff Grethe, UCSD, Co Investigator, Interim PI; Amarnath Gupta, UCSD, Co Investigator; Anita Bandrowski, NIF Project Leader; Gordon Shepherd, Yale University; Perry Miller; Luis Marenco; Rixin Wang; David Van Essen, Washington University; Erin Reid; Paul Sternberg, Cal Tech; Arun Rangarajan; Hans Michael Muller; Yuling Li; Giorgio Ascoli, George Mason University; Sridevi Polavarum; Fahim Imam; Larry Lui; Andrea Arnaud Stagg; Jonathan Cachat; Jennifer Lawrence; Svetlana Sulima; Davis Banks; Vadim Astakhov; Xufei Qian; Chris Condit; Mark Ellisman; Stephen Larson; Willie Wong; Tim Clark, Harvard University; Paolo Ciccarese; Karen Skinner, NIH, Program Officer (retired); Jonathan Pollock, NIH, Program Officer. And my colleagues in Monarch, dkNet, 3DVC, Force 11