Presentation by Ola Spjuth (Uppsala University and Scaleout) at the Chemical Biology Seminar Series, February 6th, at Karolinska Institutet and Science for Life Laboratory, Stockholm, Sweden.
ABSTRACT
Phenotypic profiling of cells with high-content imaging is emerging as an important methodology with high predictive power. The true power of these methods comes when they are integrated into automated, robotized systems that run continuously and are not restricted to batch analysis. One of the main challenges then becomes how to manage and continuously analyze the large amounts of data produced. In this talk I will present our efforts to establish an automated lab for cell profiling of drugs using multiplexed fluorescence imaging (Cell Painting). I will describe our computational and lab infrastructure as well as the systems, tools and methods we are developing to sustain continuous profiling of cells and continuous AI modeling. A key objective of the group is to improve screening and toxicity assessment, but also to explore predictions of mechanisms and pathways. The long-term goal is to build a closed-loop system where results from analyses are used by an AI system to design the next round of experiments and iteratively improve the confidence in predictions. Research website: https://pharmb.io
Towards automated phenotypic cell profiling with high-content imaging
1. Towards automated phenotypic cell profiling with high-content imaging
Ola Spjuth
Department of Pharmaceutical Biosciences, Uppsala University
Scaleout Systems AB
5. Use of live/temporal profiling will increase
Data velocity will increase
Automated, continuous analytics and AI will be needed
7. Who are we?
• Academic research group at Uppsala University
• Background in computational pharmacology (data science, AI/ML)
• Good at e-infrastructure, big data (data engineering)
• Setting up a high-content imaging lab for cell profiling
Research group website: http://pharmb.io
8. Research objective
Accelerate drug discovery using AI, automation and intelligent design of experiments
• Predict safety concerns
• Explain drug mechanisms
• Screen for new drugs
9. Traditional hypothesis testing
Diagram: Hypothesis, Experiments, Analysis and interpretation, Insight, revise
• Iterative
• Flexible
• Mostly manual
• Slow
Predictive modeling
Diagram: Data generation, Database, Modeling and prediction, Model, Prediction
• Retrospective analysis
• Hopefully predictive
• Expensive
• Limited for hypothesis testing
10. Data-driven science
Diagram: Data, Hypothesis, Scientist, Model, Insights
Traditional processing: historical fact finding (Data, Repository, Query, request, response)
• Find and analyze information stored on disk
• Batch paradigm, pull model
• Query-driven: submits queries to static data
Stream processing: current fact finding (Data, Real-Time Analytics, Results)
• Analyze data in motion – before it is stored
• Low latency paradigm, push model
• Data driven: bring data to the analytics
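To make the contrast between the pull model and the push model concrete, here is a minimal sketch in Python. It assumes a hypothetical stream of per-image feature values; the names `RunningStats`, `batch_mean` and `stream` are illustrative and not part of any tool mentioned in the talk. The push side uses Welford's online algorithm, one common way to update statistics as each value arrives without storing the values first.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Welford's online algorithm: mean and variance without storing samples."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def push(self, x: float) -> None:
        """Push model: update the statistics as each value arrives."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance (0.0 until at least two values have been seen)."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


def batch_mean(stored: list) -> float:
    """Pull model: query data that is already at rest."""
    return sum(stored) / len(stored)


# Hypothetical feature values, assumed to arrive one by one from the microscope.
stream = [0.8, 1.1, 0.9, 1.4, 1.0]
stats = RunningStats()
for value in stream:
    stats.push(value)  # analytics run while the data is in motion
```

Both approaches give the same mean here; the difference is that the streaming version never needs the full list in memory or on disk.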
11. Considerations for “the next experiment”
• Quality over Quantity: Better data is often more useful than simply more data
• Data collection may be expensive
• Cost of time and materials for an experiment
• Cheap vs. expensive data
• Raw images vs. annotated images
• Want to collect best data at minimal cost
• Can machines (AI) learn with fewer training instances if they ask the right questions?
12. Intelligently designing experiments
• Plan experiment under constrained resources
• Vary factors and study response
• Seek optimal design
• DoE (Design of Experiments)
• Example: Select X combinations of drugs to test (cannot measure all combinations due to costs, time etc.)
“…DECREASE, an efficient machine learning model that requires only a limited set of pairwise dose–response measurements for accurate prediction of drug combination synergy in a given sample.”
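The "select X combinations" example can be sketched as follows. This is only a random-subset baseline under a fixed budget, not an optimal design and not the DECREASE method; the drug names, the `budget` parameter and both function names are hypothetical.

```python
import itertools
import random


def candidate_pairs(drugs):
    """Enumerate every unordered pairwise drug combination."""
    return list(itertools.combinations(drugs, 2))


def select_design(drugs, budget, seed=0):
    """Pick at most `budget` distinct pairs uniformly at random.

    A random baseline; a real DoE approach would choose pairs to optimize
    a design criterion instead of sampling blindly.
    """
    pairs = candidate_pairs(drugs)
    rng = random.Random(seed)  # seeded so the design is reproducible
    return rng.sample(pairs, min(budget, len(pairs)))


# Hypothetical compound identifiers.
drugs = ["A", "B", "C", "D", "E", "F"]
all_pairs = candidate_pairs(drugs)       # 6 drugs give 15 possible pairs
design = select_design(drugs, budget=5)  # only 5 can be afforded
```

The number of pairs grows quadratically with the number of drugs, which is why measuring everything quickly becomes infeasible.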
13. Active learning: Which experiment should be done next?
Diagram: passive learning (Nature, Data, Model) versus active learning (Nature, Model, Query, Response)
• Exploration: Could lead to better predictions in future
• Exploitation: Make best predictions given current data
• Tradeoff!
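A minimal sketch of this tradeoff, assuming a model that outputs a probability of activity for each untested compound. The compound names, the `explore_prob` parameter and all function names are hypothetical. Exploration is implemented here as uncertainty sampling (query the candidate the model is least sure about), exploitation as testing the candidate with the highest predicted activity.

```python
import random


def uncertainty(p):
    """Binary uncertainty: 1.0 at p = 0.5, 0.0 at p = 0 or p = 1."""
    return 1.0 - abs(2.0 * p - 1.0)


def most_uncertain(predictions):
    """Exploration: the candidate the model is least sure about."""
    return max(predictions, key=lambda c: uncertainty(predictions[c]))


def most_promising(predictions):
    """Exploitation: the candidate with the highest predicted activity."""
    return max(predictions, key=predictions.get)


def next_experiment(predictions, explore_prob, rng):
    """Explore with probability `explore_prob`, otherwise exploit."""
    if rng.random() < explore_prob:
        return most_uncertain(predictions)
    return most_promising(predictions)


# Hypothetical predicted probabilities of activity for untested compounds.
predictions = {"cmpd_1": 0.95, "cmpd_2": 0.52, "cmpd_3": 0.10}
rng = random.Random(0)
explore_choice = next_experiment(predictions, explore_prob=1.0, rng=rng)
exploit_choice = next_experiment(predictions, explore_prob=0.0, rng=rng)
```

Setting `explore_prob` between 0 and 1 mixes the two strategies; how to tune it depends on how expensive each experiment is.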
14. Our vision: Closed-loop (autonomous) experimentation
Diagram: Automation, Informatics, Experiments, Data, Continuous AI, Results, Intelligent design of experiments, Scientist, External data
16. Automation in life science
• Varying degrees of automation!
• Automated instrument: working with a microplate (or stack of microplates)
• Robot: Liquid handling robot
• Automated lab
• A set of instruments, each working with microplates
• A plate handling robot serving multiple instruments
17. Robot scientist
1. King, R. D. et al. Functional genomic hypothesis generation and experimentation by a robot scientist. Nature 427, 247–252 (2004)
2. King, R. D. et al. The Automation of Science. Science 324 (5923), 85–89 (2009)
3. Williams, K. et al. Cheaper faster drug development validated by the repositioning of drugs against neglected tropical diseases. Journal of the Royal Society Interface 12 (104), 20141289 (2015)
Adam (refs. 1, 2) is able to perform independent experiments to test hypotheses and interpret findings without human guidance:
• hypothesizing to explain observations
• devising experiments to test these hypotheses
• physically running the experiments using laboratory robotics
• interpreting the results from the experiments
• repeating the cycle as required
18. Let’s go back to cell profiling with high-content imaging!
20. Cell painting: Imaging with multiplexed dyes
Diagram: Genetic or chemical perturbations, Experiments in multi-well plates, Imaging, Features, Hypotheses, Convolutional Neural Network, Predictions
Bray et al. (2016). “Cell Painting, a High-Content Image-Based Assay for Morphological Profiling Using Multiplexed Fluorescent Dyes.” Nature Protocols 11 (9): 1757–74.
21. Holographic live cell imaging
• Quantitative phase-contrast microscopy
• Holographic phase-shift imaging
• Label-free, live cell imaging
• Used inside incubator
HoloMonitor system
22. Classify images into biological mechanisms
Image panels: Protein degradation, Cholesterol-lowering, DNA replication, Microtubule stabilizer, Actin disruptor, Kinase inhibitor
Kensert A, Harrison PJ, Spjuth O. Transfer learning with deep convolutional neural network for classifying cellular morphological changes. SLAS DISCOVERY: Advancing Life Sciences R&D. 24, 4 (2019)
25. Aim: Intelligent system for drug/chemical profiling
Diagram: External data, Data warehouse, AI Modeling, Make predictions using available data, Hypothesis (generate, test), Design new experiments, Automated lab, Carry out new experiments, Analysis pipeline, Manual wet lab, Verify using external protocol, Publish data and models
26. Fully automated cell painting
• Facilities, environmental control, ventilation
• Instruments and control software
• Automation system (dynamic scheduling)
• Lab protocol
• Compute and storage resources
• Analysis pipelines
27. Automating our cell-based lab
Fixed setup (version 1)
• ImageXpress XLS (Molecular Devices)
• Plate robot (Preciseflex)
• Plate incubator (Liconic), barcode reader
• BioMek 4000 liquid handling (Beckman Coulter)
• Green Button Go lab automation software (Biosero)
Observations:
• Quick to get up and running
• Suitable for fixed protocols
• Dependent on vendors to solve problems
• Not easy to expand or configure for us
Our priorities:
• Flexibility to expand/adapt
• Open source or good APIs
• Low cost, serviceable by us
• Configurable by us
30. Dealing with large scale data
• High volume, relatively high velocity
• Continuously process data, train models, serve models
• Embrace scalable virtual infrastructures (cloud) and microservices (containers)
Diagram: GPU cluster, CPU server, Storage, Cloud, HPC, Online processing
31. Automating our data processing
Diagram: Robotized lab (images); Online, intelligent processing (QC workflows, Interestingness models, Cell profiles); Hot storage; Cold storage; File system with Metadata and Files (images); ImageDB; Image viewer
ImageDB: https://github.com/pharmbio/imagedb
HASTE CORE and Cell Profiler Pipeline: https://github.com/HASTE-project/cellprofiler-pipeline
Avoid storing uninteresting data
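One way to picture an interestingness model in front of hot and cold storage is a simple score-based router. The sketch below is an assumption-laden illustration, not the HASTE or ImageDB API: the scores, file names, thresholds and the `route_image` function are all hypothetical.

```python
def route_image(score, keep_threshold=0.3, hot_threshold=0.7):
    """Map an interestingness score in [0, 1] to a storage decision.

    Thresholds are hypothetical; in practice they would be tuned against
    storage cost and the risk of discarding useful images.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if score >= hot_threshold:
        return "hot"      # keep on fast storage for immediate analysis
    if score >= keep_threshold:
        return "cold"     # archive on cheaper storage
    return "discard"      # avoid storing uninteresting data


# Hypothetical scores, assumed to come from an upstream interestingness model.
scores = {"well_A01.tif": 0.92, "well_A02.tif": 0.45, "well_A03.tif": 0.05}
routing = {name: route_image(score) for name, score in scores.items()}
```

The decision is made per image as it arrives, so nothing needs to be written to disk before it has been scored.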
32. Empowering our data scientists
Diagram: Robotized lab; ImageDB; File system with Metadata and Files (images); CPU/GPU/HPC cloud; Notebooks; Data; Models; Data scientists; Services; Public services; Publish; External users
33. Data is not static!
• Public databases
• Batch/continuous updates
• In-house data
• Batch/continuous updates
Need to continuously re-train models.
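A minimal sketch of one possible re-training policy, assuming that updates arrive as record counts from public or in-house sources. The class name, the threshold and the update sizes are hypothetical; a real system would launch an actual training job where this sketch only bumps a version number.

```python
from dataclasses import dataclass


@dataclass
class RetrainPolicy:
    """Trigger a re-train once enough new records have accumulated."""
    min_new_records: int
    pending: int = 0
    model_version: int = 0

    def ingest(self, n_records: int) -> bool:
        """Register a batch or stream increment; return True if a re-train is due."""
        self.pending += n_records
        return self.pending >= self.min_new_records

    def mark_retrained(self) -> None:
        """Call after a successful re-train: bump the version, reset the counter."""
        self.model_version += 1
        self.pending = 0


policy = RetrainPolicy(min_new_records=1000)
updates = [400, 300, 500, 200]  # hypothetical sizes of incoming data updates
for n in updates:
    if policy.ingest(n):
        # Placeholder for launching training; this sketch only records it.
        policy.mark_retrained()
```

Other triggers, such as a fixed schedule or a drop in monitored model performance, would fit the same interface.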
34. AI modeling life cycle
Stages: Model Development, Package & Deploy Models, Model Serving
Components: ML studio, ML workflow automation, Model management, Model serving, Monitoring
Steps: Explore Data and Develop Models, Train at scale, Register Model and Metadata for Serving, Package and Publish, Run in operations, Monitor, Logging, Integrate
Roles and handoffs: Data scientist, Data Engineer, Promote Model, Ship Model
In collaboration with:
https://github.com/leanaiorg/leanaistack
Lean AI Stack
36. Relevant software we develop (and others could use)
• Virtual Infrastructure with Kubernetes (IaaC)
• Portable, scalable, resilient
• ImageDB (projects, images, results etc.)
• ImageViewer
• Batch and Continuous Cell Profiler pipelines
• Deep Learning notebooks
• Open source lab automation system (in progress)
• Design cell-based experiments (in progress)
• Construct steering protocols for robotized lab
• Compound annotation project (to be started)
Screenshots: ImageDB, Image viewer
https://github.com/pharmbio
37. Integrate with our other AI services
Site-of-metabolism and reaction types: https://metpred.service.pharmb.io/draw/
Target (safety) profiles: http://ptp.service.pharmb.io/
38. Data-driven science
Diagram: Data, Hypothesis, Scientist, Insights
Traditional processing: historical fact finding (Data, Repository, Query, request, response)
• Find and analyze information stored on disk
• Batch paradigm, pull model
• Query-driven: submits queries to static data
Stream processing: current fact finding (Data, Real-Time Analytics, Results)
• Analyze data in motion – before it is stored
• Low latency paradigm, push model
• Data driven: bring data to the analytics
39. Some ongoing projects
• Cell painting on combinations of 2-3 environmental compounds
• Cell painting on 380 kinase inhibitors on U2OS and MCF7 cell lines
• Cell painting and holographic imaging of 120 GPCR drugs
• Exploring dynamics of drug delivery using LNPs via imaging
• Deep Learning (CNN, RNN) and Cell Profiler features
• Comparing Cell Morphology with Gene Expression (public data)
• Several other projects in the pipeline…
40. We believe in Open Science
• All source code (software, notebooks) published online:
https://github.com/pharmbio
• Protocols published online
• https://protocol-delivery.protocols.opentrons.com/protocol/1494-uppsala-university
• All data will be made available online
41. Collaborations and funding
IT-dept/UU
Andreas Hellander
Salman Toor
Carolina Wählby
Ida-Maria Sintorn
MedSci/UU
Kim Kultima
Stephanie Herman
Payam Emami
NGI/UGC
Adam Ameur
UUH/Clinical Genetics
Lucia Cavelier
AstraZeneca/Stena Line
Lars Carlsson
Ernst Ahlberg
Prosilico AB
Urban Fagerholm
Sven Hellberg
Karolinska Institutet/MEB
Juni Palmgren
Martin Eklund
Jordi Carreras Puigvert
Karolinska Institutet/IMM
Roland Grafström
Pekka Kohonen
SciLifeLab Data center
Johan Rung
Hanna Kultima
Funding:
Consortia and involvements:
42. - Thank you -
Email: ola.spjuth@farmbio.uu.se
Web: https://pharmb.io