2. Measurements of health and productivity
Biological sequencing, chemical
characterization, yield / growth / weight, climate
data, structured & unstructured
Unifying heterogeneous
datasets
Moving beyond the Big Data craze:
3. Background
Purdue University, BSME,
Mechanical Engineering
Purdue University, MS,
Environmental Engineering
(Sustainability)
University of Iowa, PhD,
Environmental Engineering
(Microbiology/Bioremediation)
Michigan State University
NSF Postdoc Math and Biology Fellow (cross-training)
Computational Biologist
Microbiology / Microbial Ecology
4. GERMS Lab (Genomics &
Environmental Research in Microbial
Systems)
Jin Choi, PhD, University of Tennessee, ChemE
Ryan Williams, PhD, Iowa State, Ecology Evolution
Dan Shea, MS, Northeastern, Bioinforma
Website: germslab.org
5. GERMS Mission
We are changing the
environment that we live in.
To preserve our
environmental integrity, we
must understand and
manage the impacts of global
change.
Scientific research must
inform our decisions and
policy.
Therefore, we use innovative
scientific methods to evaluate
and understand our complex
6. Towards this Mission: Microbes as lens into
understanding global change in the natural
world
MICROBES
IN
ECOSYSTEMS
NATURE
AIR
WATER
SOIL
MICROBIOMES
HUMANS/ANIMAL
ENGINEERED
BIOREACTORS
WASTEWATER
7. GERMS Vision (5 year goals)
To provide scalable, quantitative tools to
monitor microbial responses in complex
environments
To identify the microbial drivers responding to
global change in complex environments (e.g.,
soils, waters, gut)
To predict and model the impacts of
microbial responses on ecosystem health and
services
To monitor, evaluate, and manage our microbial
partners and their services.
9. Scalable, quantitative tools to monitor microbial
responses in complex environments
Columns: Data type; Example; Cost per sample / Frequency of sampling; Precision / Water quality information; Challenges

Data type: Water properties
Example: chemical analysis of water quality
Challenges: narrow range of information about services in ecosystem

Data type: Traditional integrity indicators
Example: presence of coliform bacteria
Challenges: detection methods lack specificity and are often imprecise

Data type: Phytoplankton community characterization
Example: cyanotoxin detection through fractionation of ammonia
Challenges: detection of toxicity may not reveal source

Data type: Microbial community characterization (16S rRNA)
Example: abundance of genes present and associated with all cyanobacteria
Challenges: characterization of microbial community structure may not reveal gene function; data volume large for public understanding

Data type: Proposed MAVeRiC genes (DNA)
Example: abundance of genes present associated with specific source of pollution
Challenges: identifying relevant genes of interest to water quality; DNA reveals genes present but not necessarily actively expressed

Data type: Proposed MAVeRiC genes (RNA)
Example: abundance of genes expressed and present associated with specific source of pollution
Challenges: identifying relevant genes of interest to water quality
10. Scalable, quantitative tools to monitor microbial
responses in complex environments
Estimating risks from
pathogens
Biotic integrity of a healthy
water system
Sources of non-point
pollution
Role of waters in stabilizing
climate change
Microbial genetic biomarkers can capture…
11. MicroArray Value and Risk Chip
(MAVeRIC)
$24 for 216 bioindicators/sample, estimates gene abundance of biological signals,
quantitative PCR
Pollution biomarkers: non-point pollution source markers
Pathogen biomarkers: specific bacteria or virus genes
Nutrient cycling biomarkers: carbon, nitrogen, phosphorus metabolic genes
Toxicity biomarkers
Biodiversity biomarkers
Monitoring, Evaluating, Predicting
12. Scaling: Iowa Lake Waters (John
Downing and Chris Filstrup)
Integrate measurements of bioindicators with
water quality measurements in 132 lakes
sampled for a routine EPA-reported, lake water
quality assessment program.
Interdisciplinary
collaboration allowing for
evaluation and prediction
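As a rough back-of-the-envelope check on what this scaling implies, the sketch below combines the two figures quoted on the slides ($24 for 216 bioindicators per sample, and 132 lakes). It assumes one sample per lake per sampling pass, which the slides do not state; the function names and the `samples_per_site` parameter are hypothetical and only for illustration.

```python
# Figures quoted on the slides.
COST_PER_SAMPLE_USD = 24.0        # "$24 for 216 bioindicators/sample"
BIOINDICATORS_PER_SAMPLE = 216
LAKES = 132                       # lakes in the Iowa water quality program


def cost_per_bioindicator(cost_per_sample, n_bioindicators):
    """Average cost of a single bioindicator measurement within one sample."""
    return cost_per_sample / n_bioindicators


def survey_cost(cost_per_sample, n_sites, samples_per_site=1):
    """Total chip cost for one sampling pass.

    samples_per_site=1 is an assumption, not a number from the slides.
    """
    return cost_per_sample * n_sites * samples_per_site


per_indicator = cost_per_bioindicator(COST_PER_SAMPLE_USD, BIOINDICATORS_PER_SAMPLE)
one_pass = survey_cost(COST_PER_SAMPLE_USD, LAKES)
measurements = BIOINDICATORS_PER_SAMPLE * LAKES

print(f"Cost per bioindicator: ${per_indicator:.2f}")
print(f"One pass over {LAKES} lakes: ${one_pass:,.0f} for {measurements:,} measurements")
```

Under these assumptions, each bioindicator costs roughly 11 cents, and a single pass over all 132 lakes costs about $3,168 in chip costs alone. Labor, sampling, and sequencing overheads are not included.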
21. How do our microbial partners in our bodies
affect our stability and resilience to change?
Collaboration with ANL and University of Chicago (Eugene Chang and Daina Rin
We have the
same genes, but
why are you a
rounder?
22. A fascination with viruses
Despite its ferocity in humans, Ebola is a life-form of mysterious
simplicity. ... If it were the size of a piece of spaghetti, then a
human hair would be about twelve feet in diameter and would
resemble the trunk of a giant redwood tree. (Michael Specter,
New Yorker)
80% unknown
23. Concluding thoughts
All my projects depend heavily on
collaborations
Unifying heterogeneous datasets – improved
resolutions, investigating diverse questions
Biological data: Rapid, high resolution, cheap
Effective integrations are POISED FOR
IMPACT.
Looking forward to the adventure together!
Editor's Notes
Thank everyone for being so welcoming! And especially Lisa, Michelle, Steve, Susana, Sylvia for all their help and dealing with my pestering. Today, I’ll be giving a brief overview of some of the research that I’m excited to be tackling in the next 5 years. It’s a real honor to be in this department and I’m confident that I’m going to have a lot of fun here.
To begin with, it’s hard for me to be labeled as the Big Data faculty hire. If anything, my goal here is defined not by the size of the data I use, but by the way I use very different kinds of data to answer interesting and impactful questions. The types of data I work with vary broadly, but my experience lies mainly in leveraging advances in biological sequencing, and its growing accessibility, to describe natural systems.
I’ll start with a brief background. I have what I think is a fairly atypical background: I started out working on efficient combustion technologies in Mechanical Engineering, then moved towards sustainable design in the built environment for my Master’s. I ended up taking a microbiology class that really awakened something in me, and went on to study how bacteria can eat pollutants in groundwater and sediments at the University of Iowa. I continued working with the services microbes provide in the environment during my postdoc. At the time, there was a real need for microbiologists to learn to work with new sequencing technologies and the “big data” coming out of them, so I spent a long postdoc re-training in this field and ended up being recruited to Argonne National Lab for a stint. I really wanted to do more research and train students, so I applied here, and here I am.
So I’m excited to say that I will get to work with students who will help me with many diverse projects. I’ve started what I’ve called the GERMS lab. I like this name a lot because, historically, “germs” has been a word for naughty bacteria, but today we talk a lot about good germs in our bodies, and I want to promote positivity in association with the well-behaved and even necessary bacteria surrounding us. I’ll be joined next week by two postdocs, Jin and Ryan. So if you see these two clowns in the hall, make sure you say hi and ask them if they’ve heard what a rough boss I am. In May, Dan will be joining us as a PhD student. The three of them make a neat team: a microbiologist, a statistician, and a programmer, all interested in the questions I’ll be presenting today.
Together, as a lab, I hope that we can hold true to this mission statement.
More specifically, towards this mission, my research involves understanding microbial communities in the natural environment in the face of global change. Great field. Microbes are everywhere, and impact nearly every environment in our lives – even inside our bodies.
I’m going to give you a brief overview of 3 projects in 3 environments that are working towards this vision.
The valuation of waters of the United States, under the Clean Water Act (CWA), has been approached very narrowly. Historically and currently, we are fairly limited in what we measure towards understanding water quality. Simple measurements of these indicators can answer public health questions such as ‘should I drink this water?’ or ‘should I swim at this beach?’. However, these measurements are not useful when there is evidence of chronic contamination and fecal pollution sources need to be identified to address the problem. I’ll argue today that there are values that have been impossible to evaluate due to a lack of appropriate technology: estimating risks of illness from toxins or pathogens; the biotic integrity of a healthy water system; the sources of non-point pollution; and the role of waters in stabilizing global problems such as climate change.
As agriculture increases on our planet, we are very concerned about how land management will impact our environment, especially in the face of a warming planet. Microbes in this soil are critical drivers of carbon cycling and decomposition in these systems, and we’re very interested in their response to global change. The challenge is that these microbes are living in what is arguably the most complex environment on our planet, and it is very hard to tease out who the most critical drivers are.
Why do we need such a volume of data?
Most of us now recognize that microbial communities generally exhibit a high level of diversity, much higher than previously assumed from what was revealed by classical microscopy and basic culturing techniques.
In soil, even in one gram of soil, there are estimated to be more microbial species than there are stars in the galaxy. We have far to go for any comprehensive characterization of any single soil community. A key question, then, is why soil diversity is so high.
One reason may be that the soil structure provides unique niches that offer a high diversity of food resources.
Its varied structure provides stable, protective, and even ancient environments for microorganisms.
Soil investigations are further complicated by the primarily dormant state of the large majority of the soil microbial population. The turnover rate of soil microbes is predicted to be over 30-fold, and even up to 300-fold, slower than that of microbes in the oceans.
And these microbes live with relatively unpredictable patterns of perturbation, for example rainfall or leaf litter introduction. They also undergo defined temporal perturbations, such as diurnal energy input.
This complexity in the soil has formed a dynamic microbial ecosystem which interacts with nutrients, plants, and the soil structure itself at multiple scales.
I would argue that we as a field are still trying to find tractable methods of accessing these interactions and understanding the drivers of “healthy” or “productive” soils.
Despite its ferocity in humans, Ebola is a life-form of mysterious simplicity. A particle of Ebola is made of only six structural proteins, locked together to become an object that resembles a strand of cooked spaghetti. An Ebola particle is only around eighty nanometres wide and a thousand nanometres long. If it were the size of a piece of spaghetti, then a human hair would be about twelve feet in diameter and would resemble the trunk of a giant redwood tree.
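As a rough sanity check on the scale in this quote, the minimal sketch below works the analogy backwards. The particle dimensions and the twelve-foot figure come from the quote; the 2 mm spaghetti diameter is my own assumption, and the variable names are hypothetical.

```python
# Values taken from the quote.
EBOLA_WIDTH_M = 80e-9       # "around eighty nanometres wide"
EBOLA_LENGTH_M = 1000e-9    # "a thousand nanometres long"
SCALED_HAIR_FT = 12.0       # "about twelve feet in diameter"

# Assumption (not in the quote): a strand of spaghetti is about 2 mm across.
SPAGHETTI_WIDTH_M = 2e-3

FT_TO_M = 0.3048            # exact definition of the international foot

# Magnification needed to make the particle as wide as the spaghetti.
scale = SPAGHETTI_WIDTH_M / EBOLA_WIDTH_M

# Real hair diameter implied by a twelve-foot scaled hair, in micrometres.
implied_hair_um = SCALED_HAIR_FT * FT_TO_M / scale * 1e6

# Length of the scaled-up particle, in millimetres.
scaled_length_mm = EBOLA_LENGTH_M * scale * 1e3

print(f"Scale factor: {scale:,.0f}x")
print(f"Implied real hair diameter: {implied_hair_um:.0f} micrometres")
print(f"Scaled particle length: {scaled_length_mm:.0f} mm")
```

With a 2 mm strand, the magnification is about 25,000x, and a twelve-foot scaled hair corresponds to a real hair of roughly 146 micrometres, which is plausible for a thick human hair. A thinner strand of spaghetti would imply a proportionally thicker hair.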