Clustering without limits
1. NewsBytes
Aquaporin Simulations De-Bunk Gas Exchange Assumptions

Biologists have long taken gas exchange for granted, assuming that gases simply seep through the cell’s lipid membrane. Since 1998, however, evidence has been building that gases might also be exchanged through pores created by specialized proteins. Now molecular dynamics simulations of aquaporins have weighed in on the question. The result: “It’s now well established that these proteins can conduct gas molecules,” says Emad Tajkhorshid, PhD, co-author of the work and assistant professor of biochemistry, pharmacology and biophysics at the University of Illinois at Urbana-Champaign. But, he says, some uncertainty remains: “Whether or not it’s important in the human body, that’s the controversial part.” The work was published in the March 2007 issue of the Journal of Structural Biology.

Fifteen to twenty years ago, scientists believed that water permeation through lipid bilayers was enough for water transport into and out of cells. Gradually, though, researchers realized that some cells need to control water permeability, and other cells have lipid bilayers that aren’t very permeable to water. Aquaporins, it turned out, carry water in and out in a controllable fashion. “I think the same might be true for gas permeability,” says Tajkhorshid. “Gas permeability of a lipid bilayer is like an open free highway where everything can go through. With a protein, you can have a gating mechanism and some regulation.”

One of Tajkhorshid’s collaborators, Walter Boron, MD, PhD, professor of cellular and molecular physiology at Yale University, has been working on gas exchange experimentally for about ten years. To him, aquaporins are a likely suspect for gas conduction because they exist in places where oxygen must go in and carbon dioxide must come out. For example they are plentiful in cells that line the lung, in red blood cells, and in astrocytes (cells at the blood-brain barrier). But it’s very hard to measure small changes in oxygen concentration at the surface of a membrane experimentally.

So Tajkhorshid’s team pitched in with molecular dynamics simulations. Aquaporins occur in groups of four (tetramers), with four pores that conduct water (one through each aquaporin molecule) and one central pore where the molecules meet. The latter, until now, had no known function. When simulated using two complementary methods (explicit sampling with full gas permeation and implicit ligand sampling), the team found both oxygen and carbon dioxide were exchanged through that central pore. Carbon dioxide was also transmitted through the four water pores, while oxygen passed through those pores only rarely. The research also found, however, that a plain lipid bilayer conducts two and a half times as much gas as one embedded with aquaporin tetramers. “The question is whether this pathway is significant and makes any difference in terms of total permeability of the membrane,” says Tajkhorshid.

[Figure: Simulations of the aquaporin tetramer found that carbon dioxide and oxygen are exchanged through the central pore, a site of previously unknown function. Image courtesy of Emad Tajkhorshid, a faculty associate of the NIH Resource for Macromolecular Modeling and Bioinformatics, and his UIUC colleagues Klaus Schulten, Yi Wang, and Jordi Cohen.]

The researchers hypothesize that, as with water permeability, aquaporins may be physiologically relevant to gas exchange when cells have dense, rigid lipid bilayers or when aquaporins occupy a major fraction of the membrane. Tajkhorshid plans to introduce point mutations inside the central pore and manipulate the behavior of a gating loop to see how that changes the conducting properties of the central pore. Meanwhile, Boron’s group is looking for a system in which gas conduction through aquaporins is a major pathway. Says Tajkhorshid: “Even if it’s 30 percent of total gas permeability, it becomes physiologically relevant because then you can control it.”

According to Nazih Nakhoul, PhD, research associate professor in biochemistry at Tulane University, “This idea of gas transport through membrane proteins is really gaining support. It’s interesting to see molecular dynamics simulations confirm some of the earliest findings.”
By Katharine Miller
2 BIOMEDICAL COMPUTATION REVIEW Summer 2007 www.biomedicalcomputationreview.org
Parkinson’s Culprit Modeled

Under a microscope, the curious protein clumps that dot the brains of Parkinson’s patients stick out like the culprits they are. But no one has yet caught the protein, alpha-synuclein, in the act of causing disease. Now, investigators report in an April 2007 issue of FEBS Journal that they’re getting closer: they’ve modeled alpha-synuclein’s early aggregation and offered a detailed mechanism for its participation in neuron death.

“This is not just the first computational model of alpha-synuclein,” says Igor Tsigelny, PhD, an author of the paper and a computational biologist at the San Diego Supercomputer Center. “Up to now, there was no molecular concept of the aggregation going on.”

In the brain cells of Parkinson’s patients, alpha-synuclein first starts to cluster as a proto-fibril. It then forms fibril chains, and finally ends up in the dense clumps of fibrils called Lewy bodies. Some researchers have suggested in the past few years that alpha-synuclein knocks off neurons right at the beginning of aggregation, long before it can be detected as a Lewy body. Biochemical and structural evidence hints that when a few alpha-synuclein molecules first self-assemble into proto-fibrils, they can form pore-like ring structures. These may interact with the cell membrane and allow ions to enter the cell. The entrance of ions such as Ca²⁺ could lead to neuron death.

The computer model created by Tsigelny and his colleagues at the University of California, San Diego, supports this theory, providing detailed dynamics of alpha-synuclein hexamers and pentamers and their interaction with the cell membrane. What’s more, the model shows that another synuclein in the cell, beta-synuclein, blocks alpha-synuclein’s ring-making, suggesting at least one avenue for future inhibitory drug development.

[Figure: Alpha-synuclein poses as a pentamer, pore-like, on the surface of a cell membrane. Courtesy of Igor Tsigelny]

Modeling such a complex aggregation wasn’t simple. Alpha-synuclein is a large protein (140 amino acids), and to model its hexamer interacting with the cell membrane required juggling around a million atoms, Tsigelny says.

Yet more than the size of alpha-synuclein, what made it difficult to model was its lack of structure. Alpha-synuclein is an intrinsically unstructured protein, one without a distinct three-dimensional shape. Most proteins consistently fold into a favored shape to do their jobs, a form that can be crystallized, imaged, and pored over. But unstructured proteins flop this way and that, even while performing their specific tasks, making them very difficult to pin down and study.

“We were not scared by an unstable protein,” Tsigelny states. And he and his coworkers developed an unusual “all-dynamic” approach to modeling the protein. None of the conformations are final; they are all considered intermediate and each may last only as long as half of a nanosecond. Nevertheless, Tsigelny says, even such fleeting intermediates may aggregate. The pore-like aggregates, they found, are far more stable than single molecules of alpha-synuclein.

Having this model “is one step forward,” says Hilal Lashuel, PhD, professor at the Swiss Federal Institute of Technology in Lausanne, Switzerland. The UCSD model provides a structural basis for testing the hypothesis that alpha-synuclein forms toxic pores, he adds. But Lashuel also cautions that only biochemical and in vivo studies can prove whether alpha-synuclein pokes holes in neurons. “Isolating the toxic species is really the most difficult question we are dealing with. You have to catch it in the act.”
By Louisa Dalton
Clustering Without Limits

Starting in preschool we all learn how to get organized. Typically, we start with pre-determined categories (dolls, trains, blocks); pre-set ideas about what belongs in each category (Barbie: doll; Thomas the Tank Engine: train) and a fixed number of bins to put things in.

But what if you started with none of those initial limitations? Could you still group the toys? It turns out that, in a computer, such sorting is not only possible, but extremely efficient. Using a novel algorithm called affinity propagation, researchers at the University of Toronto found that they can not only cluster lots of different kinds of data appropriately, but do it better and faster than other methods. The work was published in the February 16 issue of Science.

“Almost all existing techniques work on a hypothesis refinement basis: they start off with a set of assumed groups and iteratively refine them,” says Brendan Frey, PhD, associate professor of electrical and computer engineering at the University of Toronto, co-author of the paper. “To our knowledge, ours is the first algorithm to consider all possible groupings at once.”

The task sounds mind-boggling: There are a huge number of possible groupings. But affinity propagation handles that problem by sending messages between data points, pair-wise, so as to maximize the net similarity in each group. “Each message encapsulates or summarizes a whole distribution of possible groupings for one of the data points,” says Delbert Dueck, a PhD candidate in Frey’s lab. “No one has done that before.”

[Figure: Frey and Dueck use affinity propagation to cluster data around “exemplars”: data points that best represent their compatriots. In this graphic, after starting with an equal chance of serving as an exemplar, candidates for that job have already emerged (red dots). Each data point sends messages to each candidate exemplar conveying how well it represents the blue point compared to other candidate exemplars. And candidate exemplars send messages conveying their availability to serve as an exemplar for particular data points.]

Affinity propagation is based on an algorithm called belief propagation, which has been around in various incarnations for many years. But, say the authors, it’s an approach that has never been applied to clustering. “Certainly not to generic clustering of any type of data,” says Dueck. Indeed the algorithm is so generic that Frey and Dueck used it to analyze gene expression data, facial images, and airline routes, while other researchers have found applications in basketball statistics, the stock market and computer vision. And many tasks in computational biology require a computer to organize the data before using it to make predictions.

“Part of the attraction of the algorithm is that, although it was complicated to derive, it’s quite simple to implement and to get an intuitive feel for it,” says Frey. There are basically only two equations to it. “Sometimes we’ll give a talk and get emails from people who’ve implemented it the day after,” he says.

[Figure: If asked to cluster facial images, a standard clustering method (k-means clustering) would take up to a million years on a single computer to achieve the accuracy achieved by affinity propagation after five minutes.]

When the researchers looked at how well the algorithm performed compared to other clustering methods they found it remarkably efficient. “A problem our algorithm could solve in about five minutes on one computer would take other methods up to one million years to solve on that same computer,” says Frey.
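The message passing described in this article can be sketched in a few lines of plain Python. What follows is a minimal illustration, not the authors' code. It uses the two update rules as they are commonly stated for affinity propagation: a "responsibility" message from each point to each candidate exemplar, and an "availability" message sent back. The function names, the damping factor, the iteration count, the negative squared distance used as similarity, and the preference value on the diagonal are all assumptions chosen for the example.

```python
def similarity_matrix(points, preference):
    """Negative squared distance between 1-D points (an assumed similarity).

    The diagonal holds the 'preference': how willing each point is to be
    chosen as an exemplar. Its value here is an illustrative assumption.
    """
    n = len(points)
    return [[preference if i == k else -(points[i] - points[k]) ** 2
             for k in range(n)] for i in range(n)]


def affinity_propagation(S, damping=0.5, iterations=200):
    """Return, for each point, the index of the exemplar it selects.

    S must be a square similarity matrix with at least two points.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities  a(i, k)
    for _ in range(iterations):
        # Responsibility: how well suited k is as exemplar for i,
        # compared with the best competing candidate.
        for i in range(n):
            for k in range(n):
                best_other = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # Availability: how much support candidate k has gathered
        # from points other than i.
        for k in range(n):
            positive = [max(0.0, R[j][k]) for j in range(n)]
            support = sum(positive) - positive[k]  # sum over j != k
            for i in range(n):
                if i == k:
                    new_value = support
                else:
                    new_value = min(0.0, R[k][k] + support - positive[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new_value
    # Each point picks the candidate with the largest combined evidence.
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]


if __name__ == "__main__":
    data = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
    print(affinity_propagation(similarity_matrix(data, preference=-50.0)))
```

Note that the number of clusters is never passed in. In this sketch it emerges from the preference value: a lower preference makes points less willing to act as exemplars and tends to give fewer groups.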
Tim Hughes, PhD, of the Center for Cellular and Biomolecular Research at the University of Toronto, is considering using affinity propagation in his research. “It seems like it would do best when things really do form independent groups, and when the data are fairly sparse, so most of the correlation matrix can be dropped in early cycles,” he says. “I think it will work well with exon-profiling data or genome-tiling data, where there is also a constraint that the groups have to correspond to regions near each other on the chromosome.”
By Katharine Miller

Computer Vision that Mimics Human Vision

Our brains can recognize most of the things we pass on an evening stroll: Cars, buildings, trees, and people all register even at a great distance or from an odd angle. Now, a new computer vision program can do the same thing. It successfully rivals the human ability to rapidly recognize objects in a complex picture because it mimics how information flows during the initial stages of visual perception.

“We’ve built a model to be as close as possible to what is known about the human visual system,” explains Thomas Serre, PhD, a postdoctoral associate in the Center for Biological and Computational Learning at MIT and lead author of two papers recently published out of the lab run by Tomaso Poggio, PhD, at MIT’s McGovern Institute for Brain Research.

For decades, scientists have struggled to create computer programs that can recognize visual objects as well as humans can. Some computer systems excel at recognizing one particular object, but none are anywhere close to recognizing the wide range of objects observed by the human brain. Visual recognition is complicated by two conflicting goals: a program must be specific enough to discriminate between different objects, such as a person or a car, yet flexible enough to recognize the same type of object in different sizes, poses, and lighting.

To achieve these goals, Serre and colleagues used data recorded from real neurons in the visual system to program two fundamentally different kinds of virtual neurons called S (simple) and C (complex) units. S units recognize specific features of an image; C units monitor a range of S units in one area and allow for variation in position and size.

The researchers were surprised to find that a simple system, consisting of four alternating layers of S and C units, was able to classify pictures of a busy street scene as well as other leading mathematics-based computer vision systems, as described in the March 2007 issue of IEEE Transactions on Pattern Analysis and Machine Intelligence.

Serre’s team then built a more complex system, consisting of many S and C layers designed to closely match the flow of information in a human brain during the first 100-200 milliseconds of perception. This enhanced system performed as well as humans on a rapid object recognition task: distinguishing animals from non-animals when images were flashed in front of humans and computers. The work appeared in the April 2007 issue of the Proceedings of the National Academy of Sciences. The computer system even made errors similar to the errors made by humans, suggesting that the model recapitulates the early processes of the human visual system.

[Figure: When presented with a real-world street scene (left), Serre’s computer vision system successfully recognized pedestrians, cars, buildings, trees, sky, and the street (right). Although not pictured, the model also successfully identified bicycles. Note the error in this example: the model mistakenly classified a street sign as a pedestrian. Graphic courtesy of Stanley Bileschi, PhD, McGovern Institute for Brain Research at MIT.]

The model will be used as a tool by neuroscientists to better understand the human visual system, and also has practical applications for surveillance, driving assistance, and autonomous robotics. According to Poggio, the team’s next
goal is to extend the model to include the “back projections” from other parts of the brain that allow feedback processing of visual information after 200 milliseconds.

“This is the first demonstration that a purely bottom up approach to visual object recognition, inspired by recordings from the neurons in the brain, is effective as a practical computer vision system,” says Terry Sejnowski, PhD, head of the Computational Neurobiology Lab at the Salk Institute. “There is much more work to do, both to improve its performance, and also to use it to better understand how our own visual system works.”
By Matthew Busse, PhD

Nature Versus Nurture In Silico

Every generation, a few nonconformists crop up in tissue cultures of genetically identical cells. The question is: are the wayward simply born that way, or did something in the environment affect them? “You have these two possibilities: intrinsic or extrinsic, nature or nurture,” says Andras Paldi, PhD, a biologist at Genethon in France.

Now, Paldi and his colleagues have modeled such cultured cells to determine whether extrinsic or intrinsic influences play a key role in the spontaneous emergence of phenotypic variation. It turns out that for spatial patterns beyond randomness to arise, there has to be some effect of sensing neighboring cells; i.e., extrinsic factors must play a role. And the extrinsic model resembles results seen in real cells. The work appears in April in PLoS One.

[Figure: Agent-based computer models predict the pattern (left) produced when genetically identical cells have an inherent probability of changing (from green to red and vice versa), and the pattern (right) produced when cells are triggered to change by an extrinsic factor, such as cell density. Top images represent exponential growth; bottom are at equilibrium. Courtesy of Andras Paldi.]

Paldi’s work was motivated in part by the open question among stem cell biologists of what triggers a stem cell to differentiate. Why, in the same warm spot, getting the same rich media, do some cells differentiate and others stay stem cells? It is commonly assumed that this is because the decision to differentiate is intrinsic, that is, purely random.

To test that assumption, Paldi’s group started by designing two simple, multi-agent based models of a tissue culture plate. In each model, all cells act independently and can switch between two cell types: A or B. In the “extrinsic” model, A cells turn into B cells when it gets crowded, and back to A cells when they have more space. In the “intrinsic” model, each cell has fixed probabilities of switching from A to B and back again.

When the scientists ran the models, they found each produces a stable, heterogeneous population, yet they differ in the cell patterns. The intrinsic model predicts lone A cells distributed evenly throughout a largely B population. Extrinsic predicts that the A cells will cluster. The result held even though the cells were allowed to migrate.

This pattern difference allowed the researchers to compare their computational simulation with real cells. Using a muscle cell line that can switch between two distinct phenotypes, a stem-cell like progenitor state and a differentiated state, they found that the cell pattern mostly resembles that of the extrinsic model. Many of the rare, stem-cell like cells cluster; a few are solitary.

What’s important here, Paldi says, is that they find environment playing a role, a significant one. In the case of stem (progenitor) cells, it means neighbor cells can affect the differentiation process. “The stem cell nature is not an intrinsic property of the cell,” he says. “It is a property of the whole cell population.” Paldi further believes the work supports the effort to find a way of converting adult, differentiated cells into stem cells (and avoid the need for harvesting embryonic stem cells), a possibility that has not just scientific, but social and political implications as well.

Christa Muller-Sieburg, PhD, however, disputes that scientific conclusion. “The idea that mature cells can turn into stem cells is very attractive to many modelers but has little support through experimental data,” says the professor at the Sidney Kimmel Cancer Center.

Sui Huang, MD, PhD, at Children’s Hospital Boston, would have liked to see Paldi’s group perturb the cell line or the culture to confirm their model. But both he and Muller-Sieburg believe the study addressed an important question, that of heterogeneity of a genetically identical population of cells. And, says Huang, it certainly “contributes to the discussion in the community.”
By Louisa Dalton
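The two switching rules in the nature-versus-nurture article, intrinsic and extrinsic, can be contrasted in a small grid simulation. The following is a rough sketch under heavy simplifying assumptions, not a reconstruction of Paldi's models. Cells sit on a fixed square grid and do not divide or migrate, "crowded" means at least `crowd_threshold` occupied neighbouring squares, and the extrinsic rule is applied deterministically. All names and parameter values are hypothetical.

```python
import random

EMPTY, A, B = ".", "A", "B"


def make_plate(size, density, rng):
    """Square plate where each site holds an A cell with probability `density`."""
    return [[A if rng.random() < density else EMPTY for _ in range(size)]
            for _ in range(size)]


def occupied_neighbours(plate, row, col):
    """Count occupied sites among the (up to) eight surrounding squares."""
    size = len(plate)
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            if 0 <= r < size and 0 <= c < size and plate[r][c] != EMPTY:
                count += 1
    return count


def step_intrinsic(plate, p_ab, p_ba, rng):
    """Intrinsic rule: each cell switches type with a fixed probability."""
    new = [row[:] for row in plate]
    for r, row in enumerate(plate):
        for c, cell in enumerate(row):
            if cell == A and rng.random() < p_ab:
                new[r][c] = B
            elif cell == B and rng.random() < p_ba:
                new[r][c] = A
    return new


def step_extrinsic(plate, crowd_threshold):
    """Extrinsic rule: a cell is B when crowded and A when it has space."""
    new = [row[:] for row in plate]
    for r, row in enumerate(plate):
        for c, cell in enumerate(row):
            if cell != EMPTY:
                crowded = occupied_neighbours(plate, r, c) >= crowd_threshold
                new[r][c] = B if crowded else A
    return new


if __name__ == "__main__":
    rng = random.Random(1)
    plate = make_plate(12, 0.6, rng)
    for row in step_extrinsic(plate, crowd_threshold=5):
        print("".join(row))
```

Printing the plate after each rule gives a feel for the distinction the article draws. Under the intrinsic rule a cell's type is unrelated to its surroundings, while under the extrinsic rule it is determined by local density.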
Simulating Populations with Complex Diseases

Diabetes, breast cancer, multiple sclerosis, Alzheimer’s disease. All are associated with several genes’ alleles interacting in complex ways with one another and the environment. Now, using a computationally intensive method known as forward-time simulation of human populations, researchers are hoping to gain a better understanding of how such complex diseases become established.

“In a real population you just see people with the disease,” says Marek Kimmel, PhD, professor of statistics at Rice University and co-author of the work. “You don’t see who in the population has the disease genes because people carrying these genes do not necessarily become diseased.” But in the model population, he says, “you see both.” And the researchers’ approach allows them to simulate a very complicated scenario, including changes in types of selection pressure.

“This lets us evaluate how well statistical genetics tests determine what genes are responsible for the symptoms of a disease and how frequently those genes appear in the population.” That’s a non-trivial exercise, he says, because it has been impossible, until now, to compare the many existing gene-mapping methods head-to-head. The work was published in PLoS Genetics in March 2007.

[Figure labels: Cancer, Multiple Sclerosis, Diabetes]

Before now, the most commonly used approach to simulating diseases in human populations, called the “coalescent” method, worked by coalescing backward in time to a most-recent common ancestor. But it’s extremely difficult to take selection into account using the coalescent method, says co-author Bo Peng, PhD, a postdoctoral fellow at the University of Texas MD Anderson Cancer Center. Moreover, that approach gets too complicated if more than one disease gene is involved. So Peng and his colleagues turned to forward-time simulation, an approach that’s been around for about one hundred years.

But that technique is not without its problems. When a population evolves forward in time, there are simply too many possible outcomes. Most notably, when you introduce a disease allele, it can rapidly be eliminated and replaced with new alleles. So Peng came up with a trick: He pre-sets desired disease allele frequencies in the current generation, extrapolates them backward, and starts the simulation from there. As Kimmel puts it, “We are restricting potential variability in one aspect of the present in order to produce a simulation that resembles something close to the actual variability that exists now.”

The simulation uses a scripting language called simuPOP, a general-purpose forward-time simulation environment based on Python. The software is freely available at http://simupop.sourceforge.net, under a GPL license.

When Peng and his colleagues used their method to compare several gene mapping techniques they found that certain methods worked better for loci that were located distantly from one another; and other methods were more effective when loci were close together. Overall, though, says Kimmel, “We’re mildly pessimistic” about current gene mapping approaches. “When the number of loci involved in complex disease is greater than two, the methods rapidly lose their power.”

Until recently, gene mapping for complex diseases has been disappointing, he says. Loci identified in such efforts have later turned out to be statistical artifacts. “Our modeling could figure out if this is inevitable,” he says, and help guide people toward more effective approaches.

David Balding, PhD, a professor of statistical genetics at Imperial College in London, does similar work using forward-time simulations of large genomic regions. He has become pessimistic about the method’s usefulness for understanding complex diseases because no one really knows what kind of selection is going on. Nevertheless, he says, this work can be useful for studying selection itself. “People tend to look at selection one allele at a time,” he says, “But forward-time simulation lets us do it with complex interactions.”
By Katharine Miller
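To make the forward-time idea concrete, here is a minimal single-locus sketch in plain Python. It does not use simuPOP and it is not Peng's method. It is a textbook-style Wright-Fisher loop, assumed here purely for illustration, in which each generation applies selection against a disease allele and then random sampling (drift). The function names, the fitness parameter, and the population sizes are hypothetical.

```python
import random


def next_generation(freq, pop_size, fitness_disease, rng):
    """Advance the disease-allele frequency by one generation.

    Selection reweights the allele by its relative fitness; drift is
    modelled by drawing 2 * pop_size allele copies at random.
    """
    weighted = freq * fitness_disease
    total = weighted + (1.0 - freq)
    if total == 0.0:  # allele fixed but has zero fitness: nothing survives
        return 0.0
    p = weighted / total
    copies = sum(1 for _ in range(2 * pop_size) if rng.random() < p)
    return copies / (2 * pop_size)


def simulate(start_freq, pop_size, generations, fitness_disease=1.0, seed=0):
    """Return the allele-frequency trajectory, starting value included."""
    rng = random.Random(seed)
    freqs = [start_freq]
    for _ in range(generations):
        freqs.append(next_generation(freqs[-1], pop_size, fitness_disease, rng))
    return freqs


def fraction_lost(start_freq, pop_size, generations, replicates,
                  fitness_disease=1.0):
    """Share of replicate runs in which the disease allele has vanished."""
    lost = 0
    for seed in range(replicates):
        final = simulate(start_freq, pop_size, generations,
                         fitness_disease, seed)[-1]
        if final == 0.0:
            lost += 1
    return lost / replicates


if __name__ == "__main__":
    # A rare, mildly deleterious allele introduced into a small population.
    print(fraction_lost(0.01, pop_size=100, generations=50,
                        replicates=200, fitness_disease=0.95))
```

Running many replicates from a low starting frequency shows the problem the article describes: in a large share of runs the introduced allele simply disappears, so the simulated present does not contain the disease allele at the frequency one wants to study.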