2. A gap analysis approach for CGIAR crops
1. Brief history of gap analysis
2. Targets and outputs in the Genebank platform
3. A simple method to assess the geography of
collections and their representativeness
4. An opportunity to develop more
understanding of drivers of diversity at
taxonomic, trait and genetic levels
3. Brief history of gap analysis
• Term first coined for in situ wildlife conservation
• Various authors have adapted the concept for ex situ conservation
analyses, most notably:
• Maxted et al. (2008) – Vigna spp.
• Ramirez-Villegas et al. (2010) – Phaseolus spp.
• Shehadeh et al. (2013) – Lathyrus spp.
• 2008: Development of gap analysis methodology for crop wild relatives
• 2008–2009: GPG2 analyses on key genepools (CWR + cultivated
materials)
• 2011–2016: Global assessment of CWR conservation gaps
Gap analysis is a tool used in wildlife conservation to identify gaps in conservation lands (e.g.,
protected areas and nature reserves) or other wildlands where significant plant and animal
species and their habitat or important ecological features occur (Scott and Schipper, 2006).
7. Methodology for cultivated species
CIAT & IRRI (unpublished) – Sorghum bicolor – data from the ICRISAT genebank (through SINGER)
• [1] Gather data from SINGER (Genesys) and GBIF
• [2] Assess variation in density of accessions
• [3] Assess environmental similarity between accessions and the reference distribution
• [4] Combine [2] and [3] to produce gap maps
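The combination of [2] and [3] can be sketched as a per-cell score. This is a minimal illustration only: the function name `gap_score`, the equal weighting of the two layers, and the toy grids are assumptions made here, not the unpublished CIAT & IRRI procedure.

```python
import numpy as np

def gap_score(density, env_similarity):
    """Combine accession density [2] and environmental similarity [3]
    into a gap score in [0, 1]; higher means a more likely gap.

    Assumption: both layers are weighted equally. env_similarity is
    expected to be already scaled to [0, 1].
    """
    density = np.asarray(density, dtype=float)
    env_similarity = np.asarray(env_similarity, dtype=float)
    # Rescale accession density to [0, 1] by its maximum.
    max_density = density.max()
    if max_density > 0:
        density_scaled = density / max_density
    else:
        density_scaled = np.zeros_like(density)
    # A cell is a gap when it has few accessions nearby AND its
    # environment is poorly matched by existing accessions.
    return 1.0 - 0.5 * (density_scaled + env_similarity)

# Hypothetical 2 x 2 grids (made-up values for illustration).
density = np.array([[0, 2], [10, 5]])
env_similarity = np.array([[0.1, 0.4], [0.9, 0.5]])
gaps = gap_score(density, env_similarity)
```

Cells with no nearby accessions and low environmental similarity score close to 1, while densely sampled, well-matched cells score close to 0.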
8. Genebank platform targets and outputs
Targets
• Representation of crop genepools in ex situ conservation quantified
• Gaps in at least 5 crop genepools addressed
Outputs
• Representation of genetic (G), taxonomic (T), geographical (Ge) and
environmental (E) diversity and traits improved
Key strategic points
• Identification of duplicates using genomic information at accession level
• Global analysis of the diversity of collections to assess G, T, and Ge gaps
• Use of GIS-based tools with focus on threatened germplasm
9. Part 1: simple method to assess geography of collections
and their representativeness
Address major limitations in the GPG2 approach
by
• Including other drivers of germplasm
distribution, e.g. language and culture,
land-use change, seed and delivery systems
• Using more robust estimators of species
distributions through niche-based models
• Enhancing taxonomic resolution when
possible, e.g. looking at races individually
instead of at the entire species
Lasky et al. (2015) Sci. Adv.
10. Part 2: An opportunity to develop our understanding of
drivers of diversity at taxonomic, trait and genetic levels
• One key objective of ex situ conservation is to keep material
at hand so that breeders can search it for adaptive traits.
• Hence, our approach should seek to address
some key questions:
• Which traits are priorities?
• Which alleles control such traits?
• What conditions the presence of these alleles
(geography, culture, environment)?
• What is the “global” (likely) and “conserved”
distribution of the trait?
• How is trait expression affected by environment
(and management)? (i.e. GxExM)
• How important?
• Where is it distributed?
11. Drivers of landrace genetic diversity
Sorghum in Africa
• Evidence of cultural factors
shaping crop diversity.
• Close association of linguistic
families and crop diversity.
Westengen et al 2014, PNAS
What conditions the presence of these alleles (geography, culture, environment)?
12. Drivers of landrace genetic diversity
Maize in Oaxaca (México)
• Traits vs genetics.
• Culture and environment as
drivers of morphological and
agronomic differences.
• Genetic distance small.
• Low population structure.
Perales et al 2005, PNAS
What conditions the presence of these alleles (geography, culture, environment)?
13. Population structure and environment
• Analyzed 104 wild and 297 cultivated
accessions
• Drought stress indices and population
structure are useful for genome-wide
genetic-environmental associations
What conditions the presence of these alleles (geography,
culture, environment)?
14. What is the global distribution of a trait and how is its
expression affected by environment?
• Germplasm characterization
provides “G” component of trait,
with little “E” impact.
• To predict distribution, we need “E”
impact on trait, too.
• Analogues may help find areas with
characteristics which are similar to
evaluation sites/years
• Some traits may have small “E”
component
• Breeding trials may help assess
GxE for a sub-set of accessions
Figure panels: 100-seed weight; climate analogues of Patancheru
Sorghum 100-seed weight (CIAT & ICRISAT, unpublished)
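One simple way to look for climate analogues of an evaluation site is to rank candidate locations by a standardized distance in climate space. The sketch below assumes two climate variables and uses made-up values (they are not the actual climate of Patancheru); the function name `climate_dissimilarity` and the choice of a standardized Euclidean distance are illustrative assumptions, not the method behind the figure above.

```python
import numpy as np

def climate_dissimilarity(reference, candidates):
    """Standardized Euclidean distance between a reference site and
    each candidate site; smaller values mean closer analogues."""
    reference = np.asarray(reference, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    # Scale each variable by its spread across the candidate sites so
    # that temperature and precipitation contribute comparably.
    scale = candidates.std(axis=0)
    scale[scale == 0] = 1.0
    diff = (candidates - reference) / scale
    return np.sqrt((diff ** 2).sum(axis=1))

# Hypothetical values: [mean annual temperature (deg C), annual rainfall (mm)]
reference_site = [26.0, 900.0]
candidate_sites = np.array([
    [26.5, 880.0],   # warm, similar rainfall
    [18.0, 1500.0],  # cool and wet
    [27.0, 400.0],   # warm and dry
])

distances = climate_dissimilarity(reference_site, candidate_sites)
ranking = np.argsort(distances)  # candidate indices, closest analogue first
```

Trait evaluations from the reference site would then be most transferable to the top-ranked candidates, subject to how large the “E” component of the trait is.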
15. Final remarks on approach
• Take stock of methods and current thinking on genetic
diversity of CGIAR crops (e.g. a systematic literature review
paper)
• Need heavy input from genebank managers and their
groups, perhaps through visits (1–2 months) and joint work
at their stations – key to tap into existing knowledge
• Input from breeders for trait prioritization
• Data organization will be key – a long-term investment
• There is more we could do than time allows, so we need to
prioritize (e.g. at trait level)
But this approach has two major limitations:
* It considers no variable other than climate as a driver of the “environmental representativeness” of the genebank accessions (i.e. it assumes that the genetics of the accessions are entirely related to climate).
* It assumes that wherever there is little or no representativeness, it should be possible to find genetic material of value to the genebank. In assuming this, the method ignores the distribution and cropping of improved seed, and other changes that are typical of managed systems (e.g. changes in response to climate variability and change).
Nevertheless, as a first (basic) approach it gives an idea of which kinds of environments are sampled and which are not.
Go point by point. Start with “which traits”, saying that breeders need to be involved in defining priority traits, and that some consultation was done in the past by IFPRI under the Global Futures and Strategic Foresight project.
Then move on to the second point by saying that genetic information, environmental data, and germplasm evaluations can be used to explore relationships between genetic and trait information.
Then move on to the other points and use the slides for these.
* For example, genetic structure has been found to be conditioned by language in sorghum in Africa
* But at a more local scale, language was not found to be a driving factor of population genetic structure
* Regarding environment, a study in beans also reported that population structure bears some relationship with drought
* Lasky et al. (2015) found that E (a linear combination of variables) explains ~31% of G (a linear combination of SNPs), but there was strong geographic structure in the E variables used, so it is difficult to ascertain whether the driver is truly E or something else (dispersal limitation or isolation by distance).
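The confounding described in the last note can be illustrated with simulated data. In the sketch below, both the environmental variable and the genetic score are driven by geography alone, so E appears to explain a large share of G until geography is controlled for. Everything here is an assumption for illustration: the data are simulated, the variables are univariate stand-ins for the linear combinations in the study, and the helper names `r_squared` and `residualize` are hypothetical.

```python
import numpy as np

def _design(X, n):
    """Design matrix with an intercept column."""
    return np.column_stack([np.ones(n), X])

def residualize(y, X):
    """Residuals of y after ordinary least squares on X (with intercept)."""
    X1 = _design(X, len(y))
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ coef

def r_squared(y, X):
    """Share of variance in y explained by X (with intercept)."""
    return 1.0 - residualize(y, X).var() / y.var()

rng = np.random.default_rng(0)
n = 500
geo = rng.normal(size=(n, 2))                # simulated coordinates
env = geo[:, 0] + 0.5 * rng.normal(size=n)   # E follows geography
gen = geo[:, 0] + 0.5 * rng.normal(size=n)   # G follows geography, not E

# Naive fit: E seems to explain a large share of G.
r2_env = r_squared(gen, env)

# Partial fit: remove geography from both sides first.
r2_env_given_geo = r_squared(residualize(gen, geo), residualize(env, geo))
```

When the naive value is high but the partial value is near zero, the apparent E effect is indistinguishable from geographic structure, which is the ambiguity the note points out.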