This document summarizes the bio-banking and metagenomics platforms used by the International Livestock Research Institute (ILRI) for pathogen discovery in Africa. ILRI operates a bio-repository (biobank) containing over 340,000 samples from livestock, wildlife, humans and insects across Africa. Samples are collected directly in the field using standardized metadata collection tools. The biobank integrates with a metagenomics platform and high-performance computing systems to conduct high-throughput sequencing and analysis without prior knowledge of the pathogens present. The ILRI facilities support sample processing, nucleic acid extraction, sequencing and bioinformatics analysis. The platforms have identified several pathogens in the biobank, and outputs are shared publicly and with collaborators.
Bio-banking and metagenomics platforms for pathogen discovery
1. Bio-banking and Metagenomics Platforms for Pathogen Discovery
George N. Michuki
International Livestock Research Institute (ILRI)
Sequencing, Finishing, Analysis in the Future Meeting
Santa Fe, New Mexico
29-31 May 2013
2. Introduction
• Why pathogen discovery?
• Increasing emergence and re-emergence of zoonotic infectious diseases in Africa
• Why Bio-repository (Biobank)?
• Samples are valuable and sampling is expensive
• Why metagenomics?
• Looking for what you don’t know
• High performance computing systems
• High throughput data
4. Sample collection
• Use the same sample to ask multiple questions
• Software:
• ODK suite - captures metadata
• Tarakibu - collects GPS-tagged data, time-stamped to the second
• Ukasimu - an aliquoting (splitting) system that helps field personnel aliquot the samples collected in the field
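The kind of record these tools capture can be sketched as follows. This is a minimal illustration, not the actual ODK, Tarakibu or Ukasimu data model: the class name `FieldSample`, the functions `record_sample` and `aliquot_labels`, the label format, the sample ID and the coordinates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class FieldSample:
    """Hypothetical field record: one sample with GPS position and timestamp."""
    sample_id: str
    sample_type: str
    latitude: float
    longitude: float
    collected_at: datetime  # kept at one-second resolution


def record_sample(sample_id, sample_type, latitude, longitude, when=None):
    """Build a record, truncating the timestamp to whole seconds."""
    when = when or datetime.now(timezone.utc)
    return FieldSample(sample_id, sample_type, latitude, longitude,
                       when.replace(microsecond=0))


def aliquot_labels(sample, count):
    """Return one label per aliquot so each split stays traceable to its parent."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [f"{sample.sample_id}-A{n:02d}" for n in range(1, count + 1)]


# Illustrative values only: the ID, coordinates and time are made up.
sample = record_sample(
    "BLD0001", "blood", -1.2697, 36.7225,
    datetime(2013, 5, 29, 9, 30, 15, 123456, tzinfo=timezone.utc),
)
labels = aliquot_labels(sample, 3)
```

Deriving each aliquot label from the parent sample ID is one simple way to let the same sample answer multiple questions later without losing track of where each tube came from.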
5. The Bio-bank
• Currently has 340,000 samples
• Samples include blood, tissues and semen, among others, from livestock, wildlife, humans and insects collected from East, West and Central Africa, among other regions
9. Genomics platform facilities
• High-throughput sample processing
– MagNA Pure LC instrument
– MagNA Lyser
• Real-time PCR
– ABI real-time cyclers 7500/7900
– LightCycler Nano
• HRM analysis
10. Nucleic Acid Processing: METAGENOMIC APPROACH
• Sample types: blood/serum, tissue/vectors, cell culture supernatants
• Nucleic acid extraction
– DNA: nebulisation, giving fragmented DNA (100 bp to 600 bp)
– RNA: random labelling and cDNA synthesis, giving fragmented cDNA (100 bp to 600 bp)
• 454 genome sequencing
• Targeted approach (labelled twice in the diagram)
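The 100 bp to 600 bp fragment window can be illustrated in code. This is only a sketch of an in-silico length check, assuming fragments are available as plain sequence strings; the slide itself describes a wet-lab step, and the function names and example sequences below are hypothetical.

```python
MIN_BP = 100  # lower bound of the fragment window stated on the slide
MAX_BP = 600  # upper bound of the fragment window stated on the slide


def in_size_window(sequence, min_bp=MIN_BP, max_bp=MAX_BP):
    """True if the fragment length falls inside the inclusive window."""
    return min_bp <= len(sequence) <= max_bp


def filter_fragments(fragments):
    """Keep only the named fragments whose length is within the window."""
    return {name: seq for name, seq in fragments.items() if in_size_window(seq)}


# Made-up sequences of 40 bp, 200 bp and 800 bp.
fragments = {
    "frag_short": "ACGT" * 10,
    "frag_ok": "ACGT" * 50,
    "frag_long": "ACGT" * 200,
}
kept = filter_fragments(fragments)
```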
11. Bioinformatics - HPC
• The bioinformatics platform has 88 compute cores, 31 TB of network-attached GlusterFS storage and backup systems.
• Variety of commercial and custom analysis pipelines: http://hpc.ilri.cgiar.org/wiki/listofsoftware
13. Backup
You may call it too much... we call it paranoia...
• Hourly snapshots of the whole system (external)
• Daily snapshots of the database at 3:15 am (internal)
• Incremental backups every day (BackupPC)
• Full backups every 5 days (BackupPC)
• Daily database snapshots synced to the cloud
• Manual backups when updating the database
• BackupPC data is saved on a RAID array, which provides increased storage functions and reliability through redundancy
“We've got a good system in place. Guys know their roles, and we've got capable backup. If Mike isn't 100 percent, then Matt will step right in.”
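The rotation of full and incremental backups can be sketched as a small scheduling function. This is an illustration of the stated cadence only, not BackupPC's own configuration or scheduler. It assumes a full backup replaces the incremental on full-backup days, which the slide does not specify; `backup_kind` and the start date are hypothetical.

```python
from datetime import date, timedelta

FULL_INTERVAL_DAYS = 5  # "Full backups every 5 days"


def backup_kind(day, first_full):
    """Return which backup runs on `day`, counting from the first full backup.

    Assumption: on a full-backup day no separate incremental is taken.
    """
    if day < first_full:
        raise ValueError("day precedes the first full backup")
    elapsed = (day - first_full).days
    return "full" if elapsed % FULL_INTERVAL_DAYS == 0 else "incremental"


# Illustrative start date; one entry per day for a week.
start = date(2013, 5, 1)
plan = [backup_kind(start + timedelta(days=n), start) for n in range(7)]
```

With this cadence a restore never needs more than one full backup plus at most four incrementals.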
14. Outputs – project specific
• AVID local database and GenBank:
o Dugbe virus
o Semliki Forest virus
o Bunyamwera virus
o Partial Rift Valley fever virus
o Babanki virus
o West Nile virus
o Ndumu viruses
o Typing of mosquitoes using intron regions
15. Outputs with collaborators
• Non-AVID:
o ECF vaccine quality check – ILRI
o Equine encephalosis virus (OVI – South Africa)
o Bluetongue virus (OVI – South Africa)
o RVF viruses (OVI – South Africa)
o Newcastle disease virus – ILRI – in GenBank
o Pigeon paramyxovirus – KWS – in GenBank
o Plasmodium falciparum – Kilifi Wellcome Trust
o MHC class 1 and 2 – ILRI vaccine group
o Chikungunya viruses
o Ndumu virus from pigs – in GenBank
16. Outputs in public
• In GenBank:
– Accessions: KC243146.1, JQ217420.1, JQ217419.1, JQ217418.1, JX518532.1, JN989958.1, JN989957.1, …
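Each identifier above is in accession.version form, where the number after the dot is the sequence version. The sketch below splits the listed identifiers into those two parts; the helper name `split_accession` is hypothetical, and the code covers only the seven accessions spelled out on the slide, not the elided ones.

```python
def split_accession(identifier):
    """Split 'ACCESSION.VERSION' into (accession, version as int)."""
    accession, dot, version = identifier.strip().rpartition(".")
    if not dot or not accession or not version.isdigit():
        raise ValueError(f"not in accession.version form: {identifier!r}")
    return accession, int(version)


# Only the accessions explicitly listed on the slide.
listed = [
    "KC243146.1", "JQ217420.1", "JQ217419.1", "JQ217418.1",
    "JX518532.1", "JN989958.1", "JN989957.1",
]
parsed = [split_accession(item) for item in listed]
```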
17. Acknowledgements
• Team:
– Steve Kemp – Genomics and ILRI AVID team leader
– George Michuki – Wet lab and bioinformatics
– Absolomon Kihara – Database, security and software
– Cecilia N. Rumberia – Wet lab
– Alan Orth – Linux support and admin
– Anne Fischer – Bioinformatics support (ICIPE-ILRI)
• Funding:
– Google.org
– CGIAR – Research Program on Agriculture for Nutrition and Health
18. The presentation has a Creative Commons licence. You are free to re-use or distribute this work, provided credit is given to ILRI.
better lives through livestock
ilri.org