SlideShare une entreprise Scribd logo
1  sur  1
PyPedal, an open source software package for pedigree analysis
John B. Cole
Animal Improvement Programs Laboratory, Agricultural Research Service, USDA, Beltsville, MD
Session 32
no. 23
Introduction
Pedigree Load Process
Program Organization
• Data integrity checks:
• Duplicate records eliminated
• Parents w/o records added to the pedigree
• Animals cannot appear as sires and dams
• Numeric or character IDs
• Animals added to NewPedigree object
• ID cross-references established
• Metadata computed and attached to pedigree
• Pedigree is renumbered (optional)
• Numerator relationship matrix formed and
attached to pedigree (optional)
• Missing informaion inferred (optional)
Other Measures of Diversity
Presentation of Results
• Quantitative data can be visualized (Figure 2)
• Printed reports can be prepared (Figure 3)
• Templates are provided for user-created
reports
PyPedal is an open source package written in
the Python programming language that provides
high-level tools for manipulating pedigrees. The
goal is to provide expressive tools for
exploratory data analysis.
• Many measures of pedigree diversity are
implemented in PyPedal:
• Ancestral and partial inbreeding
• Effective founder and ancestor numbers
• Founder genome equivalents
• Pedigree completeness
• One of the most common pedigree operations
is calculation of inbreeding and relationships.
• PyPedal originally used the recursive tabular
method of VanRaden (1992)
• Tested on a pedigree of 600,000 Ayrshires
• Relatively slow (function call overhead)
• PyPedal 2.0.4 has much faster inbreeding
routines than previous versions
• Meuwissen and Luo (1992)
• Quaas’s modified Meuwissen and Luo (1996)
• Tested on simulated pedigrees (Table 1)
Website and Documenation
• Website: http://pypedal.sourceforge.net/.
• Cole, J.B. 2007. PyPedal: A computer
program for pedigree analysis. Comp.
Electron. Agric. 57:107–113.
PyPedal is built as a series of modules (Figure
1), each of which groups related functions.
Third-party modules are used for matrix
manipulation, pedigree visualization and graph
drawing, and report generation.
pyp_newclasses
Pedigree,
animal, matrix,
and metadata
classes used by
PyPedal.
pyp_db
Save PyPedal
pedigrees into
and load them
from SQLite
tables.
pyp_graphics
Visualize
pedigrees and
relationship
matrices
(NRM).
pyp_reports
Create reports
from pedigree
database
(loaded in
pyp_db).
pyp_io
Save and load
NRM; read and
write pedigrees
used by other
packages.
pyp_utils
Load, reorder
and renumber
pedigrees; set
operations;
other tools.
pyp_metrics
Compute
inbreeding and
relationships;
identify related
animals.
pyp_netwoork
Apply network
analysis and
graph theory to
pedigrees.
pyp_nrm
Create, invert,
and decompose
and invert
NRM, recursion
in pedigrees.
pyp_demog
Demographic
reports, age
distributions,
etc.
Obtain Data Present resultsAnalyze data
Containers
Figure 1. Important PyPedal modules.
Input and Output
What is a pedigree?
A PyPedal pedigree is a complex object that
includes information about individual animals,
data about the group of animals in a pedigree,
and code for manipulating those data.
NewPedigree
Pedigree metadata
Relationship matrix
NewAnimal object
ANIMAL 1 RECORD
Animal ID: 1
Animal name: 7
Sire ID: 0
…
Birth Date: 01011900
Sex: m
CoI (f_a): 0.0
Founder: y
…
• The simplest way to get a pedigree is to read it
from a text file:
>>> p = pyp_newclasses.loadPedigree(options)
>>> print p
<PyPedal.pyp_newclasses.NewPedigree inst
ance at 0x10604a560>
• There are lots of other options, too.
Input Output
Plain text files Binary objects
Simulated pedigrees Adjacency matrices
SQLite databases
GEDCOM 5.5
GENES 1.20 (DBASE III)
Inbreeding and Relationships
Table 1. Performance of inbreeding
routines.
Size (n) Method Time (s) Speedup
100 VanRaden < 1 -
Meuwissen & Luo < 1 1x
Modified M&L < 1 1x
1,000 VanRaden 6 -
Meuwissen & Luo 1 6x
Modified M&L < 1 6x
10,000 VanRaden 304 -
Meuwissen & Luo 20 15x
Modified M&L 22 14x
100,000 VanRaden 26,726 -
Meuwissen & Luo 893 30x
Modified M&L 978 27x
• It’s a one-liner to compute coefficients of
inbreeding:
>>> fa = pyp_nrm.inbreeding(p, 
method='meu_luo')
• VanRaden’s method also provides
relationships:
>>> fa, reln = pyp_nrm.inbreeding(p, 
method=’vanraden’)
Figure 2. Average inbreeding of US
Ayrshires by birth year.
Figure 3. Sample page from a three-
generation pedigree book.
Other Features
• New methods to take unions and
intersections of pedigrees
• A  B = unique animals in A or B or both
• A  B = unique animals common to A and B
• Other operations, such as subtraction, can be
defined using these functions
• A – B = A - (A  B)
>>> difference = pedigree1 – pedigree2
• A + B = A  B
>>> sum = pedigree1 + pedigree2

Contenu connexe

En vedette

Heritable Traits in Man, Pedigree Analysis and Pedigree Application
Heritable Traits in Man, Pedigree Analysis and Pedigree ApplicationHeritable Traits in Man, Pedigree Analysis and Pedigree Application
Heritable Traits in Man, Pedigree Analysis and Pedigree ApplicationRica Joy Pontilar
 
Genetics pedigree problems
Genetics pedigree problemsGenetics pedigree problems
Genetics pedigree problemscallr
 
Pedigree presentation
Pedigree presentationPedigree presentation
Pedigree presentationjjcorrea121
 
Risk calculation in Medical Genetics
Risk calculation in Medical GeneticsRisk calculation in Medical Genetics
Risk calculation in Medical GeneticsYahya Noori, Ph.D
 

En vedette (7)

Heritable Traits in Man, Pedigree Analysis and Pedigree Application
Heritable Traits in Man, Pedigree Analysis and Pedigree ApplicationHeritable Traits in Man, Pedigree Analysis and Pedigree Application
Heritable Traits in Man, Pedigree Analysis and Pedigree Application
 
Pedigree
PedigreePedigree
Pedigree
 
HughesRiskApps Pedigree demo
HughesRiskApps Pedigree demoHughesRiskApps Pedigree demo
HughesRiskApps Pedigree demo
 
Genetics pedigree problems
Genetics pedigree problemsGenetics pedigree problems
Genetics pedigree problems
 
Pedigree presentation
Pedigree presentationPedigree presentation
Pedigree presentation
 
Risk calculation in Medical Genetics
Risk calculation in Medical GeneticsRisk calculation in Medical Genetics
Risk calculation in Medical Genetics
 
Theoretical Genetics
Theoretical GeneticsTheoretical Genetics
Theoretical Genetics
 

Similaire à PyPedal, an open source software package for pedigree analysis

MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)University of Washington
 
Data Management for Quantitative Biology - Database systems, May 7, 2015, Dr....
Data Management for Quantitative Biology - Database systems, May 7, 2015, Dr....Data Management for Quantitative Biology - Database systems, May 7, 2015, Dr....
Data Management for Quantitative Biology - Database systems, May 7, 2015, Dr....QBiC_Tue
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Dmitry Grapov
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsGaignard Alban
 
Knowledge extraction and visualisation using rule-based machine learning
Knowledge extraction and visualisation using rule-based machine learningKnowledge extraction and visualisation using rule-based machine learning
Knowledge extraction and visualisation using rule-based machine learningjaumebp
 
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...Data Con LA
 
Workshop nwav 47 - LVS - Tool for Quantitative Data Analysis
Workshop nwav 47 - LVS - Tool for Quantitative Data AnalysisWorkshop nwav 47 - LVS - Tool for Quantitative Data Analysis
Workshop nwav 47 - LVS - Tool for Quantitative Data AnalysisOlga Scrivner
 
Multivarite and network tools for biological data analysis
Multivarite and network tools for biological data analysisMultivarite and network tools for biological data analysis
Multivarite and network tools for biological data analysisDmitry Grapov
 
Panda Provenance
Panda ProvenancePanda Provenance
Panda ProvenanceVlad Vega
 
QUERY INVERSION TO FIND DATA PROVENANCE
QUERY INVERSION TO FIND DATA PROVENANCE QUERY INVERSION TO FIND DATA PROVENANCE
QUERY INVERSION TO FIND DATA PROVENANCE cscpconf
 
Multimedia Databases: Performance Measure Benchmarking Model (PMBM) Framework
Multimedia Databases: Performance Measure Benchmarking Model (PMBM) FrameworkMultimedia Databases: Performance Measure Benchmarking Model (PMBM) Framework
Multimedia Databases: Performance Measure Benchmarking Model (PMBM) Frameworktheijes
 
Docker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker, Inc.
 
Puppet Camp Melbourne 2014: Node Collaboration with PuppetDB
Puppet Camp Melbourne 2014: Node Collaboration with PuppetDB Puppet Camp Melbourne 2014: Node Collaboration with PuppetDB
Puppet Camp Melbourne 2014: Node Collaboration with PuppetDB Puppet
 
ICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database VirtualizationICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database VirtualizationBoris Glavic
 
Big data at experimental facilities
Big data at experimental facilitiesBig data at experimental facilities
Big data at experimental facilitiesIan Foster
 
LDV: Light-weight Database Virtualization
LDV: Light-weight Database VirtualizationLDV: Light-weight Database Virtualization
LDV: Light-weight Database VirtualizationTanu Malik
 
VariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn LangitVariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn LangitData Con LA
 
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...Paolo Missier
 
VariantSpark - a Spark library for genomics
VariantSpark - a Spark library for genomicsVariantSpark - a Spark library for genomics
VariantSpark - a Spark library for genomicsLynn Langit
 

Similaire à PyPedal, an open source software package for pedigree analysis (20)

MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
MMDS 2014: Myria (and Scalable Graph Clustering with RelaxMap)
 
Data Management for Quantitative Biology - Database systems, May 7, 2015, Dr....
Data Management for Quantitative Biology - Database systems, May 7, 2015, Dr....Data Management for Quantitative Biology - Database systems, May 7, 2015, Dr....
Data Management for Quantitative Biology - Database systems, May 7, 2015, Dr....
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)
 
Sharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reportsSharing massive data analysis: from provenance to linked experiment reports
Sharing massive data analysis: from provenance to linked experiment reports
 
Knowledge extraction and visualisation using rule-based machine learning
Knowledge extraction and visualisation using rule-based machine learningKnowledge extraction and visualisation using rule-based machine learning
Knowledge extraction and visualisation using rule-based machine learning
 
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
Big Data Day LA 2015 - Scalable and High-Performance Analytics with Distribut...
 
Workshop nwav 47 - LVS - Tool for Quantitative Data Analysis
Workshop nwav 47 - LVS - Tool for Quantitative Data AnalysisWorkshop nwav 47 - LVS - Tool for Quantitative Data Analysis
Workshop nwav 47 - LVS - Tool for Quantitative Data Analysis
 
Multivarite and network tools for biological data analysis
Multivarite and network tools for biological data analysisMultivarite and network tools for biological data analysis
Multivarite and network tools for biological data analysis
 
Panda Provenance
Panda ProvenancePanda Provenance
Panda Provenance
 
QUERY INVERSION TO FIND DATA PROVENANCE
QUERY INVERSION TO FIND DATA PROVENANCE QUERY INVERSION TO FIND DATA PROVENANCE
QUERY INVERSION TO FIND DATA PROVENANCE
 
Multimedia Databases: Performance Measure Benchmarking Model (PMBM) Framework
Multimedia Databases: Performance Measure Benchmarking Model (PMBM) FrameworkMultimedia Databases: Performance Measure Benchmarking Model (PMBM) Framework
Multimedia Databases: Performance Measure Benchmarking Model (PMBM) Framework
 
Docker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce HoffDocker in Open Science Data Analysis Challenges by Bruce Hoff
Docker in Open Science Data Analysis Challenges by Bruce Hoff
 
Puppet Camp Melbourne 2014: Node Collaboration with PuppetDB
Puppet Camp Melbourne 2014: Node Collaboration with PuppetDB Puppet Camp Melbourne 2014: Node Collaboration with PuppetDB
Puppet Camp Melbourne 2014: Node Collaboration with PuppetDB
 
ICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database VirtualizationICDE 2015 - LDV: Light-weight Database Virtualization
ICDE 2015 - LDV: Light-weight Database Virtualization
 
Big data at experimental facilities
Big data at experimental facilitiesBig data at experimental facilities
Big data at experimental facilities
 
LDV: Light-weight Database Virtualization
LDV: Light-weight Database VirtualizationLDV: Light-weight Database Virtualization
LDV: Light-weight Database Virtualization
 
VariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn LangitVariantSpark a library for genomics by Lynn Langit
VariantSpark a library for genomics by Lynn Langit
 
Analytics of analytics pipelines: from optimising re-execution to general Dat...
Analytics of analytics pipelines:from optimising re-execution to general Dat...Analytics of analytics pipelines:from optimising re-execution to general Dat...
Analytics of analytics pipelines: from optimising re-execution to general Dat...
 
The Genopolis Microarray database
The Genopolis Microarray databaseThe Genopolis Microarray database
The Genopolis Microarray database
 
VariantSpark - a Spark library for genomics
VariantSpark - a Spark library for genomicsVariantSpark - a Spark library for genomics
VariantSpark - a Spark library for genomics
 

Plus de John B. Cole, Ph.D.

Using genotypes to construct phenotypes for dairy cattle breeding programs an...
Using genotypes to construct phenotypes for dairy cattle breeding programs an...Using genotypes to construct phenotypes for dairy cattle breeding programs an...
Using genotypes to construct phenotypes for dairy cattle breeding programs an...John B. Cole, Ph.D.
 
If we would see further than others: research & technology today and tomorrow
If we would see further than others: research & technology today and tomorrowIf we would see further than others: research & technology today and tomorrow
If we would see further than others: research & technology today and tomorrowJohn B. Cole, Ph.D.
 
Using genotyping and whole-genome sequencing to identify causal variants asso...
Using genotyping and whole-genome sequencing to identify causal variants asso...Using genotyping and whole-genome sequencing to identify causal variants asso...
Using genotyping and whole-genome sequencing to identify causal variants asso...John B. Cole, Ph.D.
 
Genetic improvement programs for US dairy cattle
Genetic improvement programs for US dairy cattleGenetic improvement programs for US dairy cattle
Genetic improvement programs for US dairy cattleJohn B. Cole, Ph.D.
 
The hunt for a functional mutation affecting conformation and calving traits ...
The hunt for a functional mutation affecting conformation and calving traits ...The hunt for a functional mutation affecting conformation and calving traits ...
The hunt for a functional mutation affecting conformation and calving traits ...John B. Cole, Ph.D.
 
An updated version of lifetime net merit incorporating additional fertility t...
An updated version of lifetime net merit incorporating additional fertility t...An updated version of lifetime net merit incorporating additional fertility t...
An updated version of lifetime net merit incorporating additional fertility t...John B. Cole, Ph.D.
 
An updated version of lifetime net merit incorporating additional fertility t...
An updated version of lifetime net merit incorporating additional fertility t...An updated version of lifetime net merit incorporating additional fertility t...
An updated version of lifetime net merit incorporating additional fertility t...John B. Cole, Ph.D.
 
Genetic Evaluation of Stillbirth in US Holsteins Using a Sire-maternal Grands...
Genetic Evaluation of Stillbirth in US Holsteins Using a Sire-maternal Grands...Genetic Evaluation of Stillbirth in US Holsteins Using a Sire-maternal Grands...
Genetic Evaluation of Stillbirth in US Holsteins Using a Sire-maternal Grands...John B. Cole, Ph.D.
 
Stillbirth, Longevity and Fertility Update
Stillbirth, Longevity and Fertility UpdateStillbirth, Longevity and Fertility Update
Stillbirth, Longevity and Fertility UpdateJohn B. Cole, Ph.D.
 
New tools for genomic selection in dairy cattle
New tools for genomic selection in dairy cattleNew tools for genomic selection in dairy cattle
New tools for genomic selection in dairy cattleJohn B. Cole, Ph.D.
 
Opportunities for genetic improvement of health and fitness traits
Opportunities for genetic improvement of health and fitness traitsOpportunities for genetic improvement of health and fitness traits
Opportunities for genetic improvement of health and fitness traitsJohn B. Cole, Ph.D.
 
Genomic selection and systems biology – lessons from dairy cattle breeding
Genomic selection and systems biology – lessons from dairy cattle breedingGenomic selection and systems biology – lessons from dairy cattle breeding
Genomic selection and systems biology – lessons from dairy cattle breedingJohn B. Cole, Ph.D.
 
Use of NGS to identify the causal variant associated with a complex phenotype
Use of NGS to identify the causal variant associated with a complex phenotypeUse of NGS to identify the causal variant associated with a complex phenotype
Use of NGS to identify the causal variant associated with a complex phenotypeJohn B. Cole, Ph.D.
 
Genomic evaluation of dairy cattle health
Genomic evaluation of dairy cattle healthGenomic evaluation of dairy cattle health
Genomic evaluation of dairy cattle healthJohn B. Cole, Ph.D.
 
Uso e valore economico dei test genomici in azienda
Uso e valore economico dei test genomici in aziendaUso e valore economico dei test genomici in azienda
Uso e valore economico dei test genomici in aziendaJohn B. Cole, Ph.D.
 
The use and economic value of genomic testing for calves on dairy farms
The use and economic value of genomic testing for calves on dairy farmsThe use and economic value of genomic testing for calves on dairy farms
The use and economic value of genomic testing for calves on dairy farmsJohn B. Cole, Ph.D.
 
Genomic evaluation of low-heritability traits: dairy cattle health as a model
Genomic evaluation of low-heritability traits: dairy cattle health as a modelGenomic evaluation of low-heritability traits: dairy cattle health as a model
Genomic evaluation of low-heritability traits: dairy cattle health as a modelJohn B. Cole, Ph.D.
 
New applications of genomic technology in the US dairy industry
New applications of genomic technology in the US dairy industryNew applications of genomic technology in the US dairy industry
New applications of genomic technology in the US dairy industryJohn B. Cole, Ph.D.
 

Plus de John B. Cole, Ph.D. (20)

Crv 2015 jbc
Crv 2015 jbcCrv 2015 jbc
Crv 2015 jbc
 
Using genotypes to construct phenotypes for dairy cattle breeding programs an...
Using genotypes to construct phenotypes for dairy cattle breeding programs an...Using genotypes to construct phenotypes for dairy cattle breeding programs an...
Using genotypes to construct phenotypes for dairy cattle breeding programs an...
 
2015 AGIL Update
2015 AGIL Update2015 AGIL Update
2015 AGIL Update
 
If we would see further than others: research & technology today and tomorrow
If we would see further than others: research & technology today and tomorrowIf we would see further than others: research & technology today and tomorrow
If we would see further than others: research & technology today and tomorrow
 
Using genotyping and whole-genome sequencing to identify causal variants asso...
Using genotyping and whole-genome sequencing to identify causal variants asso...Using genotyping and whole-genome sequencing to identify causal variants asso...
Using genotyping and whole-genome sequencing to identify causal variants asso...
 
Genetic improvement programs for US dairy cattle
Genetic improvement programs for US dairy cattleGenetic improvement programs for US dairy cattle
Genetic improvement programs for US dairy cattle
 
The hunt for a functional mutation affecting conformation and calving traits ...
The hunt for a functional mutation affecting conformation and calving traits ...The hunt for a functional mutation affecting conformation and calving traits ...
The hunt for a functional mutation affecting conformation and calving traits ...
 
An updated version of lifetime net merit incorporating additional fertility t...
An updated version of lifetime net merit incorporating additional fertility t...An updated version of lifetime net merit incorporating additional fertility t...
An updated version of lifetime net merit incorporating additional fertility t...
 
An updated version of lifetime net merit incorporating additional fertility t...
An updated version of lifetime net merit incorporating additional fertility t...An updated version of lifetime net merit incorporating additional fertility t...
An updated version of lifetime net merit incorporating additional fertility t...
 
Genetic Evaluation of Stillbirth in US Holsteins Using a Sire-maternal Grands...
Genetic Evaluation of Stillbirth in US Holsteins Using a Sire-maternal Grands...Genetic Evaluation of Stillbirth in US Holsteins Using a Sire-maternal Grands...
Genetic Evaluation of Stillbirth in US Holsteins Using a Sire-maternal Grands...
 
Stillbirth, Longevity and Fertility Update
Stillbirth, Longevity and Fertility UpdateStillbirth, Longevity and Fertility Update
Stillbirth, Longevity and Fertility Update
 
New tools for genomic selection in dairy cattle
New tools for genomic selection in dairy cattleNew tools for genomic selection in dairy cattle
New tools for genomic selection in dairy cattle
 
Opportunities for genetic improvement of health and fitness traits
Opportunities for genetic improvement of health and fitness traitsOpportunities for genetic improvement of health and fitness traits
Opportunities for genetic improvement of health and fitness traits
 
Genomic selection and systems biology – lessons from dairy cattle breeding
Genomic selection and systems biology – lessons from dairy cattle breedingGenomic selection and systems biology – lessons from dairy cattle breeding
Genomic selection and systems biology – lessons from dairy cattle breeding
 
Use of NGS to identify the causal variant associated with a complex phenotype
Use of NGS to identify the causal variant associated with a complex phenotypeUse of NGS to identify the causal variant associated with a complex phenotype
Use of NGS to identify the causal variant associated with a complex phenotype
 
Genomic evaluation of dairy cattle health
Genomic evaluation of dairy cattle healthGenomic evaluation of dairy cattle health
Genomic evaluation of dairy cattle health
 
Uso e valore economico dei test genomici in azienda
Uso e valore economico dei test genomici in aziendaUso e valore economico dei test genomici in azienda
Uso e valore economico dei test genomici in azienda
 
The use and economic value of genomic testing for calves on dairy farms
The use and economic value of genomic testing for calves on dairy farmsThe use and economic value of genomic testing for calves on dairy farms
The use and economic value of genomic testing for calves on dairy farms
 
Genomic evaluation of low-heritability traits: dairy cattle health as a model
Genomic evaluation of low-heritability traits: dairy cattle health as a modelGenomic evaluation of low-heritability traits: dairy cattle health as a model
Genomic evaluation of low-heritability traits: dairy cattle health as a model
 
New applications of genomic technology in the US dairy industry
New applications of genomic technology in the US dairy industryNew applications of genomic technology in the US dairy industry
New applications of genomic technology in the US dairy industry
 

Dernier

Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxMohamedFarag457087
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professormuralinath2
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learninglevieagacer
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptxryanrooker
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxDiariAli
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
An introduction on sequence tagged site mapping
An introduction on sequence tagged site mappingAn introduction on sequence tagged site mapping
An introduction on sequence tagged site mappingadibshanto115
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...Monika Rani
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Silpa
 
Dr. E. Muralinath_ Blood indices_clinical aspects
Dr. E. Muralinath_ Blood indices_clinical  aspectsDr. E. Muralinath_ Blood indices_clinical  aspects
Dr. E. Muralinath_ Blood indices_clinical aspectsmuralinath2
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Silpa
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsOrtegaSyrineMay
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Serviceshivanisharma5244
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 

Dernier (20)

Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
An introduction on sequence tagged site mapping
An introduction on sequence tagged site mappingAn introduction on sequence tagged site mapping
An introduction on sequence tagged site mapping
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
Dr. E. Muralinath_ Blood indices_clinical aspects
Dr. E. Muralinath_ Blood indices_clinical  aspectsDr. E. Muralinath_ Blood indices_clinical  aspects
Dr. E. Muralinath_ Blood indices_clinical aspects
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICEPATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 

PyPedal, an open source software package for pedigree analysis

  • 1. PyPedal, an open source software package for pedigree analysis John B. Cole Animal Improvement Programs Laboratory, Agricultural Research Service, USDA, Beltsville, MD Session 32 no. 23 Introduction Pedigree Load Process Program Organization • Data integrity checks: • Duplicate records eliminated • Parents w/o records added to the pedigree • Animals cannot appear as sires and dams • Numeric or character IDs • Animals added to NewPedigree object • ID cross-references established • Metadata computed and attached to pedigree • Pedigree is renumbered (optional) • Numerator relationship matrix formed and attached to pedigree (optional) • Missing informaion inferred (optional) Other Measures of Diversity Presentation of Results • Quantitative data can be visualized (Figure 2) • Printed reports can be prepared (Figure 3) • Templates are provided for user-created reports PyPedal is an open source package written in the Python programming language that provides high-level tools for manipulating pedigrees. The goal is to provide expressive tools for exploratory data analysis. • Many measures of pedigree diversity are implemented in PyPedal: • Ancestral and partial inbreeding • Effective founder and ancestor numbers • Founder genome equivalents • Pedigree completeness • One of the most common pedigree operations is calculation of inbreeding and relationships. • PyPedal originally used the recursive tabular method of VanRaden (1992) • Tested on a pedigree of 600,000 Ayrshires • Relatively slow (function call overhead) • PyPedal 2.0.4 has much faster inbreeding routines than previous versions • Meuwissen and Luo (1992) • Quaas’s modified Meuwissen and Luo (1996) • Tested on simulated pedigrees (Table 1) Website and Documenation • Website: http://pypedal.sourceforge.net/. • Cole, J.B. 2007. PyPedal: A computer program for pedigree analysis. Comp. Electron. Agric. 57:107–113. PyPedal is built as a series of modules (Figure 1), each of which groups related functions. Third-party modules are used for matrix manipulation, pedigree visualization and graph drawing, and report generation. pyp_newclasses Pedigree, animal, matrix, and metadata classes used by PyPedal. pyp_db Save PyPedal pedigrees into and load them from SQLite tables. pyp_graphics Visualize pedigrees and relationship matrices (NRM). pyp_reports Create reports from pedigree database (loaded in pyp_db). pyp_io Save and load NRM; read and write pedigrees used by other packages. pyp_utils Load, reorder and renumber pedigrees; set operations; other tools. pyp_metrics Compute inbreeding and relationships; identify related animals. pyp_netwoork Apply network analysis and graph theory to pedigrees. pyp_nrm Create, invert, and decompose and invert NRM, recursion in pedigrees. pyp_demog Demographic reports, age distributions, etc. Obtain Data Present resultsAnalyze data Containers Figure 1. Important PyPedal modules. Input and Output What is a pedigree? A PyPedal pedigree is a complex object that includes information about individual animals, data about the group of animals in a pedigree, and code for manipulating those data. NewPedigree Pedigree metadata Relationship matrix NewAnimal object ANIMAL 1 RECORD Animal ID: 1 Animal name: 7 Sire ID: 0 … Birth Date: 01011900 Sex: m CoI (f_a): 0.0 Founder: y … • The simplest way to get a pedigree is to read it from a text file: >>> p = pyp_newclasses.loadPedigree(options) >>> print p <PyPedal.pyp_newclasses.NewPedigree inst ance at 0x10604a560> • There are lots of other options, too. Input Output Plain text files Binary objects Simulated pedigrees Adjacency matrices SQLite databases GEDCOM 5.5 GENES 1.20 (DBASE III) Inbreeding and Relationships Table 1. Performance of inbreeding routines. Size (n) Method Time (s) Speedup 100 VanRaden < 1 - Meuwissen & Luo < 1 1x Modified M&L < 1 1x 1,000 VanRaden 6 - Meuwissen & Luo 1 6x Modified M&L < 1 6x 10,000 VanRaden 304 - Meuwissen & Luo 20 15x Modified M&L 22 14x 100,000 VanRaden 26,726 - Meuwissen & Luo 893 30x Modified M&L 978 27x • It’s a one-liner to compute coefficients of inbreeding: >>> fa = pyp_nrm.inbreeding(p, method='meu_luo') • VanRaden’s method also provides relationships: >>> fa, reln = pyp_nrm.inbreeding(p, method=’vanraden’) Figure 2. Average inbreeding of US Ayrshires by birth year. Figure 3. Sample page from a three- generation pedigree book. Other Features • New methods to take unions and intersections of pedigrees • A  B = unique animals in A or B or both • A  B = unique animals common to A and B • Other operations, such as subtraction, can be defined using these functions • A – B = A - (A  B) >>> difference = pedigree1 – pedigree2 • A + B = A  B >>> sum = pedigree1 + pedigree2