Pedigree-Based Methods
• Positional Cloning: Identification of a gene for a particular disease based on its
location in the genome, determined by a collection of methods including linkage
analysis, genomic (physical) mapping, and bioinformatics
• Founder Gene Approach: Study of a population descended from a small group of
founders drawn from a genetically diverse population; the resulting loss of genetic
diversity (the founder effect) makes disease alleles easier to trace
Pedigree-Independent Methods
• Candidate Gene Approach: Tests for associations between genetic variants within
pre-specified genes of interest and phenotypes or disease states
• Genome-Wide Association Studies: Examination of many common genetic variants
in different individuals to see whether any variant is associated with a disease phenotype
Sr No | Name | Web Address | Reference
1 | T1Dbase | http://www.t1dbase.org | (Hulbert et al. 2007)
2 | COSMIC | http://www.sanger.ac.uk/genetics/CGP/cosmic/ | (Forbes et al. 2008)
3 | The European Genome-Phenome Archive | https://www.ebi.ac.uk/ega/ | (Church et al. 2010)
4 | ModSNP | modsnp.expasy.org/ | (Yip et al. 2004)
5 | SwissVar | http://swissvar.expasy.org/ | (Mottaz et al. 2010)
6 | HGMD | http://www.hgmd.cf.ac.uk/ac/index.php | (Stenson et al. 2003)
7 | Catalog of published Genome Wide Association Studies (NHGRI) | http://www.genome.gov/gwastudies/ | (Gong et al. 2011)
Scope of a Genetic Association Study
• Candidate gene
◦ Known functional variants
◦ Variants with unknown function in exons, introns, regulatory regions
• Linkage candidate region
◦ Functional variants, or those with unknown function in candidate genes
◦ More general coverage of region using many markers
• Genome-wide
◦ Test for association with hundreds of thousands to millions of SNPs
spread across the entire genome.
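Testing this many SNPs means that a per-SNP threshold of 0.05 would produce a flood of false positives. A minimal sketch of the arithmetic, assuming a Bonferroni correction (a common choice that the slides do not name) and hypothetical SNP counts:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical array sizes, for illustration only.
for n_snps in (500_000, 1_000_000):
    print(n_snps, bonferroni_threshold(0.05, n_snps))
```

With one million tests and a family-wise error rate of 0.05, the per-SNP threshold works out to 5 x 10^-8.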
Background
• There are two main types of genetic association studies:
◦ population-based case–control studies
◦ family-based studies
• Studies can be hypothesis-driven (e.g., candidate gene, CG) or conducted without a
prior hypothesis (e.g., GWAS)
• Population-based (defined here as non-family-based) case–control studies have become
the most popular design for finding common polymorphisms thought to underlie
complex traits (the ‘common disease, common variant’ hypothesis).
Candidate Genes (CGs)
• Target genes with a previously established role in the trait in question
• Cost-effective when the focus is on a few genes
• Only a small number of markers is needed to capture most of the common variation
• Candidate genes can be selected from biological pathways that harbor other
previously associated risk loci.
Goals
• Use bioinformatics databases to:
◦ Determine basic properties of genes
◦ Identify common genetic variants in and around genes
◦ Characterize genetic variants in terms of frequency and functionality
Possible Stages in Candidate-Gene Study Design
1. Select a candidate system
◦ Knowledge of the biology of the phenotype
2. Select candidate genes in the system
◦ Expert opinion
◦ Literature search
◦ Pathway analysis
◦ (Positional)
3. Select genetic variants in candidate genes
◦ Literature search
◦ Bioinformatic databases
◦ SNP tagging
UCSC Genome Browser
(http://genome.ucsc.edu/)
For a gene of interest
• Determine basic properties:
◦ Location
◦ Size, number of exons
• Identify genetic variants
◦ SNPs, indels, STRs
GWAS – Genome-Wide Association Studies
• Studies of genetic variation across the (entire) human genome
• Designed to identify associations between genetic markers and observable traits,
or the presence/absence of a disease or condition
• The markers identified often have modest effects
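To make "association between a marker and a disease" concrete, here is a minimal sketch of one common single-SNP test: a chi-squared test (1 degree of freedom) on a 2x2 table of allele counts in cases and controls, together with the allelic odds ratio. The slides do not prescribe a specific test, and the function name and counts are hypothetical.

```python
import math


def allelic_test(case_a, case_b, ctrl_a, ctrl_b):
    """Chi-squared test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value, odds_ratio) for allele A in cases vs. controls.
    Assumes every cell count is positive.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b

    chi2 = 0.0
    for observed, row, col in (
        (case_a, row_case, col_a), (case_b, row_case, col_b),
        (ctrl_a, row_ctrl, col_a), (ctrl_b, row_ctrl, col_b),
    ):
        expected = row * col / n
        chi2 += (observed - expected) ** 2 / expected

    # Survival function of a chi-squared variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    return chi2, p_value, odds_ratio


# Hypothetical counts: 1,000 cases and 1,000 controls (2,000 alleles each).
chi2, p, odds = allelic_test(case_a=600, case_b=1400, ctrl_a=500, ctrl_b=1500)
print(f"chi2={chi2:.2f}  p={p:.2e}  OR={odds:.2f}")
```

An odds ratio close to 1, as in this example, is what "modest effect" looks like in practice, which is why large samples are needed to detect it.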
Complex Traits - Multifactorial Inheritance
• Examples
◦ Some cancers
◦ Schizophrenia
◦ Type 1 diabetes
◦ Type 2 diabetes
◦ Hypertension
◦ Alzheimer disease
◦ Rheumatoid arthritis
◦ Inflammatory bowel disease
◦ Asthma
[Figure: genetic variants and non-genetic factors both contribute to the trait]
Genetic Association Studies
• Short-term goal: identify genetic variants that explain differences in
phenotype among individuals in a study population
◦ Qualitative: disease status, presence/absence of congenital defect
◦ Quantitative: blood glucose levels, % body fat
• If an association is found, further study can follow to
◦ Understand mechanism of action and disease etiology in individuals
◦ Characterize relevance and/or impact in the more general population
• Long-term goal: inform the process of identifying and delivering better
prevention and treatment strategies
Steps
Specify case definition
Consider the literature for a consensus definition of the disease of interest. Following
standard diagnostic guidelines allows other groups to replicate initial findings more
easily, though it is not always the most powerful approach for initial gene detection.
If a consensus definition does not exist, consider all evidence and decide on a specific
definition that optimizes biological and clinical relevance.
Determine if the disease is heritable
• Decide from familial aggregation studies whether there is sufficient evidence that the
disease of interest is heritable.
• Concordance rate: the proportion of twin pairs in which both members have the trait
• If the heritability of a disease or subphenotype appears to be low (<20%) and the
disease is common, very large sample sizes (in excess of 5,000 cases and 5,000 controls)
will likely be required to find predisposing genetic variants using a population-based
approach.
Control selection
• Controls should be matched to cases on age, gender and ethnicity
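As a small illustration of the concordance rates mentioned above, the sketch below computes pairwise concordance for twin pairs in which at least one twin is affected. The pair counts are hypothetical, and pairwise concordance is only one of several definitions in use.

```python
def pairwise_concordance(both_affected: int, one_affected: int) -> float:
    """Fraction of twin pairs (with at least one affected twin) in which both are affected."""
    total = both_affected + one_affected
    if total == 0:
        raise ValueError("no pairs with an affected twin")
    return both_affected / total


# Hypothetical counts: monozygotic (MZ) and dizygotic (DZ) pairs.
mz = pairwise_concordance(both_affected=30, one_affected=70)
dz = pairwise_concordance(both_affected=10, one_affected=90)
print(f"MZ concordance={mz:.2f}, DZ concordance={dz:.2f}")
```

A clearly higher concordance in monozygotic than in dizygotic twins is usually read as evidence of a genetic contribution to the trait.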
Catalog of GWAS Studies
http://www.genome.gov/26525384
Manolio et al., J Clin Invest 2008
Genetic association studies
• Direct genotyping occurs when the actual causal polymorphism is typed.
• Indirect genotyping occurs when nearby genetic markers that are highly correlated
with the causal polymorphism are typed (Hirschhorn & Daly, Nat Rev Genet 2005).
• Whether in a candidate gene study or a GWAS, indirect genotyping takes advantage of
the correlation between SNPs, called linkage disequilibrium (LD).
Genome-wide association studies (GWAS)
Examples of Multistage Designs in
Genome-wide Association Studies
Pearson, T. A. et al. JAMA 2008;299:1335-1344
GWAS Microarray
Affymetrix, http://www.affymetrix.com
Assay ~0.7-5M SNPs (keeps increasing)
Genotype calls
[Figure: examples of good calls and bad calls]
Quality controls
Quality control refers to the procedures used to evaluate the genotyping performance
of the samples and the genotyping array.
Because input DNA can degrade, plating errors can occur and genotyping chips can fail to
hybridize, it is important to review the performance of the samples before definitive
downstream analysis of the genotypes.
The process of calling genotypes is not error-free. It is thus vital to identify and
exclude SNPs with potentially high rates of missingness or erroneous genotypes.
Sample quality control
The extent of missing genotypes and the heterozygosity of a sample are useful indicators
of poorly genotyped samples.
Samples with anomalously high rates for either of these two measures are often excluded
from the outset.
High rates of missingness generally imply hybridization problems, which may be caused
by faulty arrays or poor-quality DNA.
Excess heterozygosity can indicate sample contamination.
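The two per-sample measures can be sketched as follows. This is an illustrative sketch, not a standard pipeline: genotypes are assumed to be coded 0/1/2 (copies of one allele) with None for a missing call, and the thresholds (3% missingness, mean plus 3 standard deviations for heterozygosity) are assumptions chosen only for the example.

```python
from statistics import mean, stdev


def sample_qc(genotypes, max_missing=0.03, n_sd=3.0):
    """Flag samples with high missingness or anomalously high heterozygosity.

    genotypes: dict mapping sample ID -> list of calls (0, 1, 2, or None if missing).
    Needs at least two samples with called genotypes to estimate the spread.
    Returns (stats, flagged), where stats[sample] = (missing_rate, het_rate).
    """
    stats = {}
    for sample_id, calls in genotypes.items():
        called = [g for g in calls if g is not None]
        missing_rate = 1 - len(called) / len(calls)
        # Heterozygosity: share of called genotypes that are heterozygous (coded 1).
        het_rate = sum(g == 1 for g in called) / len(called) if called else None
        stats[sample_id] = (missing_rate, het_rate)

    het_rates = [h for _, h in stats.values() if h is not None]
    cutoff = mean(het_rates) + n_sd * stdev(het_rates)

    flagged = {
        sample_id
        for sample_id, (missing_rate, het_rate) in stats.items()
        if missing_rate > max_missing
        or (het_rate is not None and het_rate > cutoff)
    }
    return stats, flagged
```

With only a handful of samples a single outlier cannot exceed three standard deviations, so small pilot datasets need a lower `n_sd` or a visual check of the missingness-versus-heterozygosity plot.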
Sample quality control
Large-scale studies can unintentionally include related samples or accidental
sample duplicates.
Such cryptic relatedness is easy to infer by measuring allele sharing between samples.
Typically, the sample in each related pair with the fewest missing genotypes is retained
in the study.
In family-based studies, the authenticity of the pedigree relationships can be checked by
calculating the extent of Mendelian inconsistency (e.g., with the PedCheck software).
Exclude families or samples whose genotypes are inconsistent.
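The idea of a Mendelian inconsistency check can be sketched for a father-mother-child trio. This is an illustrative sketch only and does not reproduce the algorithm of any particular pedigree-checking program; genotypes are assumed to be unordered pairs of allele labels at autosomal SNPs, with None for a missing call.

```python
def mendelian_consistent(father, mother, child):
    """True if the child could have received one allele from each parent."""
    c1, c2 = child
    return (c1 in father and c2 in mother) or (c2 in father and c1 in mother)


def inconsistency_rate(trio_genotypes):
    """Fraction of fully genotyped SNPs at which the trio is inconsistent.

    trio_genotypes: list of (father, mother, child) tuples, one per SNP;
    each genotype is a 2-tuple of alleles, or None if missing.
    """
    checked = errors = 0
    for father, mother, child in trio_genotypes:
        if father is None or mother is None or child is None:
            continue  # skip SNPs with any missing call
        checked += 1
        if not mendelian_consistent(father, mother, child):
            errors += 1
    return errors / checked if checked else 0.0


# Hypothetical trio at four SNPs; the third SNP is inconsistent.
trio = [
    (("A", "A"), ("A", "G"), ("A", "G")),
    (("A", "G"), ("A", "G"), ("G", "G")),
    (("A", "A"), ("A", "A"), ("A", "G")),
    (("C", "T"), None, ("C", "C")),
]
print(inconsistency_rate(trio))
```

A rate far above the expected genotyping error rate suggests a misspecified relationship or a sample swap rather than isolated miscalls.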
SNP quality control
• Remove SNPs with a low call rate (e.g., <97%)
◦ Call rate: the proportion of genotypes at that SNP actually called by the software
• Remove SNPs and individuals with too much missing data
• Test for deviation from Hardy-Weinberg equilibrium (e.g., with a chi-squared test)
and exclude SNPs that deviate strongly
• Remove SNPs with a very low minor allele frequency
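The per-SNP filters above can be sketched in a few lines. The 97% call-rate cutoff comes from the slide; the minor allele frequency cutoff (1%) and the Hardy-Weinberg p-value cutoff (1e-6) are assumptions for illustration, and genotypes are assumed to be coded as 0/1/2 copies of the alternate allele with None for a missing call.

```python
import math


def snp_qc(calls, min_call_rate=0.97, min_maf=0.01, min_hwe_p=1e-6):
    """Call rate, minor allele frequency and HWE chi-squared test for one SNP.

    calls: non-empty list of genotypes across samples (0, 1, 2, or None).
    """
    called = [g for g in calls if g is not None]
    n = len(called)
    call_rate = n / len(calls)
    if n == 0:
        return {"call_rate": 0.0, "maf": None, "hwe_p": None, "passed": False}

    n_hom_ref, n_het, n_hom_alt = (called.count(k) for k in (0, 1, 2))
    q = (n_het + 2 * n_hom_alt) / (2 * n)  # alternate allele frequency
    p = 1 - q
    maf = min(p, q)

    # Hardy-Weinberg expected genotype counts: n*p^2, 2*n*p*q, n*q^2.
    observed = (n_hom_ref, n_het, n_hom_alt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    hwe_p = math.erfc(math.sqrt(chi2 / 2))  # chi-squared survival function, 1 df

    passed = call_rate >= min_call_rate and maf >= min_maf and hwe_p >= min_hwe_p
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "passed": passed}
```

In case-control studies the Hardy-Weinberg test is often applied to controls only, since a true disease association can itself distort genotype proportions in cases.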
Population structure
• Population structure refers to the genetic differences that exist between
individuals from different groups, populations or geographical regions.
• Several established statistical strategies detect population structure; one commonly
used in genome-wide studies is genomic control (GC), which estimates the degree of
inflation of the test statistic.
• [Figure: a representation of how differences in genotypic (or allelic) frequencies across
different populations can introduce false signals of association]
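The inflation estimate used by genomic control can be sketched as below, assuming each SNP has a 1-degree-of-freedom chi-squared association statistic. The inflation factor (lambda) is the median observed statistic divided by the median of the chi-squared distribution with 1 df (about 0.455); the function names are hypothetical.

```python
from statistics import median

# Median of a chi-squared distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Genomic control inflation factor (lambda) from 1-df chi-squared statistics."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def gc_correct(chi2_stats):
    """Divide every statistic by lambda; lambda below 1 is treated as 1 (no correction)."""
    lam = max(genomic_inflation(chi2_stats), 1.0)
    return [x / lam for x in chi2_stats]
```

A lambda close to 1 suggests little inflation, while values well above 1 point to population structure, cryptic relatedness or other systematic bias.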
Selection of Markers for Association Studies
The human genome consists of over 3 billion base pairs
and has roughly 20,000 to 25,000 protein-coding genes.
Individuals are identical for ~99.5% of their sequence;
the small remaining part varies to differing extents.
• Could this variation help explain differences in genetic susceptibility to disease?
• Association studies compare variation between diseased (case) and
healthy (control) individuals from the same population.
A variant whose frequency at a specific locus is >1% is said to be a polymorphism.
The most common class of polymorphisms is SNPs,
which comprise ~90% of all human variation.
Other types include insertions/deletions (indels) and larger blocks of
sequence variation (mini-/microsatellites).
LD: non-random association of alleles at two or more loci, which may or may not
be on the same chromosome.
dbSNP lists about 10 million SNPs; which of them are in LD?
The HapMap project identifies tag SNPs, which can be used as proxies for other SNPs
in LD with them, reducing the number of markers to be examined.
SNP-SNP association, or linkage disequilibrium,
is fundamental to our ability to sample the whole
genome with relatively few SNPs.
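To show how LD between two SNPs is quantified, the sketch below computes D, r² and D' from haplotype counts. It assumes two biallelic SNPs (alleles A/a and B/b) and phased haplotype counts; the counts in the example are hypothetical.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2, D') for two biallelic SNPs from phased haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")

    # D: observed AB haplotype frequency minus its expectation under independence.
    d = p_AB - p_A * p_B
    r2 = d * d / denom

    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    return d, r2, d_prime


# Hypothetical haplotype counts among 100 chromosomes.
d, r2, d_prime = ld_stats(n_AB=40, n_Ab=10, n_aB=10, n_ab=40)
print(f"D={d:.2f}  r2={r2:.2f}  D'={d_prime:.2f}")
```

An r² near 1 means one SNP is an almost perfect proxy for the other, which is the property that tag SNP selection relies on.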
Genome-wide association study of 14,000 cases of seven common
diseases and 3,000 shared controls (Nature, Vol 447, 7 June 2007),
using the Affymetrix GeneChip 500K Mapping Array Set
TaqMan Assay Process
TaqMan assay system and mechanism of action
• One of the best methods for SNP genotyping
• Robust, reliable and very easy to prepare
• Can be done in a 384-well plate
• Very low genotyping error rate
• The reaction can be run on a regular thermocycler, but a real-time PCR detection
system is necessary to scan the plates
Output of TaqMan Assay
Thanks…..

Gene hunting strategies
