ChIP-Sequencing
Course: Next Generation Sequencing Analysis
Introduction
6/3/2023 5:43 PM 2
Introduction Sample Preparation Data Analysis
Considerations References
Figure 1: DNA Organization.
Adapted from Henrik's Lab, "ChIP seq - Chromatin Immunoprecipitation sequencing", YouTube, 12 May 2021.
ChIP-Seq is a method used to identify genomic regions bound by specific proteins or
carrying specific protein (e.g. histone) modifications, providing insights into gene
regulation and chromatin structure.
Sample Preparation
Chemical treatment (formaldehyde)
Proteins of interest: TF, modified histone, RNA pol
Figure 2: Sample preparation for ChIP-Seq
Adapted from Henrik's Lab, "ChIP seq - Chromatin Immunoprecipitation sequencing", YouTube, 12 May 2021.
Sample Preparation
Cell disruption and DNA fragmentation (100-300 bp)
Figure 2 (contd.): Sample preparation for ChIP-Seq
Adapted from Henrik's Lab, "ChIP seq - Chromatin Immunoprecipitation sequencing", YouTube, 12 May 2021.
Target Enrichment
Immunoprecipitation
Figure 2 (contd.): Sample preparation for ChIP-Seq
Adapted from Henrik's Lab, "ChIP seq - Chromatin Immunoprecipitation sequencing", YouTube, 12 May 2021.
Sequencing
Cross-link reversal and library preparation
Sequencing of target DNA fragments (NovaSeq 6000 System)
Figure 2 (contd.): Sample preparation for ChIP-Seq
Adapted from Henrik's Lab, "ChIP seq - Chromatin Immunoprecipitation sequencing", YouTube, 12 May 2021.
Experimental Design Considerations
1. Antibody selection
2. Chromatin fragmentation
3. Cross-linking conditions
4. Sufficient amount of starting material - 2 x 10^6 cells per
immunoprecipitation.
5. Control libraries
6. Reducing artifacts - normalization
7. Biological replicates ≥ 3.
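As a rough planning aid, the starting-material guideline in item 4 can be turned into simple arithmetic. The sketch below assumes the 2 x 10^6 cells per immunoprecipitation figure and the three-replicate minimum from the list above; the function name `cells_required` and the idea of counting one immunoprecipitation per target per replicate are illustrative assumptions, not part of the original deck.

```python
CELLS_PER_IP = 2_000_000  # guideline from the list above: 2 x 10^6 cells per IP


def cells_required(n_targets: int, n_replicates: int = 3,
                   cells_per_ip: int = CELLS_PER_IP) -> int:
    """Total cells needed, assuming one IP per target per biological replicate."""
    if n_targets < 1 or n_replicates < 1:
        raise ValueError("n_targets and n_replicates must be positive")
    return n_targets * n_replicates * cells_per_ip


# Two antibodies (targets), three biological replicates each.
print(cells_required(n_targets=2, n_replicates=3))  # 12000000
```

Material for control libraries is not included in this count and would need to be budgeted separately.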
Sequencing Considerations
Table 1: Sequencing considerations for ChIP-Seq
Parameter          Value
Read length        50-150 bp
Sequencing mode    SE, PE
Sequencing depth   20-40 M total reads (for TF); ≥ 40 M for histone marks
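The depth guidelines on this slide can be expressed as a small check. This is a minimal sketch that takes the lower bounds from the table (20 M total reads for transcription factors, 40 M for histone marks); the dictionary keys "TF" and "histone" and the function name `depth_ok` are hypothetical labels chosen for illustration.

```python
# Lower bounds taken from the sequencing-depth row; the keys are illustrative labels.
MIN_READS = {"TF": 20_000_000, "histone": 40_000_000}


def depth_ok(total_reads: int, target: str) -> bool:
    """Return True if total_reads meets the minimum depth for the target type."""
    if target not in MIN_READS:
        raise ValueError(f"unknown target type: {target!r}")
    return total_reads >= MIN_READS[target]


print(depth_ok(25_000_000, "TF"))       # True
print(depth_ok(25_000_000, "histone"))  # False
```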
Sequencing Considerations
Figure 3: No. of peaks called vs. sequencing depth
Adapted from Landt et al., "ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia,"
Genome Research, 22(9), 1813-1831, 2012.
Data Analysis
Figure 4: ChIP-Seq data analysis pipeline
Steps shown: quality control; read mapping; peak calling; data visualization; functional analysis; motif analysis; differential analysis; integration with other data types; reproducibility.
1. Preprocessing
• Quality Control (QC)
• Read trimming and filtering
• PCR duplicate removal
Important Quality Metrics
a) Per base sequence quality
b) GC content
c) Overrepresented sequences
Tools: FastQC, MultiQC
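To make the three quality metrics concrete, here is a minimal sketch that computes them from FASTQ text. It assumes 4-line FASTQ records with Phred+33 quality encoding; the function names and the `min_fraction` threshold are illustrative choices, and this is not how FastQC itself is implemented.

```python
from collections import Counter
from typing import List, Tuple

Record = Tuple[str, str]  # (sequence, quality string)


def parse_fastq(text: str) -> List[Record]:
    """Parse 4-line FASTQ records into (sequence, quality) pairs."""
    rows = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(rows) % 4 != 0:
        raise ValueError("FASTQ input must contain 4 lines per record")
    return [(rows[i + 1], rows[i + 3]) for i in range(0, len(rows), 4)]


def per_base_mean_quality(records: List[Record], offset: int = 33) -> List[float]:
    """Mean Phred score at each read position (Phred+33 assumed)."""
    totals: List[int] = []
    counts: List[int] = []
    for _, qual in records:
        for i, ch in enumerate(qual):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


def gc_percent(records: List[Record]) -> float:
    """Percentage of G and C bases over all reads."""
    bases = "".join(seq.upper() for seq, _ in records)
    if not bases:
        return 0.0
    return 100.0 * (bases.count("G") + bases.count("C")) / len(bases)


def overrepresented(records: List[Record], min_fraction: float = 0.5) -> List[str]:
    """Sequences making up at least min_fraction of all reads (illustrative cutoff)."""
    counts = Counter(seq for seq, _ in records)
    n = len(records)
    return [seq for seq, c in counts.items() if c / n >= min_fraction]


example = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\n5555\n@r3\nAATT\n+\nIIII\n"
reads = parse_fastq(example)
print(per_base_mean_quality(reads))
print(gc_percent(reads))
print(overrepresented(reads))  # ['ACGT']
```

The cutoff of 0.5 is deliberately high so that the three-read example flags something; a real dataset would need a far lower fraction.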
2. Alignment
Preprocessed reads are mapped to the reference genome using aligners such as BWA or
Bowtie; SAMtools is then used to sort, index, and filter the resulting alignments.
Input = FASTQ
Output = SAM, BAM
Tools: BWA, Bowtie, STAR, NovoAlign
Figure 5: Alignment results from BWA
Adapted from Zymo Research, https://github.com/Zymo-Research/service-pipeline-documentation, accessed May 6, 2023.
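A common first check on alignment output is the fraction of reads that mapped. The sketch below reads SAM text lines and uses the FLAG column (bit 0x4 for unmapped, 0x100 for secondary, 0x800 for supplementary, as defined in the SAM specification) to count primary alignments; the function name `mapping_rate` and the example records are made up for illustration.

```python
from typing import Iterable, Tuple

UNMAPPED = 0x4          # read is unmapped
SECONDARY = 0x100       # secondary alignment
SUPPLEMENTARY = 0x800   # supplementary alignment


def mapping_rate(sam_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (mapped, total) over primary records, skipping header lines."""
    mapped = total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or SAM header
        flag = int(line.split("\t")[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        total += 1
        if not flag & UNMAPPED:
            mapped += 1
    return mapped, total


sam = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r1\t256\tchr1\t500\t0\t4M\t*\t0\t0\tACGT\tIIII",
]
print(mapping_rate(sam))  # (2, 3)
```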
3. Peak Calling
Identification of enriched loci in the genome.
Output = BED format
Tools: MACS, SICER, BayesPeak
Figure 6: Peak calling summary statistics using MACS2
Adapted from Zymo Research, https://github.com/Zymo-Research/service-pipeline-documentation, accessed May 6, 2023.
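The idea of "enriched loci" can be illustrated with a toy example. The sketch below compares per-window read counts in a ChIP sample against a depth-scaled control and reports windows whose fold enrichment passes a threshold, as BED-style coordinates (0-based start, exclusive end). It is a deliberately simplified stand-in, not the statistical model used by MACS, SICER, or BayesPeak; the window size, `min_fold`, the pseudocount, and all names are assumptions.

```python
from typing import Dict, List, Tuple

Window = Tuple[str, int]              # (chromosome, window index)
Hit = Tuple[str, int, int, float]     # (chromosome, start, end, fold enrichment)


def enriched_windows(chip: Dict[Window, int], control: Dict[Window, int],
                     window: int = 200, min_fold: float = 2.0,
                     pseudocount: float = 1.0) -> List[Hit]:
    """Report windows where ChIP reads exceed the depth-scaled control."""
    chip_total = sum(chip.values())
    ctrl_total = sum(control.values())
    # Scale the control so both libraries have the same total read count.
    scale = chip_total / ctrl_total if ctrl_total else 1.0
    hits: List[Hit] = []
    for (chrom, idx), count in sorted(chip.items()):
        expected = control.get((chrom, idx), 0) * scale
        fold = (count + pseudocount) / (expected + pseudocount)
        if fold >= min_fold:
            start = idx * window  # BED coordinates: 0-based, half-open
            hits.append((chrom, start, start + window, fold))
    return hits


def to_bed(hits: List[Hit]) -> str:
    """Format hits as minimal three-column BED lines."""
    return "\n".join(f"{c}\t{s}\t{e}" for c, s, e, _ in hits)


chip = {("chr1", 0): 10, ("chr1", 1): 100, ("chr1", 2): 10}
control = {("chr1", 0): 40, ("chr1", 1): 40, ("chr1", 2): 40}
print(to_bed(enriched_windows(chip, control)))  # chr1	200	400
```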
4. Visualization
Figure 7: Peak visualization by DROMPAplus
Adapted from Nakato et al., "Methods for ChIP-seq analysis: A practical workflow and advanced applications," Journal of
Biochemistry, 159(4), 335-345, 2016, doi: 10.1093/jb/mvv124.
4. Visualization
Figure 8: Peak visualization by DROMPAplus
Adapted from Nakato et al., "Methods for ChIP-seq analysis: A practical workflow and advanced applications," Journal of
Biochemistry, 159(4), 335-345, 2016, doi: 10.1093/jb/mvv124.
4. Visualization
Peaks can be viewed directly in a genome browser, e.g. UCSC Genome Browser.
Tools: ChIPseeker, IGV
Figure 9: Peak visualization by UCSC Genome Browser
Adapted from Zymo Research, https://github.com/Zymo-Research/service-pipeline-documentation, accessed May 6, 2023.
5. Peak Annotation
Tools: ReMap, MGA, RSAT, rGADEM
Figure 10: Peak annotation by HOMER
Adapted from Zymo Research, https://github.com/Zymo-Research/service-pipeline-documentation, accessed May 6, 2023.
5. Peak Annotation
Figure 11: Peak annotation by HOMER
Adapted from Zymo Research, https://github.com/Zymo-Research/service-pipeline-documentation, accessed May 6, 2023.
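One simple form of peak annotation is assigning each peak to the gene with the nearest transcription start site (TSS). The sketch below does this with a sorted per-chromosome index and binary search. It ignores strand and gene structure, the gene names and coordinates are invented, and it is not a description of how HOMER or the other listed tools annotate peaks.

```python
from bisect import bisect_left
from typing import Dict, List, Optional, Tuple

Gene = Tuple[str, str, int]  # (gene name, chromosome, TSS position)


def build_index(genes: List[Gene]) -> Dict[str, List[Tuple[int, str]]]:
    """Group TSS positions by chromosome and sort them for binary search."""
    index: Dict[str, List[Tuple[int, str]]] = {}
    for name, chrom, tss in genes:
        index.setdefault(chrom, []).append((tss, name))
    for entries in index.values():
        entries.sort()
    return index


def nearest_gene(index: Dict[str, List[Tuple[int, str]]], chrom: str,
                 pos: int) -> Optional[Tuple[str, int]]:
    """Return (gene name, peak position minus TSS), or None if no gene on chrom."""
    entries = index.get(chrom)
    if not entries:
        return None
    i = bisect_left(entries, (pos, ""))
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = entries[max(i - 1, 0): i + 1]
    tss, name = min(candidates, key=lambda entry: abs(entry[0] - pos))
    return name, pos - tss


genes = [("GENE_A", "chr1", 1000), ("GENE_B", "chr1", 5000), ("GENE_C", "chr2", 300)]
index = build_index(genes)
print(nearest_gene(index, "chr1", 1200))  # ('GENE_A', 200)
print(nearest_gene(index, "chr1", 4000))  # ('GENE_B', -1000)
```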
References
1. Nakato, R., Shirahige, K., & Takahata, S. (2021). Methods for ChIP-seq
analysis: A practical workflow and advanced applications. Genes to Cells,
26(6), 371-382. doi: 10.1111/gtc.12863.
2. Landt, S.G., Marinov, G.K., Kundaje, A. et al. (2012). ChIP-seq guidelines
and practices of the ENCODE and modENCODE consortia. Genome Res.
22(9), 1813-1831. doi: 10.1101/gr.136184.111.
3. Zymo Research. (n.d.). Service Pipeline Documentation. GitHub.
https://github.com/Zymo-Research/service-pipeline-documentation


Editor's notes

  1. Cross-linking between proteins and DNA in ChIP-seq samples is typically reversed by using heat and/or a chemical agent to break the cross-links and release the DNA from the protein-DNA complexes.
     Library preparation converts the fragmented DNA obtained from the ChIP sample into a sequencing library that can be used for high-throughput sequencing. It typically includes the following key steps:
     • End repair: The fragmented DNA ends are repaired to generate blunt ends, suitable for ligation to sequencing adapters.
     • Adapter ligation: Sequencing adapters are ligated to the repaired DNA fragments. These adapters contain sequences that are required for the subsequent steps of the sequencing process.
     • Size selection: The adapter-ligated DNA fragments are size-selected to remove any unligated adapters or fragments that are too small or too large for sequencing.
     • PCR amplification: The size-selected DNA fragments are amplified by PCR (polymerase chain reaction) to generate sufficient material for sequencing. PCR primers specific to the adapter sequences are used to selectively amplify only the adapter-ligated fragments.
     • Quality control: The resulting library is evaluated for quality and quantity using various methods, such as gel electrophoresis, qPCR (quantitative PCR), or fluorometry.
     A ChIP-seq library can be sequenced either single-end or paired-end, depending on the sequencing platform and experimental design. In single-end sequencing, only one end of each DNA fragment is sequenced, while in paired-end sequencing, both ends are sequenced. Paired-end sequencing generates more information per fragment and allows for more accurate mapping of reads to the reference genome.
     Paired-end libraries are often chosen because they allow more accurate identification of the precise binding location of the protein of interest. However, single-end sequencing may be used where cost or experimental constraints prohibit paired-end sequencing.
  2. 1. Antibody selection: An ideal antibody for ChIP-seq should have high specificity, sensitivity, and affinity for the protein of interest. It should recognize the native conformation of the protein and not cross-react with other proteins in the sample. It should also capture the protein-DNA complexes efficiently and reproducibly.
     2. Chromatin fragmentation: If the chromatin is over-fragmented, the DNA fragments become too short, which decreases the specificity and accuracy of the assay. If the chromatin is under-fragmented, the fragments are too large, which lowers the resolution of the assay and makes binding sites harder to identify. The target fragment size is roughly 100-300 bp.
     3. Cross-linking conditions: Crosslinking is a critical step because it preserves the protein-DNA interactions within the chromatin. The formaldehyde concentration, the duration of crosslinking, and the temperature can all significantly affect the quality and specificity of the ChIP-seq data.
     4. Starting material: Ensure that you have a sufficient amount of starting material, because the ChIP enriches only a small proportion of the chromatin. For a standard protocol, you want approximately 2 x 10^6 cells per immunoprecipitation. If it is difficult to obtain that many cells from your experiment, consider using low-input methods. Higher amounts of starting material yield more consistent and reproducible protein-DNA enrichment.
     5. Control libraries: Only a fraction of the DNA in a ChIP sample corresponds to actual signal, and the rest is background noise. A ChIP-seq peak should therefore be compared with the same region of the genome in a matched control sample. Control libraries provide a baseline for comparison with the experimental libraries, so that regions specifically enriched for the protein of interest can be separated from regions that are non-specifically bound or enriched due to experimental noise. Common controls are input DNA (crosslinked, fragmented chromatin that has not been immunoprecipitated) and a "mock IP" or "IgG" control, in which the entire ChIP-seq protocol is performed with an antibody that does not recognize any of the proteins of interest in the sample.
     6. Reducing artifacts: Several artifacts tend to generate pileups of reads that could be interpreted as false positive peaks. These include:
        (a) Open chromatin regions, which are fragmented more easily than closed regions because the DNA is more accessible.
        (b) The presence of repetitive sequences.
        (c) An uneven distribution of sequence reads across the genome due to DNA composition.
        (d) "Hyper-ChIPable" regions: loci that are commonly enriched in ChIP datasets. These regions are more susceptible to immunoprecipitation and therefore show increased ChIP signal for unrelated DNA-binding and chromatin-binding proteins.
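The comparison of a ChIP region against the same region in a matched control (item 5 above) can be sketched as a depth-normalised ratio. This is a minimal illustration, not the method of any particular peak caller: the function name `fold_enrichment`, its parameters, and the pseudocount of 1 are assumptions introduced here.

```python
def fold_enrichment(chip_reads, control_reads, chip_total, control_total,
                    pseudocount=1.0):
    """Ratio of ChIP to control read counts in one region.

    The control count is first rescaled to the ChIP library size so that
    libraries sequenced to different depths are comparable. The pseudocount
    (an assumption of this sketch) avoids division by zero in empty regions.
    """
    if chip_total <= 0 or control_total <= 0:
        raise ValueError("library sizes must be positive")
    scale = chip_total / control_total      # depth normalisation factor
    expected = control_reads * scale        # control count at ChIP depth
    return (chip_reads + pseudocount) / (expected + pseudocount)


# A region with 99 ChIP reads and 9 control reads, equal library sizes.
print(fold_enrichment(99, 9, 1_000_000, 1_000_000))
```

A ratio close to 1 suggests the region is no more enriched in the ChIP than in the control, which is the expected behaviour for artifact-prone regions such as those listed in item 6.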
  3. Single-end reads are sufficient in most cases. Paired-end sequencing is useful (and necessary) for allele-specific chromatin events and for investigations of transposable elements. Sequence the input controls to equal or greater depth than your ChIP samples. Aim for a minimum total depth of 40M reads; more is better for detecting some histone marks.
  4. During the PCR amplification step of library preparation, some DNA fragments may be over-amplified, producing multiple identical copies of the same fragment. These PCR duplicates can bias the estimate of the true fragment frequency and affect the accuracy of peak calling and differential binding analysis. Overrepresented sequences are sequences found in high abundance in a ChIP-seq dataset. They can arise from a variety of sources, such as sequencing adapters, PCR duplicates, or genomic regions with high GC content. They are an important quality metric in ChIP-seq preprocessing because they can indicate problems with the sequencing library, such as poor sequencing quality, contamination, or bias. High levels of overrepresented sequences reduce the effective sequencing depth and can lead to false positive peaks and to decreased sensitivity and specificity of peak calling algorithms. Identifying and removing them is therefore an important preprocessing step. This can be done with bioinformatics tools that detect and filter out sequences exceeding a threshold of frequency or of similarity to known contaminants or artifacts.
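The idea behind duplicate detection can be sketched as follows. This is a simplified illustration that assumes single-end reads already reduced to a `(chromosome, 5' position, strand)` tuple, and treats reads with identical tuples as duplicates. The function names are hypothetical, and dedicated tools apply more careful rules (for example, using both mates of a pair).

```python
from collections import Counter


def duplication_rate(reads):
    """Fraction of reads that are extra copies of an already-seen position.

    reads: iterable of (chrom, five_prime_pos, strand) tuples.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    unique = len(counts)
    return (total - unique) / total


def remove_duplicates(reads):
    """Keep the first read seen at each (chrom, position, strand)."""
    seen = set()
    kept = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            kept.append(read)
    return kept


reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),   # same position and strand: counted as a duplicate
    ("chr1", 100, "-"),   # opposite strand: kept as a distinct fragment
    ("chr2", 5, "+"),
]
print(duplication_rate(reads))
print(remove_duplicates(reads))
```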
  5. Alignment is the process of mapping the sequencing reads generated from the ChIP and control libraries to a reference genome. The goal is to assign each read to its original genomic location with high accuracy and specificity, so that the genomic regions with significant binding enrichment can be identified and analyzed. The alignment step typically involves several sub-steps: read quality control, adapter trimming, sequence alignment, and read sorting and indexing. Different software tools and algorithms can be used for these sub-steps, depending on the type and quality of the sequencing data, the genome of interest, and the specific research question.
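The sub-steps above can be laid out as an ordered list of commands. The notes do not name specific tools, so the choice of FastQC, cutadapt, Bowtie 2, and samtools here is an assumption, as are the function name and the output file naming. The sketch only builds the command lines for single-end reads and does not run anything.

```python
def alignment_commands(fastq, index_prefix, adapter, out_prefix):
    """Return the commands for QC, trimming, alignment, sorting, indexing.

    All file names derived from out_prefix are hypothetical; the tools are
    one common choice among several and are not named in the source notes.
    """
    trimmed = f"{out_prefix}.trimmed.fastq"
    sam = f"{out_prefix}.sam"
    bam = f"{out_prefix}.sorted.bam"
    return [
        ["fastqc", fastq],                                        # read quality control
        ["cutadapt", "-a", adapter, "-o", trimmed, fastq],        # adapter trimming
        ["bowtie2", "-x", index_prefix, "-U", trimmed, "-S", sam],  # alignment
        ["samtools", "sort", "-o", bam, sam],                     # coordinate sorting
        ["samtools", "index", bam],                               # BAM indexing
    ]


for cmd in alignment_commands("chip.fastq", "genome_index", "AGATCGGAAGAGC", "chip"):
    print(" ".join(cmd))
```

Each command list could be passed to `subprocess.run(cmd, check=True)` once the tools are installed; the same steps are run separately for the ChIP and the control library.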
  6. Peak calling is a key step in the analysis of ChIP-seq data. It aims to identify genomic regions with significant enrichment of ChIP-seq signal over the control or background signal. These enriched regions, called peaks, represent putative binding sites of the protein or factor of interest on the chromatin. Common peak calling algorithms include MACS (Model-based Analysis of ChIP-Seq) and SICER (Spatial Clustering for Identification of ChIP-Enriched Regions). MAnorm is a related tool that quantitatively compares peaks between ChIP-seq datasets rather than calling them. Peak calling is usually followed by downstream analysis steps such as peak annotation, motif analysis, and gene ontology enrichment analysis. BED (Browser Extensible Data) is a commonly used file format for representing genomic intervals, such as the coordinates of ChIP-seq peaks, gene exons, or genomic variants. The format is tab-delimited, and each line represents a single genomic interval. The three mandatory columns are: Chromosome, the name of the chromosome or contig where the interval is located; Start, the starting position of the interval, using 0-based coordinates; and End, the ending position of the interval, which is exclusive (equivalently, the 1-based position of the last base in the interval). Optional additional columns give the name of the interval, a score, the strand orientation, and further metadata such as p-values or functional annotations. FRiP (Fraction of Reads in Peaks) is a commonly used quality metric for ChIP-seq data analysis. It measures the fraction of aligned reads that fall within peaks, which are genomic regions with a high density of ChIP-seq signal.
The FRiP score is calculated by dividing the number of reads that fall within called peaks by the total number of aligned reads. A high FRiP score indicates that a large proportion of aligned reads are in peaks, suggesting high enrichment of the target protein or histone modification. FRiP scores are often used to compare the quality of different ChIP-seq experiments, and a FRiP score of at least 20% is sometimes used as a cutoff for a high-quality experiment. However, the appropriate cutoff depends on the target, the biological question, and the type of sample; a factor with few binding sites can give a much lower FRiP in an otherwise good experiment.
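The BED coordinate convention and the FRiP calculation can be illustrated together. This is a minimal sketch under stated assumptions: each aligned read is represented by a single 0-based position, a read counts as "in a peak" if that position lies inside a peak interval, and the names `parse_bed` and `frip` are introduced here for illustration only.

```python
def parse_bed(lines):
    """Parse the three mandatory BED columns into (chrom, start, end).

    start is 0-based and end is exclusive, so the interval length is
    simply end - start.
    """
    peaks = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank, comment, and header lines
        fields = line.split("\t")
        peaks.append((fields[0], int(fields[1]), int(fields[2])))
    return peaks


def frip(read_positions, peaks):
    """Fraction of reads whose position falls inside any peak.

    read_positions: list of (chrom, pos) with 0-based pos.
    """
    if not read_positions:
        return 0.0
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    in_peaks = sum(
        1
        for chrom, pos in read_positions
        if any(start <= pos < end for start, end in by_chrom.get(chrom, []))
    )
    return in_peaks / len(read_positions)


bed_lines = ["chr1\t100\t200\tpeak1", "chr2\t0\t50"]
peaks = parse_bed(bed_lines)
reads = [("chr1", 100), ("chr1", 199), ("chr1", 200), ("chr2", 10), ("chr3", 5)]
print(frip(reads, peaks))  # position 200 is outside chr1:100-200 (end is exclusive)
```

The linear scan over peaks keeps the example short; for real data an interval index or a dedicated tool would be the practical choice.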
  7. In ChIP-seq data analysis, two types of peaks are commonly observed: sharp peaks and broad peaks. Sharp peaks are typically narrow and well-defined, indicating the precise location of a protein-DNA interaction, such as a transcription factor binding site or a histone modification. Sharp peaks are characterized by a high peak summit and a steep drop-off on either side of the peak summit. Broad peaks, on the other hand, are wider and more diffuse than sharp peaks, indicating a more extended region of protein-DNA interaction, such as a histone modification that spans a large genomic region. Broad peaks are characterized by a lower peak summit and a more gradual drop-off on either side of the peak summit. The distinction between sharp and broad peaks is important because different peak calling algorithms may be better suited to identify one type of peak versus the other, and different downstream analyses may be required depending on the type of peak. For example, motif discovery algorithms may be more effective at identifying transcription factor binding motifs within sharp peaks, while functional annotation tools may be better suited to identifying biological pathways associated with broad peaks.
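How the sharp versus broad distinction might change a peak calling run can be sketched as follows. The notes name MACS but give no version or options, so this assumes the MACS2 command-line interface (`macs2 callpeak` with its `--broad` option). The function name, file names, and the default genome size `hs` are illustrative assumptions. The sketch only builds the command and does not run it.

```python
def macs2_command(chip_bam, control_bam, name, broad=False, genome_size="hs"):
    """Build a MACS2 peak calling command for a ChIP/control pair.

    broad=True adds --broad, intended for diffuse signals such as histone
    marks that span large regions; the default suits sharp peaks such as
    transcription factor binding sites.
    """
    cmd = [
        "macs2", "callpeak",
        "-t", chip_bam,       # treatment (ChIP) alignments
        "-c", control_bam,    # matched control alignments
        "-f", "BAM",
        "-g", genome_size,    # effective genome size shortcut
        "-n", name,           # prefix for output files
    ]
    if broad:
        cmd.append("--broad")
    return cmd


print(" ".join(macs2_command("tf.bam", "input.bam", "tf_sharp")))
print(" ".join(macs2_command("histone.bam", "input.bam", "histone_broad", broad=True)))
```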
  8. Peak annotation identifies the genomic region and feature that a peak overlaps, such as an exon, an intron, the promoter of a specific gene, or an intergenic region. It also identifies the nearest TSS (transcription start site) to the peak, including the distance and the gene. The peak score in the peak annotation file generated by HOMER is a score assigned to each peak based on the strength of the signal in that region. HOMER uses a statistical model to calculate the peak score, which takes into account the distribution of signal intensity across the genome and the size of the peak. The peak score is a useful metric for ranking peaks by their strength and for comparing the strength of peaks across different samples. In HOMER, peaks with higher scores are considered to have stronger signals and are more likely to be biologically meaningful.
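The nearest-TSS part of peak annotation can be sketched in a few lines. This is not HOMER's implementation: the function name, the use of the peak centre, and the sign convention (peak centre minus TSS position, ignoring gene strand) are assumptions made for illustration, and the gene names are placeholders.

```python
def nearest_tss(peak, tss_sites):
    """Return (gene, distance) for the TSS closest to the peak centre.

    peak: (chrom, start, end) with 0-based, end-exclusive coordinates.
    tss_sites: list of (chrom, position, gene).
    distance is peak centre minus TSS position; strand is ignored here.
    Returns None if the chromosome has no annotated TSS.
    """
    chrom, start, end = peak
    centre = (start + end) // 2
    candidates = [
        (abs(centre - pos), centre - pos, gene)
        for tss_chrom, pos, gene in tss_sites
        if tss_chrom == chrom
    ]
    if not candidates:
        return None
    _, distance, gene = min(candidates)  # smallest absolute distance wins
    return gene, distance


tss_sites = [
    ("chr1", 500, "GENE_A"),
    ("chr1", 1150, "GENE_B"),
    ("chr2", 1100, "GENE_C"),
]
print(nearest_tss(("chr1", 1000, 1200), tss_sites))
```

A strand-aware version would flip the sign of the distance for genes on the minus strand so that negative values always mean upstream of the TSS.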