This document describes an integrative regulatory genomics approach to prioritize target genes for systemic lupus erythematosus (SLE) using multiple datasets. RNA sequencing data from SLE patients was integrated with SLE genome-wide association study data, blood eQTL data, enhancer-promoter correlation data, and promoter capture Hi-C interaction data. Several thousand genes were found to be differentially expressed in SLE patient blood. Non-coding SLE risk variants were mapped to these genes using four approaches, identifying 51 genes genetically associated with SLE. These genes were highly enriched for known SLE pathways and could be prioritized as potential therapeutic targets.
Integrative regulatory genomics for target gene prioritisation in SLE
Enrico Ferrero (1, 2)
(1) Autoimmunity Transplantation and Inflammation Bioinformatics, Novartis Institutes for BioMedical Research, Novartis Campus, 4056 Basel, Switzerland
(2) Previous address: Computational Biology, GSK, GSK Medicine Research Centre, Stevenage SG1 2NY, United Kingdom
01. Background
Several drug discovery programmes fail because of a weak linkage between
target and disease.
Genetic variation in disease can be used to identify promising targets, but our
understanding of how genetic variation influences gene expression is limited.
Regulatory genomic data, such as expression quantitative trait loci (eQTLs)
and correlations or physical interactions between enhancers and promoters,
can be used to map non-coding genetic variants to their target genes,
highlighting potential therapeutic targets (Figure 1).
02. Data
RNA-seq data from blood of systemic lupus erythematosus (SLE) patients
and healthy controls [1];
Single nucleotide polymorphisms (SNPs) from SLE genome-wide association
studies (GWASs) from the GWAS catalog [2];
Blood eQTL data from GTEx [3];
FANTOM5 correlations between enhancers and promoters across cell types
and tissues [4];
Promoter-capture Hi-C interactions between enhancers and promoters from
blood cell types [5].
04. References
1. Hung et al. (2015) The Ro60 autoantigen binds endogenous retroelements and regulates
inflammatory gene expression. Science.
2. MacArthur et al. (2017) The new NHGRI-EBI Catalog of published genome-wide association
studies. Nucleic Acids Res.
3. GTEx Consortium (2017) Genetic effects on gene expression across human tissues. Nature.
4. Andersson et al. (2014) An atlas of active enhancers across human cell types and
tissues. Nature.
5. Javierre et al. (2016) Lineage-specific genome architecture links enhancers and non-coding
disease variants to target gene promoters. Cell.
Gene     Direction    SNP        P-value      Location      Method
JAK2     Upregulated  rs1887428  1 x 10^-6    JAK2 5' UTR   Direct overlap
C2       Upregulated  rs1270942  2 x 10^-165  CFB intron    GTEx eQTL
TAX1BP1  Upregulated  rs849142   1 x 9-11     JAZF1 intron  Promoter-capture Hi-C
03. Results
Several thousand genes are differentially expressed in the blood of SLE
patients when compared to healthy controls (Figure 2).
Most SLE GWAS SNPs are found in non-coding regions of genes (Figure 3).
Four methods are used to map SLE GWAS variants to differentially
expressed genes (DEGs) in the blood of SLE patients (Figure 4):
Direct overlap: DEGs in SLE vs healthy blood with SLE GWAS
SNPs in their coding regions (14);
GTEx eQTL: DEGs in SLE vs healthy blood with blood eQTLs
that are non-coding SLE GWAS SNPs (17);
FANTOM5 correlations: DEGs in SLE vs healthy blood with
non-coding SLE GWAS SNPs in enhancers correlated with
the gene promoter (7);
Promoter-capture Hi-C: DEGs in SLE vs healthy blood with
non-coding SLE GWAS SNPs in mapped enhancers physically
interacting with the gene promoter (13).
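The four mapping approaches above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code behind the poster: all gene names, SNP identifiers and coordinates are toy values, the function and variable names (map_snps_to_degs, coding_regions, eqtl_pairs, enhancer_links) are hypothetical, and the rule that a gene is credited to the first approach that links it is an assumption based on the approaches being applied sequentially.

```python
from collections import Counter
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # exclusive


def hits(region, chrom, pos):
    """True if the position lies inside the region."""
    return region.chrom == chrom and region.start <= pos < region.end


def map_snps_to_degs(snps, degs, coding_regions, eqtl_pairs, enhancer_links):
    """Return {gene: (method, snp_id)}; the first method to link a DEG wins."""
    assigned = {}

    def assign(gene, method, snp_id):
        # Only differentially expressed genes are kept (assumption: first hit wins).
        if gene in degs and gene not in assigned:
            assigned[gene] = (method, snp_id)

    # 1. Direct overlap: the SNP falls inside a region of the gene itself.
    coding_snps = set()
    for snp_id, (chrom, pos) in snps.items():
        for gene, region in coding_regions.items():
            if hits(region, chrom, pos):
                coding_snps.add(snp_id)
                assign(gene, "Direct overlap", snp_id)

    # The remaining approaches only consider non-coding SNPs.
    noncoding = {s: loc for s, loc in snps.items() if s not in coding_snps}

    # 2. eQTL: the SNP is a known eQTL for the gene.
    for snp_id, gene in sorted(eqtl_pairs):
        if snp_id in noncoding:
            assign(gene, "GTEx eQTL", snp_id)

    # 3 and 4. The SNP lies in an enhancer linked to the gene promoter,
    # either by correlation (FANTOM5) or by physical contact (Hi-C).
    for method in ("FANTOM5 correlations", "Promoter-capture Hi-C"):
        for enhancer, gene in enhancer_links.get(method, []):
            for snp_id, (chrom, pos) in noncoding.items():
                if hits(enhancer, chrom, pos):
                    assign(gene, method, snp_id)
    return assigned


# Toy inputs (all hypothetical).
snps = {"snp1": ("chr1", 150), "snp2": ("chr1", 900), "snp3": ("chr2", 50),
        "snp4": ("chr2", 700), "snp5": ("chr3", 10)}
degs = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
coding_regions = {"GENE_A": Region("chr1", 100, 200),
                  "GENE_E": Region("chr3", 0, 100)}  # GENE_E is not a DEG
eqtl_pairs = {("snp1", "GENE_B"), ("snp2", "GENE_B")}
enhancer_links = {
    "FANTOM5 correlations": [(Region("chr2", 40, 60), "GENE_C")],
    "Promoter-capture Hi-C": [(Region("chr2", 40, 60), "GENE_C"),
                              (Region("chr2", 650, 750), "GENE_D")],
}

mapping = map_snps_to_degs(snps, degs, coding_regions, eqtl_pairs, enhancer_links)
per_method = Counter(method for method, _ in mapping.values())
for gene in sorted(mapping):
    print(gene, *mapping[gene])
```

In this toy run each approach contributes one gene, and GENE_E is dropped because it is not differentially expressed. Real data would normally call for an interval-overlap library rather than nested loops.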
The set of genes differentially expressed in and genetically associated with
SLE is highly enriched for well-known SLE pathological processes such as
interferon response, antigen processing and presentation, and co-stimulatory
pathways (Figure 5).
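Enrichment of a gene set for a biological process is often assessed with a one-sided hypergeometric test. The poster does not state which test or tool was used, so the sketch below is an assumption for illustration only. The function name hypergeom_enrichment_p is hypothetical, and apart from the 51 linked genes every number in the example call is invented.

```python
from math import comb


def hypergeom_enrichment_p(universe, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `annotated` belong to the process."""
    upper = min(annotated, selected)
    total = comb(universe, selected)
    # math.comb returns 0 when k > n, so impossible terms vanish on their own.
    tail = sum(
        comb(annotated, i) * comb(universe - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Illustrative call: 51 linked genes, 10 of which carry a hypothetical
# annotation shared by 200 of 20000 background genes.
p_value = hypergeom_enrichment_p(universe=20000, annotated=200,
                                 selected=51, overlap=10)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice the p-values for all tested processes would also be corrected for multiple testing.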
DEGs linked to SLE GWAS SNPs through different approaches can be
prioritized and followed up on as potential therapeutic targets for SLE
(Table 1).
Figure 1. Overview of the integrative regulatory genomics workflow. Four approaches (direct overlap, GTEx eQTL, FANTOM5 correlations and promoter-capture Hi-C) are used sequentially to map SLE GWAS SNPs to genes differentially expressed in SLE, leveraging public regulatory genomic data.
Figure 2. RNA-seq differential expression analysis. MA plot of
the differential expression analysis of RNA extracted from the
blood of SLE patients and healthy controls, showing a large
number of differentially expressed genes (4829
upregulated and 2709 downregulated at 5% FDR).
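Counting up- and downregulated genes at a 5% FDR can be sketched as follows. The poster does not say which software or multiple-testing procedure produced these counts; the Benjamini-Hochberg adjustment used here is a common choice and an assumption. The function names and the toy results table are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_degs(results, fdr=0.05):
    """results: list of (gene, log2_fold_change, pvalue).
    Returns (n_upregulated, n_downregulated) at the given FDR."""
    padj = benjamini_hochberg([p for _, _, p in results])
    up = sum(1 for (_, lfc, _), q in zip(results, padj) if q < fdr and lfc > 0)
    down = sum(1 for (_, lfc, _), q in zip(results, padj) if q < fdr and lfc < 0)
    return up, down


# Toy differential expression results (hypothetical values).
results = [
    ("GENE_A", 2.1, 0.0004),
    ("GENE_B", -1.3, 0.003),
    ("GENE_C", 0.2, 0.6),
    ("GENE_D", 1.8, 0.012),
    ("GENE_E", -0.4, 0.2),
]
n_up, n_down = count_degs(results)
print(f"{n_up} upregulated, {n_down} downregulated at 5% FDR")
```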
Figure 3. Genomic location of SLE GWAS SNPs.
Bar plot of SLE GWAS SNP genomic locations,
highlighting that a large majority of variants fall
in non-coding regions.
Figure 4. Number of DEGs genetically linked to SLE as identified by the four approaches. Bar plot summarizing the results of mapping SLE GWAS SNPs to SLE DEGs, showing that the great majority of genes (~66%) were retrieved using integrative regulatory genomics approaches.
Figure 5. Gene Ontology biological process functional enrichment. Genes differentially expressed in SLE and genetically linked to the disease are highly enriched for biological mechanisms known to be dysregulated in SLE such as interferon response and immune cell activation.
Table 1. Some examples of SLE GWAS variants mapped to SLE DEGs using the four approaches. The table reports the
putative target gene; the directionality of the target gene expression in SLE patients; the SNP; the p-value of the association of the
SNP with SLE; the genomic location of the SNP; the method used to assign the SNP to the target gene.