IntegrateChIPATAC.Rmd
In previous vignettes, we have introduced intregrating RNA-seq with ATAC-seq and ChIP-seq separately. But we should keep in mind that ATAC-seq and ChIP-seq are highly consistent and complementary. Most TFs require open chromatin for binding to DNA recognition sites, ATAC-seq peaks generally overlap with TF ChIP-seq peaks but are often broader. Thus, TF ChIP-seq and ATAC-seq can mutually validate the quality and reliability of each other within the same experimental system.
To obtain more accurate gene regulation results, here we will integrate RNA-seq with ChIP-seq and ATAC-seq simultaneously.
The data used here contains RNA-seq and ChIP-seq datasets from RUNX represses Pmp22 to drive neurofibromagenesis:
The integration is based on results of “IntegrateChIP” vignette and “IntegrateATAC” vignette, load the results first.
# library
suppressWarnings(suppressMessages(library(DESeq2)))
suppressWarnings(suppressMessages(library(DEbPeak)))
# load RNA-seq and ChIP-seq integration results
load(file = "/home/songyabing/R/learn/tmp/DEbPeak/RNAandChIP.RData")
# load RNA-seq and ATAC-seq integration results
load(file = "/home/songyabing/R/learn/tmp/DEbPeak/RNAandATAC.RData")
debpeak.res = DEbCA(de.res = debchip.dds.results.ordered, chip.peak.res = peak.anno.df,
atac.peak.res = atac.peak.anno.df,
peak.anno.key = "Promoter", merge.key = "SYMBOL")
## Differential expression analysis with DESeq2!
head(debpeak.res)
## SYMBOL geneId_ChIP Peak_ChIP annotation_ChIP
## 1 0610009E02Rik <NA> <NA> <NA>
## 2 0610012G03Rik <NA> <NA> <NA>
## 3 0610039K10Rik <NA> <NA> <NA>
## 4 1110002J07Rik 68488 chr10:66905070-66905215 Promoter (12-13kb)
## 5 1110004F10Rik <NA> <NA> <NA>
## 6 1110008P14Rik <NA> <NA> <NA>
## anno_ChIP ENSEMBL_ChIP GENENAME_ChIP log2FoldChange abundance
## 1 <NA> <NA> <NA> NA NA
## 2 <NA> <NA> <NA> 1.837668 59.17919
## 3 <NA> <NA> <NA> NA NA
## 4 Promoter <NA> RIKEN cDNA 1110002J07 gene NA NA
## 5 <NA> <NA> <NA> NA NA
## 6 <NA> <NA> <NA> 1.013546 125.76554
## signif regulation Type1 geneId_ATAC Peak_ATAC
## 1 NA <NA> <NA> 100125929 chr2:26446213-26446286
## 2 3.073773 Up_regulated UP 106264 chr16:31948207-31948280
## 3 NA <NA> <NA> 68386 chr2:163644811-163644884
## 4 NA <NA> ChIP 68488 chr10:66935911-66935984
## 5 NA <NA> <NA> 56372 chr7:116039769-116039842
## 6 2.277566 Up_regulated UP <NA> <NA>
## annotation_ATAC anno_ATAC ENSEMBL_ATAC GENENAME_ATAC
## 1 Promoter (<=1kb) Promoter ENSMUSG00000086714 RIKEN cDNA 0610009E02 gene
## 2 Promoter (<=1kb) Promoter ENSMUSG00000107002 RIKEN cDNA 0610012G03 gene
## 3 Promoter (<=1kb) Promoter ENSMUSG00000058812 RIKEN cDNA 0610039K10 gene
## 4 Promoter (15-16kb) Promoter <NA> RIKEN cDNA 1110002J07 gene
## 5 Promoter (<=1kb) Promoter ENSMUSG00000030663 RIKEN cDNA 1110004F10 gene
## 6 <NA> <NA> <NA> <NA>
## Type geneId
## 1 ATAC 100125929
## 2 UPbATAC 106264
## 3 ATAC 68386
## 4 ChIPbATAC 68488
## 5 ATAC 56372
## 6 UP <NA>
# DE and ChIP venn plot
# debpeak.plot = PlotDEbPeak(debpeak.res, peak.type = "Peak", show_percentage=FALSE)
# debpeak.plot
debpeak.plot = InteVenn(inte.res = debpeak.res, inte.type = "DEbPeak", peak.type = "Peak",
gene.col = "SYMBOL",show_percentage = FALSE)
debpeak.plot
There are five categories for users to choose to perform functional enrichment: ChIP
, ATAC
, UP
, DOWN
, ChIPbATAC
, UPbChIP
, DOWNbChIP
, UPbPeak
, DOWNbPeak
, UPbATAC
, DOWNbATAC
. Here, we will use DOWNbPeak
and UPbPeak
as examples.
# functional enrichment on direct targets
# debpeak.up.fe.results = DEbPeakFE(de.peak = debpeak.res, peak.fe.key = "UPbPeak", gene.type = "ENTREZID", species="Mouse",save = F)
debpeak.up.fe.results = InteFE(inte.res = debpeak.res, fe.key = "UPbPeak", inte.type = "DEbPeak",
gene.type = "ENTREZID", species="Mouse",save = F)
##
## conduct ALL GO enrichment analysis on: UPbPeak
## wrong orderBy parameter; set to default `orderBy = "x"`
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
## wrong orderBy parameter; set to default `orderBy = "x"`
##
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
The results:
# the result table
head(debpeak.up.fe.results[["GO"]][["table"]])
## ONTOLOGY ID Description GeneRatio
## GO:0043218 CC GO:0043218 compact myelin 2/9
## GO:0043209 CC GO:0043209 myelin sheath 2/9
## GO:0070382 CC GO:0070382 exocytic vesicle 2/9
## GO:0043256 CC GO:0043256 laminin complex 1/9
## GO:0019898 CC GO:0019898 extrinsic component of membrane 2/9
## GO:0030133 CC GO:0030133 transport vesicle 2/9
## BgRatio pvalue p.adjust qvalue geneID Count
## GO:0043218 19/23271 2.265877e-05 0.001087621 0.000500878 Mbp/Pmp22 2
## GO:0043209 213/23271 2.877487e-03 0.043690148 0.020120463 Mbp/Pmp22 2
## GO:0070382 261/23271 4.282761e-03 0.043690148 0.020120463 Bsn/Syt6 2
## GO:0043256 12/23271 4.632203e-03 0.043690148 0.020120463 Pmp22 1
## GO:0019898 320/23271 6.366508e-03 0.043690148 0.020120463 Bsn/Syt6 2
## GO:0030133 339/23271 7.118989e-03 0.043690148 0.020120463 Bsn/Syt6 2
# the result plot
debpeak.up.fe.results[["GO"]][["plot"]]
# functional enrichment on direct targets
# debpeak.down.fe.results = DEbPeakFE(de.peak = debpeak.res, peak.fe.key = "DOWNbPeak",
# gene.type = "ENTREZID", species="Mouse",save = F)
debpeak.down.fe.results = InteFE(inte.res = debpeak.res, fe.key = "DOWNbPeak", inte.type = "DEbPeak",
gene.type = "ENTREZID", species="Mouse",save = F)
## conduct ALL GO enrichment analysis on: DOWNbPeak
## wrong orderBy parameter; set to default `orderBy = "x"`
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
## wrong orderBy parameter; set to default `orderBy = "x"`
##
## Scale for y is already present.
## Adding another scale for y, which will replace the existing scale.
The results:
# the result table
head(debpeak.down.fe.results[["GO"]][["table"]])
## ONTOLOGY ID Description
## GO:0045064 BP GO:0045064 T-helper 2 cell differentiation
## GO:0010811 BP GO:0010811 positive regulation of cell-substrate adhesion
## GO:0045785 BP GO:0045785 positive regulation of cell adhesion
## GO:0045860 BP GO:0045860 positive regulation of protein kinase activity
## GO:0042832 BP GO:0042832 defense response to protozoan
## GO:0010810 BP GO:0010810 regulation of cell-substrate adhesion
## GeneRatio BgRatio pvalue p.adjust qvalue
## GO:0045064 2/13 17/23328 3.880421e-05 0.01212719 0.006013313
## GO:0010811 3/13 128/23328 4.433038e-05 0.01212719 0.006013313
## GO:0045785 4/13 435/23328 7.460788e-05 0.01212719 0.006013313
## GO:0045860 4/13 440/23328 7.798836e-05 0.01212719 0.006013313
## GO:0042832 2/13 36/23328 1.786843e-04 0.01717888 0.008518210
## GO:0010810 3/13 214/23328 2.034275e-04 0.01717888 0.008518210
## geneID Count
## GO:0045064 Bcl3/Il4ra 2
## GO:0010811 Col8a1/Fbln2/Sdc4 3
## GO:0045785 Col8a1/Fbln2/Il4ra/Sdc4 4
## GO:0045860 Akap13/Ccnd1/Ern1/Sdc4 4
## GO:0042832 Bcl3/Il4ra 2
## GO:0010810 Col8a1/Fbln2/Sdc4 3
# the result plot
debpeak.down.fe.results[["GO"]][["plot"]]
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-conda-linux-gnu (64-bit)
## Running under: CentOS Linux 7 (Core)
##
## Matrix products: default
## BLAS/LAPACK: /home/softwares/anaconda3/envs/r4.0/lib/libopenblasp-r0.3.12.so
##
## locale:
## [1] LC_CTYPE=zh_CN.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=zh_CN.UTF-8 LC_COLLATE=zh_CN.UTF-8
## [5] LC_MONETARY=zh_CN.UTF-8 LC_MESSAGES=zh_CN.UTF-8
## [7] LC_PAPER=zh_CN.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] org.Mm.eg.db_3.12.0 AnnotationDbi_1.52.0
## [3] DEbPeak_1.4.0 DESeq2_1.30.1
## [5] SummarizedExperiment_1.20.0 Biobase_2.50.0
## [7] MatrixGenerics_1.2.1 matrixStats_0.58.0
## [9] GenomicRanges_1.42.0 GenomeInfoDb_1.26.7
## [11] IRanges_2.24.1 S4Vectors_0.28.1
## [13] BiocGenerics_0.42.0
##
## loaded via a namespace (and not attached):
## [1] rsvd_1.0.3
## [2] ggvenn_0.1.9
## [3] apeglm_1.12.0
## [4] Rsamtools_2.6.0
## [5] rsvg_2.1
## [6] foreach_1.5.1
## [7] rprojroot_2.0.2
## [8] crayon_1.4.1
## [9] V8_3.4.2
## [10] MASS_7.3-58
## [11] nlme_3.1-152
## [12] backports_1.2.1
## [13] sva_3.38.0
## [14] GOSemSim_2.25.0
## [15] rlang_1.1.0
## [16] XVector_0.30.0
## [17] readxl_1.4.2
## [18] irlba_2.3.5
## [19] limma_3.46.0
## [20] GOstats_2.56.0
## [21] BiocParallel_1.24.1
## [22] rjson_0.2.20
## [23] bit64_4.0.5
## [24] glue_1.6.2
## [25] DiffBind_3.0.15
## [26] mixsqp_0.3-43
## [27] pheatmap_1.0.12
## [28] parallel_4.0.3
## [29] DEFormats_1.18.0
## [30] base64url_1.4
## [31] tcltk_4.0.3
## [32] DOSE_3.23.2
## [33] haven_2.5.2
## [34] tidyselect_1.2.0
## [35] rio_0.5.27
## [36] XML_3.99-0.6
## [37] tidyr_1.3.0
## [38] ggpubr_0.4.0
## [39] GenomicAlignments_1.26.0
## [40] xtable_1.8-4
## [41] ggnetwork_0.5.12
## [42] magrittr_2.0.3
## [43] evaluate_0.14
## [44] ggplot2_3.4.2
## [45] cli_3.6.1
## [46] zlibbioc_1.36.0
## [47] hwriter_1.3.2
## [48] rstudioapi_0.14
## [49] bslib_0.3.1
## [50] GreyListChIP_1.22.0
## [51] fastmatch_1.1-3
## [52] BiocSingular_1.6.0
## [53] xfun_0.30
## [54] askpass_1.1
## [55] clue_0.3-59
## [56] gson_0.0.9
## [57] cluster_2.1.1
## [58] caTools_1.18.2
## [59] tidygraph_1.2.0
## [60] tibble_3.2.1
## [61] ggrepel_0.9.1
## [62] Biostrings_2.58.0
## [63] png_0.1-7
## [64] withr_2.5.0
## [65] bitops_1.0-6
## [66] ggforce_0.3.3
## [67] RBGL_1.66.0
## [68] plyr_1.8.6
## [69] cellranger_1.1.0
## [70] GSEABase_1.52.1
## [71] pcaPP_2.0-1
## [72] dqrng_0.2.1
## [73] coda_0.19-4
## [74] pillar_1.9.0
## [75] gplots_3.1.1
## [76] GlobalOptions_0.1.2
## [77] cachem_1.0.4
## [78] GenomicFeatures_1.42.2
## [79] fs_1.5.0
## [80] GetoptLong_1.0.5
## [81] clusterProfiler_4.7.1
## [82] DelayedMatrixStats_1.12.3
## [83] vctrs_0.6.2
## [84] generics_0.1.0
## [85] plot3D_1.4
## [86] tools_4.0.3
## [87] foreign_0.8-81
## [88] NOISeq_2.34.0
## [89] munsell_0.5.0
## [90] tweenr_1.0.2
## [91] fgsea_1.16.0
## [92] DelayedArray_0.16.3
## [93] fastmap_1.1.0
## [94] compiler_4.0.3
## [95] abind_1.4-5
## [96] rtracklayer_1.50.0
## [97] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
## [98] GenomeInfoDbData_1.2.4
## [99] gridExtra_2.3
## [100] edgeR_3.32.1
## [101] lattice_0.20-45
## [102] ggnewscale_0.4.7
## [103] AnnotationForge_1.32.0
## [104] utf8_1.2.1
## [105] dplyr_1.1.2
## [106] BiocFileCache_1.14.0
## [107] jsonlite_1.8.4
## [108] scales_1.2.1
## [109] graph_1.68.0
## [110] carData_3.0-4
## [111] sparseMatrixStats_1.2.1
## [112] TFEA.ChIP_1.10.0
## [113] genefilter_1.72.1
## [114] car_3.0-11
## [115] doParallel_1.0.16
## [116] latticeExtra_0.6-29
## [117] R.utils_2.12.0
## [118] brew_1.0-6
## [119] checkmate_2.0.0
## [120] rmarkdown_2.14
## [121] openxlsx_4.2.3
## [122] pkgdown_1.6.1
## [123] cowplot_1.1.1
## [124] textshaping_0.3.6
## [125] forcats_1.0.0
## [126] downloader_0.4
## [127] BSgenome_1.58.0
## [128] igraph_1.4.99.9024
## [129] survival_3.2-10
## [130] numDeriv_2016.8-1.1
## [131] yaml_2.2.1
## [132] plotrix_3.8-2
## [133] systemfonts_1.0.4
## [134] ashr_2.2-47
## [135] SQUAREM_2021.1
## [136] htmltools_0.5.2
## [137] memoise_2.0.0
## [138] VariantAnnotation_1.36.0
## [139] locfit_1.5-9.4
## [140] graphlayouts_0.7.1
## [141] batchtools_0.9.15
## [142] PCAtools_2.2.0
## [143] viridisLite_0.4.0
## [144] rrcov_1.7-0
## [145] digest_0.6.27
## [146] assertthat_0.2.1
## [147] rappdirs_0.3.3
## [148] emdbook_1.3.12
## [149] RSQLite_2.2.5
## [150] amap_0.8-18
## [151] yulab.utils_0.0.4
## [152] debugme_1.1.0
## [153] misc3d_0.9-1
## [154] data.table_1.14.2
## [155] blob_1.2.1
## [156] R.oo_1.24.0
## [157] ragg_0.4.0
## [158] labeling_0.4.2
## [159] splines_4.0.3
## [160] ggupset_0.3.0
## [161] RCurl_1.98-1.3
## [162] broom_1.0.4
## [163] hms_1.1.3
## [164] colorspace_2.0-0
## [165] BiocManager_1.30.16
## [166] shape_1.4.6
## [167] sass_0.4.1
## [168] GEOquery_2.58.0
## [169] Rcpp_1.0.9
## [170] mvtnorm_1.1-2
## [171] circlize_0.4.15
## [172] enrichplot_1.10.2
## [173] fansi_0.4.2
## [174] tzdb_0.3.0
## [175] truncnorm_1.0-8
## [176] ChIPseeker_1.33.0.900
## [177] R6_2.5.0
## [178] grid_4.0.3
## [179] lifecycle_1.0.3
## [180] ShortRead_1.48.0
## [181] zip_2.1.1
## [182] curl_4.3
## [183] ggsignif_0.6.3
## [184] jquerylib_0.1.3
## [185] robustbase_0.95-0
## [186] DO.db_2.9
## [187] Matrix_1.5-4
## [188] qvalue_2.22.0
## [189] desc_1.3.0
## [190] org.Hs.eg.db_3.12.0
## [191] RColorBrewer_1.1-2
## [192] iterators_1.0.13
## [193] stringr_1.5.0
## [194] DOT_0.1
## [195] ggpie_0.2.5
## [196] beachmat_2.6.4
## [197] polyclip_1.10-0
## [198] biomaRt_2.46.3
## [199] purrr_1.0.1
## [200] shadowtext_0.0.9
## [201] gridGraphics_0.5-1
## [202] mgcv_1.8-34
## [203] ComplexHeatmap_2.13.1
## [204] openssl_1.4.3
## [205] patchwork_1.0.0
## [206] bdsmatrix_1.3-4
## [207] codetools_0.2-18
## [208] invgamma_1.1
## [209] GO.db_3.12.1
## [210] gtools_3.8.2
## [211] prettyunits_1.1.1
## [212] dbplyr_2.3.2
## [213] R.methodsS3_1.8.1
## [214] gtable_0.3.0
## [215] DBI_1.1.1
## [216] highr_0.8
## [217] ggfun_0.0.6
## [218] httr_1.4.5
## [219] KernSmooth_2.23-18
## [220] stringi_1.5.3
## [221] progress_1.2.2
## [222] reshape2_1.4.4
## [223] farver_2.1.0
## [224] annotate_1.68.0
## [225] viridis_0.6.1
## [226] Rgraphviz_2.34.0
## [227] xml2_1.3.4
## [228] bbmle_1.0.24
## [229] systemPipeR_1.24.3
## [230] boot_1.3-28
## [231] readr_2.1.4
## [232] geneplotter_1.68.0
## [233] ggplotify_0.1.0
## [234] Category_2.56.0
## [235] DEoptimR_1.0-11
## [236] bit_4.0.4
## [237] scatterpie_0.1.7
## [238] jpeg_0.1-8.1
## [239] ggraph_2.0.5
## [240] pkgconfig_2.0.3
## [241] rstatix_0.7.0
## [242] knitr_1.37