DEbPeak.Rd
Integrate Differential Expression Results with Peak Annotation/Differential Expression Results.
DEbPeak( de.res, peak.res, peak.mode = c("consensus", "diff"), peak.anno.key = c("Promoter", "5' UTR", "3' UTR", "Exon", "Intron", "Downstream", "Distal Intergenic", "All"), signif = "padj", signif.threshold = 0.05, l2fc.threshold = 1, label.key = NULL, peak.signif = "padj", peak.signif.threshold = 0.05, peak.l2fc.threshold = 1, merge.key = c("geneId", "ENSEMBL", "SYMBOL"), species = c("Human", "Mouse", "Rat", "Fly", "Arabidopsis", "Yeast", "Zebrafish", "Worm", "Bovine", "Pig", "Chicken", "Rhesus", "Canine", "Xenopus", "Anopheles", "Chimp", "E coli strain Sakai", "Myxococcus xanthus DK 1622"), enhancer = FALSE, seq.style = "Ensembl", gtf.file = NULL, dis.threshold = 250000, n.cores = 1 )
de.res | Dataframe contains all genes of differential expression analysis. |
---|---|
peak.res | Dataframe contains all peak annotation ( |
peak.mode | The source of peak results, choose from consensus (peak annotation) and diff (differential analysis). Default: consensus. |
peak.anno.key | Peak location, chosen from "Promoter", "5' UTR", "3' UTR", "Exon", "Intron", "Downstream", "Distal Intergenic","All".
Used when |
signif | Significance criterion for RNA-seq results. For DESeq2 results, can be chosen from padj, pvalue. For edgeR results, can be chosen from FDR, PValue. Default: padj. |
signif.threshold | Significance threshold for RNA-seq to get differentially expressed genes. Default: 0.05. |
l2fc.threshold | Log2 fold change threshold for RNA-seq to get differentially expressed genes. Default: 1. |
label.key | Which column to use as label. Default: NULL (use rownames of |
peak.signif | Significance criterion for peak-associated results. For DESeq2 results, can be chosen from padj, pvalue. For edgeR results, can be chosen from FDR, PValue. Default: padj. |
peak.signif.threshold | Significance threshold for peak-associated results to get differentially accessible/binding peaks. Default: 0.05. |
peak.l2fc.threshold | Log2 fold change threshold for peak-associated results to get differentially accessible/binding peaks. Default: 1. |
merge.key | The columns used for merging, chosen from geneId (ENTREZID), ENSEMBL, SYMBOL. Default: geneId. |
species | Species used, chosen from "Human","Mouse","Rat","Fly","Arabidopsis", "Yeast","Zebrafish","Worm","Bovine","Pig","Chicken","Rhesus", "Canine","Xenopus","Anopheles","Chimp","E coli strain Sakai","Myxococcus xanthus DK 1622". Default: "Human". |
enhancer | Logical value. Whether the data is enhancer-related data. Default: FALSE. |
seq.style | The style of sequence, chosen from UCSC, NCBI, Ensembl, None. This should be compatible with the genome and gtf file you used to generate count matrix and peak files. Default: "UCSC". |
gtf.file | GTF file used to create TxDb object. Useful when specie you used is not available in |
dis.threshold | Distance threshold. Default: 250000. |
n.cores | The number of cores to be used for this job. Default:1. |
Dataframe contains integrated results. The results will contain five categories: UPbPeak, DOWNbPeak, UP, DOWN, Peak when peak.mode
is consensus
and Down_Up, Up_Up, Down_Down, Up_Down when peak.mode
is diff. RNA in front.
library(DEbPeak) library(DESeq2) library(openxlsx) ### consensus mode # ChIP-Seq data peak.file <- system.file("extdata", "debchip_peaks.bed", package = "DEbPeak") peak.df <- GetConsensusPeak(peak.file = peak.file) peak.profile <- PeakProfile(peak.df, species = "Mouse", by = "gene", region.type = "body", nbin = 800)#> >> preparing promoter regions... 2023-07-02 17时12分20秒 #> >> preparing tag matrix... 2023-07-02 17时12分20秒 #> >> preparing start_site regions by ... 2023-07-02 17时12分20秒 #> >> preparing tag matrix... 2023-07-02 17时12分20秒 #> >> generating figure... 2023-07-02 17时12分28秒#> >> done... 2023-07-02 17时12分28秒#> >> binning method is used...2023-07-02 17时12分28秒 #> >> preparing start_site regions by gene... 2023-07-02 17时12分28秒 #> >> preparing tag matrix by binning... 2023-07-02 17时12分28秒 #> >> Running bootstrapping for tag matrix... 2023-07-02 17时12分37秒 #> >> binning method is used...2023-07-02 17时12分37秒 #> >> preparing body regions by gene... 2023-07-02 17时12分37秒 #> >> preparing tag matrix by binning... 2023-07-02 17时12分37秒 #> >> preparing matrix with extension from (TSS-20%)~(TTS+20%)... 2023-07-02 17时12分37秒 #> >> 1 peaks(0.1536098%), having lengths smaller than 800bp, are filtered... 2023-07-02 17时12分40秒 #> >> Running bootstrapping for tag matrix... 2023-07-02 17时13分21秒peak.anno <- AnnoPeak( peak.df = peak.df, species = "Mouse", seq.style = "UCSC", up.dist = 20000, down.dist = 20000 )#> >> preparing features information... 2023-07-02 17时13分22秒 #> >> identifying nearest features... 2023-07-02 17时13分22秒 #> >> calculating distance from peak to TSS... 2023-07-02 17时13分22秒 #> >> assigning genomic annotation... 2023-07-02 17时13分22秒 #> >> adding gene annotation... 2023-07-02 17时13分25秒#>#> >> assigning chromosome lengths 2023-07-02 17时13分25秒 #> >> done... 2023-07-02 17时13分25秒#> Warning: Removed 6 rows containing non-finite values (`stat_count()`).# RNA-Seq data count.file <- system.file("extdata", "debchip_count.txt", package = "DEbPeak") meta.file <- system.file("extdata", "debchip_meta.txt", package = "DEbPeak") count.matrix <- read.table(file = count.file, header = TRUE, sep = "\t") meta.info <- read.table(file = meta.file, header = TRUE) # create DESeqDataSet object dds <- DESeq2::DESeqDataSetFromMatrix( countData = count.matrix, colData = meta.info, design = ~condition )#> Warning: some variables in design formula are characters, converting to factors# set control level dds$condition <- relevel(dds$condition, ref = "NF") # conduct differential expressed genes analysis dds <- DESeq(dds)#>#>#>#>#>#># extract results dds.results <- results(dds, contrast = c("condition", "RX", "NF")) dds.results.ordered <- dds.results[order(dds.results$log2FoldChange, decreasing = TRUE), ] # Integrated with RNA-Seq debchip.res <- DEbPeak( de.res = dds.results.ordered, peak.res = peak.anno[["df"]], peak.anno.key = "Promoter", merge.key = "SYMBOL" )#>### diff mode # ATAC-Seq data peak.matrix.file <- system.file("extdata", "RA_ATAC_count.txt", package = "DEbPeak") peak.meta.file <- system.file("extdata", "RA_ATAC_meta.txt", package = "DEbPeak") peak.matrix <- read.table(file = peak.matrix.file, header = TRUE, sep = "\t") peak.meta <- read.table(file = peak.meta.file, header = TRUE) dds.peak <- DESeq2::DESeqDataSetFromMatrix( countData = peak.matrix, colData = peak.meta, design = ~condition )#> Warning: some variables in design formula are characters, converting to factors# set control level dds.peak$condition <- relevel(dds.peak$condition, ref = "WT") # conduct differential expressed genes analysis dds.peak <- DESeq(dds.peak)#>#>#>#>#>#> #>#>#># extract results dds.peak.results <- results(dds.peak, contrast = c("condition", "KO", "WT")) dds.peak.results.ordered <- dds.peak.results[order(dds.peak.results$log2FoldChange, decreasing = TRUE), ] # RNA-seq results # rna.diff.file <- system.file("extdata", "RA_RNA_diff.xlsx", package = "DEbPeak") rna.diff.file <- system.file("extdata", "RA_RNA_diff.txt", package = "DEbPeak") # rna.diff <- openxlsx::read.xlsx(xlsxFile = rna.diff.file, rowNames = T, check.names = FALSE) rna.diff <- read.table(file = rna.diff.file, header = TRUE, sep = "\t") # integrate differential expression results of RNA-seq and ATAC-seq debatac.res <- DEbPeak( de.res = rna.diff, peak.res = dds.peak.results.ordered, peak.mode = "diff", peak.anno.key = "All", l2fc.threshold = 0, peak.l2fc.threshold = 0, species = "Mouse", merge.key = "SYMBOL" )#>#>#> Warning: 1.63% of input gene IDs are fail to map...#>#>#> Warning: 13.86% of input gene IDs are fail to map...