Conduct Differential Analysis with DESeq2.

    ConductDESeq2(
      counts.folder,
      count.matrix.file = NULL,
      meta.file,
      group.key = NULL,
      count.type = c("htseq-count", "featurecounts"),
      min.count = 10,
      ref.group = NULL,
      out.folder = NULL,
      data.type = c("RNA", "ChIP", "ATAC"),
      peak.anno.key = c("Promoter", "5' UTR", "3' UTR", "Exon", "Intron", "Downstream", "Distal Intergenic", "All"),
      qc.ndepth = 10,
      transform.method = c("rlog", "vst", "ntd"),
      var.genes = NULL,
      batch = NULL,
      outlier.detection = T,
      rpca.method = c("PcaGrid", "PcaHubert"),
      k = 2,
      pca.x = "PC1",
      pca.y = "PC2",
      pca.z = "PC3",
      loding.pc = 1:5,
      loading.gene.num = 10,
      loading.ncol = 2,
      enrich.loading.pc = 1:5,
      enrich.loading.gene = 200,
      gene.type = c("ENSEMBL", "ENTREZID", "SYMBOL"),
      enrich.type = c("ALL", "GO", "KEGG"),
      go.type = c("ALL", "BP", "MF", "CC"),
      enrich.pvalue = 0.05,
      enrich.qvalue = 0.05,
      org.db = "org.Mm.eg.db",
      organism = "mmu",
      padj.method = c("BH", "holm", "hochberg", "hommel", "bonferroni", "BY", "fdr", "none"),
      show.term = 15,
      str.width = 30,
      signif = "padj",
      signif.threshold = 0.05,
      l2fc.threshold = 1,
      gene.map = NULL,
      gtf.file = NULL,
      norm.type = c("DESeq2", "TMM", "CPM", "RPKM", "TPM"),
      log.counts = TRUE,
      deg.label.df = NULL,
      deg.label.key = NULL,
      deg.label.num = 2,
      deg.label.color = NULL,
      fe.gene.key = NULL,
      gmt.file,
      gene.sets = NULL,
      minGSSize = 10,
      maxGSSize = 500,
      gsea.pvalue = 0.05
    )

| Argument | Description |
|---|---|
| counts.folder | Folder containing the count files of all samples. Each count file should be named SampleName.txt. |
| count.matrix.file | File containing the count matrix. If provided, it is used instead of `counts.folder`. Default: NULL. |
| meta.file | File containing sample metadata. |
| group.key | Column in `meta.file` that defines the sample groups. Default: NULL. |
| count.type | Source of the count files, chosen from htseq-count, featurecounts. Default: htseq-count. |
| min.count | A feature is considered detected if its number of read counts is > `min.count`. Default: 10. |
| ref.group | Reference group name. When set to NULL, the first element of the groups is used. Default: NULL. |
| out.folder | Folder to save enrichment results. Default: working directory. |
| data.type | Input data type, chosen from RNA, ChIP, ATAC. Default: RNA. |
| peak.anno.key | Peak location, chosen from "Promoter", "5' UTR", "3' UTR", "Exon", "Intron", "Downstream", "Distal Intergenic", "All". Default: "Promoter". |
| qc.ndepth | Number of different sequencing depths to be simulated and plotted apart from the real depth. Default: 10. This parameter is only used by type "saturation". |
| transform.method | Data transformation methods, chosen from rlog, vst and ntd. Default: rlog. |
| var.genes | Select genes with larger variance for PCA analysis. Default: all genes. |
| batch | Column holding batch information, used for batch correction. Default: NULL (no batch correction). |
| outlier.detection | Logical value. If TRUE, conduct outlier detection with robust PCA. Default: TRUE. |
| rpca.method | Robust PCA method, chosen from PcaGrid, PcaHubert. Default: PcaGrid. |
| k | Number of principal components to compute for the robust PCA method selected by `rpca.method`. Default: 2. |
| pca.x | The principal component to display on the x axis. Default: PC1. |
| pca.y | The principal component to display on the y axis. Default: PC2. |
| pca.z | The principal component to display on the z axis. Default: PC3. |
| loding.pc | PCs for which to create loading plots. Default: 1:5. |
| loading.gene.num | Number of genes per PC to show in the loading plot. Default: 10. |
| loading.ncol | The columns of loading bar or heatmap. Default: 2. |
| enrich.loading.pc | PCs on which to conduct enrichment analysis. Default: 1:5. |
| enrich.loading.gene | Number of genes per PC to use for enrichment analysis. Default: 200. |
| gene.type | Gene name type, chosen from ENSEMBL, ENTREZID, SYMBOL. Default: ENSEMBL. |
| enrich.type | Enrichment type, chosen from ALL, GO, KEGG. Default: ALL. |
| go.type | GO enrichment type, chosen from ALL, BP, MF, CC. Default: ALL. |
| enrich.pvalue | Cutoff value of pvalue. Default: 0.05. |
| enrich.qvalue | Cutoff value of qvalue. Default: 0.05. |
| org.db | Organism database. Default: org.Mm.eg.db. |
| organism | Supported organism listed in 'http://www.genome.jp/kegg/catalog/org_list.html'. Default: mmu. |
| padj.method | One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none". Default: BH. |
| show.term | Number of enrichment terms to show. Default: 15. |
| str.width | Length of enrichment term in plot. Default: 30. |
| signif | Significance criterion. For DESeq2 results, can be chosen from padj, pvalue. For edgeR results, can be chosen from FDR, PValue. Default: padj. |
| signif.threshold | Significance threshold to get differentially expressed genes or accessible/binding peaks. Default: 0.05. |
| l2fc.threshold | Log2 fold change threshold to get differentially expressed genes or accessible/binding peaks. Default: 1. |
| gene.map | Use data frame instead of |
| gtf.file | Gene annotation file used to get gene length, used if `norm.type` is RPKM or TPM. Default: NULL. |
| norm.type | Normalization method, chosen from DESeq2, TMM, CPM, RPKM, TPM. Default: DESeq2. |
| log.counts | Logical value, if TRUE, export log2(normalized.counts + 1), else export normalized.counts. Default: TRUE. |
| deg.label.df | Label data frame, at least contains Gene column. Default: NULL. When set NULL, use |
| deg.label.key | Which column to use as label. Default: NULL (use Gene column of `deg.label.df`). |
| deg.label.num | Gene number to label, choose according to log2FoldChange. When |
| deg.label.color | Color vector for labels. Default: NULL. |
| fe.gene.key | Column name to conduct enrichment analysis. Default: NULL. |
| gmt.file | Gene set file in Gene Matrix Transposed (GMT) format. |
| gene.sets | Gene sets information, containing two columns: gs_name, entrez_gene. Default: NULL. |
| minGSSize | Minimal size of each geneSet for analyzing. Default: 10. |
| maxGSSize | Maximal size of genes annotated for testing. Default: 500. |
| gsea.pvalue | Cutoff value of pvalue. Default: 0.05. |
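
The `signif`, `signif.threshold`, `l2fc.threshold` and `padj.method` arguments can be illustrated with a small base R sketch. It is not taken from the package source: the toy table, the `regulation` column and its labels (`Up`, `Down`, `NS`) are hypothetical, and the exact comparison operators the function applies internally are an assumption.

```r
# Toy DESeq2-style results table (values are made up for illustration).
res <- data.frame(
  Gene = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(2.1, -1.6, 0.4, 3.0),
  pvalue = c(0.001, 0.004, 0.01, 0.20),
  stringsAsFactors = FALSE
)

# Multiple-testing correction; "BH" is the first choice of padj.method.
res$padj <- p.adjust(res$pvalue, method = "BH")

# Same names and defaults as the corresponding arguments.
signif <- "padj"
signif.threshold <- 0.05
l2fc.threshold <- 1

# Assumption: a feature is called only if it passes both thresholds.
is.sig <- !is.na(res[[signif]]) & res[[signif]] < signif.threshold
res$regulation <- "NS"  # hypothetical label for "not significant"
res$regulation[is.sig & res$log2FoldChange >= l2fc.threshold] <- "Up"
res$regulation[is.sig & res$log2FoldChange <= -l2fc.threshold] <- "Down"

print(res)
```

Switching `signif` to `"pvalue"` and lowering `l2fc.threshold` loosens both filters, which can be useful for small data sets with few significant features.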

    # library(DESeq2)
    # library(DEbPeak)
    # count.file <- system.file("extdata", "snon_count.txt", package = "DEbPeak")
    # meta.file <- system.file("extdata", "snon_meta.txt", package = "DEbPeak")
    # gmt.file <- system.file("extdata", "m5.go.bp.v2022.1.Mm.entrez.gmt", package = "DEbPeak")
    # ConductDESeq2(count.matrix.file = count.file, meta.file = meta.file, group.key = "condition",
    #               count.type = "htseq-count", ref.group = "WT", signif = "pvalue", l2fc.threshold = 0.3, gmt.file = gmt.file)
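
The `gene.sets` argument expects two columns, `gs_name` and `entrez_gene`, while `gmt.file` points to a GMT file. The following base R sketch shows one way the two layouts relate, assuming the standard GMT layout (tab-separated: set name, description, then gene identifiers). The helper `read_gmt_as_gene_sets` and the toy file are hypothetical and are not part of the package; whether `ConductDESeq2` performs a similar conversion internally is not stated here.

```r
# Hypothetical helper: convert a GMT file into a long two-column data frame.
read_gmt_as_gene_sets <- function(path) {
  lines <- readLines(path)
  lines <- lines[nzchar(lines)]  # drop empty lines
  fields <- strsplit(lines, "\t", fixed = TRUE)
  per.set <- lapply(fields, function(f) {
    genes <- f[-(1:2)]  # field 1 is the set name, field 2 the description
    data.frame(
      gs_name = rep(f[1], length(genes)),
      entrez_gene = genes,  # kept as character
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, per.set)
}

# Toy GMT file with two made-up gene sets.
gmt.path <- tempfile(fileext = ".gmt")
writeLines(
  c("SET_A\tdescription\t101\t102\t103",
    "SET_B\tdescription\t201\t202"),
  gmt.path
)

gene.sets <- read_gmt_as_gene_sets(gmt.path)
print(gene.sets)
```

Each row of the result pairs one gene set name with one gene identifier, so a set with three genes contributes three rows.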