ConductDESeq2.Rd
Conduct Differential Analysis with DESeq2.
ConductDESeq2(
  counts.folder,
  count.matrix.file = NULL,
  meta.file,
  group.key = NULL,
  count.type = c("htseq-count", "featurecounts"),
  min.count = 10,
  ref.group = NULL,
  out.folder = NULL,
  data.type = c("RNA", "ChIP", "ATAC"),
  peak.anno.key = c("Promoter", "5' UTR", "3' UTR", "Exon", "Intron", "Downstream", "Distal Intergenic", "All"),
  qc.ndepth = 10,
  transform.method = c("rlog", "vst", "ntd"),
  var.genes = NULL,
  batch = NULL,
  outlier.detection = T,
  rpca.method = c("PcaGrid", "PcaHubert"),
  k = 2,
  pca.x = "PC1",
  pca.y = "PC2",
  pca.z = "PC3",
  loding.pc = 1:5,
  loading.gene.num = 10,
  loading.ncol = 2,
  enrich.loading.pc = 1:5,
  enrich.loading.gene = 200,
  gene.type = c("ENSEMBL", "ENTREZID", "SYMBOL"),
  enrich.type = c("ALL", "GO", "KEGG"),
  go.type = c("ALL", "BP", "MF", "CC"),
  enrich.pvalue = 0.05,
  enrich.qvalue = 0.05,
  org.db = "org.Mm.eg.db",
  organism = "mmu",
  padj.method = c("BH", "holm", "hochberg", "hommel", "bonferroni", "BY", "fdr", "none"),
  show.term = 15,
  str.width = 30,
  signif = "padj",
  signif.threshold = 0.05,
  l2fc.threshold = 1,
  gene.map = NULL,
  gtf.file = NULL,
  norm.type = c("DESeq2", "TMM", "CPM", "RPKM", "TPM"),
  log.counts = TRUE,
  deg.label.df = NULL,
  deg.label.key = NULL,
  deg.label.num = 2,
  deg.label.color = NULL,
  fe.gene.key = NULL,
  gmt.file,
  gene.sets = NULL,
  minGSSize = 10,
  maxGSSize = 500,
  gsea.pvalue = 0.05
)
counts.folder | Folder containing the count files of all samples. Each count file should be named SampleName.txt. |
---|---|
count.matrix.file | File containing the count matrix. If provided, it is used instead of |
meta.file | File containing sample metadata. |
group.key | Column in |
count.type | The source of count file, chosen from htseq-count, featurecounts. Default: htseq-count. |
min.count | A feature is considered to be detected if the corresponding number of read counts is > |
ref.group | Reference group name. When NULL, the first element of the groups is used. Default: NULL. |
out.folder | Folder to save enrichment results. Default: working directory. |
data.type | Input data type, choose from RNA, ChIP, ATAC. Default: RNA. |
peak.anno.key | Peak location, chosen from "Promoter", "5' UTR", "3' UTR", "Exon", "Intron", "Downstream", "Distal Intergenic", "All". Default: "Promoter". |
qc.ndepth | Number of different sequencing depths to be simulated and plotted apart from the real depth. Default: 10. This parameter is only used by type "saturation". |
transform.method | Data transformation methods, chosen from rlog, vst and ntd. Default: rlog. |
var.genes | Select genes with larger variance for PCA analysis. Default: all genes. |
batch | Batch column used for batch correction. Default: NULL (no batch correction). |
outlier.detection | Logical value. If TRUE, conduct outlier detection with robust PCA. Default: TRUE. |
rpca.method | Robust PCA method, chosen from |
k | Number of principal components to compute, for |
pca.x | The principal component to display on the x axis. Default: PC1. |
pca.y | The principal component to display on the y axis. Default: PC2. |
pca.z | The principal component to display on the z axis. Default: PC3. |
loding.pc | PCs for which to create loading plots. Default: 1:5. |
loading.gene.num | Number of genes per PC to show in the loading plot. Default: 10. |
loading.ncol | Number of columns in the loading bar plot or heatmap. Default: 2. |
enrich.loading.pc | PCs on which to conduct enrichment analysis. Default: 1:5. |
enrich.loading.gene | Number of genes per PC to use for enrichment analysis. Default: 200. |
gene.type | Gene name type, chosen from ENSEMBL, ENTREZID, SYMBOL. Default: ENSEMBL. |
enrich.type | Enrichment type, chosen from ALL, GO, KEGG. Default: ALL. |
go.type | GO enrichment type, chosen from ALL, BP, MF, CC. Default: ALL. |
enrich.pvalue | Cutoff value of pvalue. Default: 0.05. |
enrich.qvalue | Cutoff value of qvalue. Default: 0.05. |
org.db | Organism database. Default: org.Mm.eg.db. |
organism | KEGG organism code. Supported organisms are listed in 'http://www.genome.jp/kegg/catalog/org_list.html'. Default: mmu. |
padj.method | One of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none". Default: BH. |
show.term | Number of enrichment terms to show. Default: 15. |
str.width | Width of enrichment term labels in the plot. Default: 30. |
signif | Significance criterion. For DESeq2 results, can be chosen from padj, pvalue. For edgeR results, can be chosen from FDR, PValue. Default: padj. |
signif.threshold | Significance threshold to get differentially expressed genes or accessible/binding peaks. Default: 0.05. |
l2fc.threshold | Log2 fold change threshold to get differentially expressed genes or accessible/binding peaks. Default: 1. |
gene.map | Use data frame instead of |
gtf.file | Gene annotation file used to get gene length, used if |
norm.type | Normalization method, chosen from DESeq2, TMM, CPM, RPKM, TPM. Default: DESeq2. |
log.counts | Logical value, if TRUE, export log2(normalized.counts + 1), else export normalized.counts. Default: TRUE. |
deg.label.df | Label data frame, at least contains Gene column. Default: NULL. When set NULL, use |
deg.label.key | Which column to use as label. Default: NULL (use Gene column of |
deg.label.num | Gene number to label, choose according to log2FoldChange. When |
deg.label.color | Color vector for labels. Default: NULL. |
fe.gene.key | Column name to conduct enrichment analysis. Default: NULL. |
gmt.file | Gene set file in Gene Matrix Transposed (GMT) format. |
gene.sets | Gene sets information, containing two columns: gs_name, entrez_gene. Default: NULL. |
minGSSize | Minimal size of each geneSet for analyzing. Default: 10. |
maxGSSize | Maximal size of genes annotated for testing. Default: 500. |
gsea.pvalue | Cutoff value of pvalue. Default: 0.05. |
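
The `signif`, `signif.threshold` and `l2fc.threshold` arguments together decide which features are reported as differential. The following is a minimal base R sketch of such a filter on a made-up results table. The column names follow the output of DESeq2's `results()`; the `regulation` column, its labels, and the exact comparison operators are assumptions for illustration and may differ from what `ConductDESeq2` does internally.

```r
# Hypothetical results table; values are invented for illustration.
res <- data.frame(
  Gene = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(2.1, -1.5, 0.4, -3.0),
  pvalue = c(0.001, 0.004, 0.020, 0.200),
  padj = c(0.004, 0.008, 0.027, 0.200),
  stringsAsFactors = FALSE
)

# Same names as the function arguments, set to their documented defaults.
signif <- "padj"
signif.threshold <- 0.05
l2fc.threshold <- 1

# Assumed rule: the chosen significance column is below the threshold
# AND the absolute log2 fold change reaches l2fc.threshold.
is.sig <- res[[signif]] < signif.threshold
res$regulation <- ifelse(
  is.sig & res$log2FoldChange >= l2fc.threshold, "Up",
  ifelse(is.sig & res$log2FoldChange <= -l2fc.threshold, "Down", "NotSig")
)

print(res[, c("Gene", "log2FoldChange", "padj", "regulation")])
```

Switching `signif` to `"pvalue"` applies the same threshold to the unadjusted p-values instead, which is what the example call in this page does with `signif = "pvalue"`.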
# library(DESeq2)
# library(DEbPeak)
# count.file <- system.file("extdata", "snon_count.txt", package = "DEbPeak")
# meta.file <- system.file("extdata", "snon_meta.txt", package = "DEbPeak")
# gmt.file <- system.file("extdata", "m5.go.bp.v2022.1.Mm.entrez.gmt", package = "DEbPeak")
# ConductDESeq2(count.matrix.file = count.file, meta.file = meta.file, group.key = "condition",
#               count.type = "htseq-count", ref.group = "WT", signif = "pvalue", l2fc.threshold = 0.3, gmt.file = gmt.file)
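
The example above passes `ref.group = "WT"` together with `group.key = "condition"`. In DESeq2 the first level of the design factor acts as the reference, and R sorts factor levels alphabetically by default. The base R sketch below shows that behaviour on a made-up metadata table; the sample names and the `KO` group are hypothetical, and using `relevel()` here is an assumption about how a reference group can be set, not a description of the internals of `ConductDESeq2`.

```r
# Hypothetical metadata; "condition" mirrors group.key in the example above.
meta <- data.frame(
  condition = c("KO", "KO", "WT", "WT"),
  row.names = c("KO1", "KO2", "WT1", "WT2"),
  stringsAsFactors = FALSE
)

# Levels are sorted alphabetically, so "KO" would be the reference by default.
group <- factor(meta$condition)
default.levels <- levels(group)

# Moving "WT" to the first level makes it the reference group.
group <- relevel(group, ref = "WT")

print(default.levels)
print(levels(group))
```

With `WT` as the first level, a positive log2 fold change means higher counts in `KO` than in `WT`.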