PeakMatrix.Rd
Prepare Count Matrix and Sample Metadata for Peak-related data.
PeakMatrix(
  meta.file,
  count.matrix = FALSE,
  min.overlap = 2,
  summits = 200,
  use.summarizeOverlaps = TRUE,
  filter = 1,
  blacklist = TRUE,
  sub.control = TRUE,
  used.cols = c("SampleID", "Condition"),
  out.folder = NULL,
  species = c("Human", "Mouse", "Rat", "Fly", "Arabidopsis", "Yeast", "Zebrafish", "Worm", "Bovine", "Pig", "Chicken", "Rhesus", "Canine", "Xenopus", "Anopheles", "Chimp", "E coli strain Sakai", "Myxococcus xanthus DK 1622"),
  seq.style = c("UCSC", "NCBI", "Ensembl", "None"),
  gtf.file = NULL,
  up.dist = 3000,
  down.dist = 3000,
  ...
)
meta.file | Sample metadata file containing peak-related information (e.g. sample, peakPath, bamPath, condition), or a peak count matrix file. Should be tab-separated. |
---|---|
count.matrix | Logical value, whether the |
min.overlap | Only include peaks in at least this many peaksets in the main binding matrix. Default: 2.
Parameter of |
summits | If the value is greater than zero, all consensus peaks will be re-centered around a consensus summit,
with the value of summits indicating how many base pairs to include upstream and downstream of the
summit (so all consensus peaks will be of the same width, namely |
use.summarizeOverlaps | Logical value, indicating that |
filter | Filter intervals with low read counts based on RPKM values. Default: 1.
Parameter of |
blacklist | Species-specific abnormal regions to be removed. Choose from "DBA_BLACKLIST_HG19", "DBA_BLACKLIST_HG38", "DBA_BLACKLIST_GRCH37",
"DBA_BLACKLIST_GRCH38", "DBA_BLACKLIST_MM9", "DBA_BLACKLIST_MM10", "DBA_BLACKLIST_CE10", "DBA_BLACKLIST_CE11", "DBA_BLACKLIST_DM3", "DBA_BLACKLIST_DM6",
TRUE (auto-detect the genome), or a GRanges object containing the blacklisted regions. Default: TRUE.
Parameter of |
sub.control | Logical value, whether Control read counts are subtracted for each site in each sample.
Default: TRUE. Parameter of |
used.cols | Columns used to create the sample metadata. If specified, SampleID should be placed first.
Default: c("SampleID", "Condition"). Used when |
out.folder | Output folder to save created count matrix and sample metadata. Default: NULL (current working directory). |
species | Species used, chosen from "Human", "Mouse", "Rat", "Fly", "Arabidopsis", "Yeast", "Zebrafish", "Worm", "Bovine", "Pig", "Chicken", "Rhesus", "Canine", "Xenopus", "Anopheles", "Chimp", "E coli strain Sakai", "Myxococcus xanthus DK 1622". Default: "Human". |
seq.style | The sequence naming style, chosen from UCSC, NCBI, Ensembl, None. This should be compatible with the genome and GTF file used to generate the count matrix and peak files. Default: "UCSC". |
gtf.file | GTF file used to create the TxDb object. Useful when the species you used is not available in |
up.dist | The upstream distance from the TSS. Default: 3000bp. |
down.dist | The downstream distance from the TSS. Default: 3000bp. |
... | Parameters for |
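
To illustrate how up.dist and down.dist can be read, here is a minimal sketch in base R of a strand-aware window around a TSS. The helper `tss_window` is hypothetical and is not part of DEbPeak; it assumes the common convention that "upstream" means lower coordinates on the plus strand and higher coordinates on the minus strand, which may differ from what the annotation step actually does internally.

```r
# Hypothetical helper (not a DEbPeak function): strand-aware window around a TSS.
# Assumption: upstream = lower coordinates on "+", higher coordinates on "-".
tss_window <- function(tss, strand, up.dist = 3000, down.dist = 3000) {
  is_plus <- strand == "+"
  start <- ifelse(is_plus, tss - up.dist, tss - down.dist)
  end <- ifelse(is_plus, tss + down.dist, tss + up.dist)
  # Clamp to 1 so the window never starts before the chromosome begins.
  data.frame(tss = tss, strand = strand, start = pmax(start, 1), end = end)
}

# Two made-up TSS positions, one per strand.
windows <- tss_window(
  tss = c(10000, 50000),
  strand = c("+", "-"),
  up.dist = 3000,
  down.dist = 1000
)
print(windows)
```

With asymmetric distances the window is mirrored between strands, which is why the two values are kept as separate arguments.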
A data frame containing the count matrix, peak annotation and sample metadata (if used.cols is provided).
The corresponding results are also saved to the consensus_peak_matrix.txt, consensus_peak_anno.txt and peak_metadata.txt files.
# library(DEbPeak)
# library(DESeq2)
# metadata file contains peak and bam information
# be aware of the PeakCaller type (it determines the score column)
# meta.file = 'path/to/metadata'
# PeakMatrix(meta.file = meta.file, species = "Human", seq.style = "UCSC",
#            up.dist = 20000, down.dist = 20000)
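
The tab-separated metadata file expected by meta.file could be prepared along the following lines. This is only a sketch in base R: the sample names and paths are placeholders, and the column names other than SampleID and Condition (bamReads, Peaks, PeakCaller) are an assumption modelled on the DiffBind sample sheet convention, so check them against the columns your installation actually requires.

```r
# Placeholder sample names; replace with your own.
sample_ids <- c("ctrl_1", "ctrl_2", "treat_1", "treat_2")

# Assumed layout: SampleID first, then Condition, then peak/bam information.
# bamReads, Peaks and PeakCaller are assumed column names, not confirmed by this page.
meta <- data.frame(
  SampleID = sample_ids,
  Condition = rep(c("Control", "Treatment"), each = 2),
  bamReads = file.path("path/to/bam", paste0(sample_ids, ".bam")),
  Peaks = file.path("path/to/peaks", paste0(sample_ids, "_peaks.narrowPeak")),
  PeakCaller = "narrow",
  stringsAsFactors = FALSE
)

# Write a tab-separated file, as required for meta.file.
meta.file <- file.path(tempdir(), "peak_metadata_input.txt")
write.table(meta, file = meta.file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm the layout before passing it on.
check <- read.delim(meta.file, stringsAsFactors = FALSE)
print(check)
```

The resulting path in `meta.file` can then be passed to PeakMatrix as in the example above.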