Prepare Count Matrix and Sample Metadata for Peak-related data.

PeakMatrix(
  meta.file,
  count.matrix = FALSE,
  min.overlap = 2,
  summits = 200,
  use.summarizeOverlaps = TRUE,
  filter = 1,
  blacklist = TRUE,
  sub.control = TRUE,
  used.cols = c("SampleID", "Condition"),
  out.folder = NULL,
  species = c("Human", "Mouse", "Rat", "Fly", "Arabidopsis", "Yeast", "Zebrafish",
    "Worm", "Bovine", "Pig", "Chicken", "Rhesus", "Canine", "Xenopus", "Anopheles",
    "Chimp", "E coli strain Sakai", "Myxococcus xanthus DK 1622"),
  seq.style = c("UCSC", "NCBI", "Ensembl", "None"),
  gtf.file = NULL,
  up.dist = 3000,
  down.dist = 3000,
  ...
)

Arguments

meta.file	Sample metadata file containing peak-related information (e.g., sample, peakPath, bamPath, condition), or a peak count matrix file. Must be tab-separated.
count.matrix	Logical value, whether the `meta.file` is a count matrix file. Default: FALSE.
min.overlap	Only include peaks that are present in at least this many peaksets in the main binding matrix. Default: 2. Parameter of `dba`. Used when `count.matrix` is FALSE.
summits	If the value is greater than zero, all consensus peaks will be re-centered around a consensus summit, with the value of summits indicating how many base pairs to include upstream and downstream of the summit (so all consensus peaks will be of the same width, namely `2 * summits + 1`). Default: 200. Parameter of `dba.count`. Used when `count.matrix` is FALSE.
use.summarizeOverlaps	Logical value, indicating that `summarizeOverlaps` should be used for counting instead of the built-in counting code. Default: TRUE. Parameter of `dba.count`. Used when `count.matrix` is FALSE.
filter	Filter intervals with low read counts based on RPKM values. Default: 1. Parameter of `dba.count`. Used when `count.matrix` is FALSE.
blacklist	Species-specific abnormal regions to be removed. Choose from "DBA_BLACKLIST_HG19", "DBA_BLACKLIST_HG38", "DBA_BLACKLIST_GRCH37", "DBA_BLACKLIST_GRCH38", "DBA_BLACKLIST_MM9", "DBA_BLACKLIST_MM10", "DBA_BLACKLIST_CE10", "DBA_BLACKLIST_CE11", "DBA_BLACKLIST_DM3", "DBA_BLACKLIST_DM6", TRUE (auto-detect the genome), or a GRanges object containing the blacklisted regions. Default: TRUE. Parameter of `dba.blacklist`. Used when `count.matrix` is FALSE.
sub.control	Logical value, whether Control read counts are subtracted for each site in each sample. Default: TRUE. Parameter of `dba.count`. Used when `count.matrix` is FALSE.
used.cols	Columns used to create the sample metadata. If specified, SampleID should be placed first. Default: c("SampleID", "Condition"). Used when `count.matrix` is FALSE.
out.folder	Output folder to save created count matrix and sample metadata. Default: NULL (current working directory).
species	Species used, chosen from "Human", "Mouse", "Rat", "Fly", "Arabidopsis", "Yeast", "Zebrafish", "Worm", "Bovine", "Pig", "Chicken", "Rhesus", "Canine", "Xenopus", "Anopheles", "Chimp", "E coli strain Sakai", "Myxococcus xanthus DK 1622". Default: "Human".
seq.style	The sequence naming style, chosen from "UCSC", "NCBI", "Ensembl", "None". This should be compatible with the genome and GTF file used to generate the count matrix and peak files. Default: "UCSC".
gtf.file	GTF file used to create the TxDb object. Useful when the species you used is not available in `species`. Default: NULL.
up.dist	The upstream distance from the TSS. Default: 3000bp.
down.dist	The downstream distance from the TSS. Default: 3000bp.
...	Parameters for `annotatePeak`.
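
The fixed peak width implied by `summits` can be checked with a little base R arithmetic. This is only an illustrative sketch: `summit_pos` is a made-up coordinate, and the snippet does not call `dba.count` or reproduce how it locates the consensus summit.

```r
# Illustrative only: width of a consensus peak re-centered on its summit.
summits <- 200        # default value of the `summits` argument
summit_pos <- 10500   # hypothetical summit coordinate (assumption)

# Extend `summits` base pairs upstream and downstream of the summit.
peak_start <- summit_pos - summits
peak_end <- summit_pos + summits

# The interval includes the summit itself, hence 2 * summits + 1.
peak_width <- peak_end - peak_start + 1
peak_width
```

With the default `summits = 200`, every re-centered consensus peak is therefore 401 bp wide.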

Value

A data frame containing the count matrix, peak annotation, and sample metadata (if `used.cols` is provided). The corresponding results are also saved to the files consensus_peak_matrix.txt, consensus_peak_anno.txt, and peak_metadata.txt.

Examples

# library(DEbPeak)
# library(DESeq2)
# metadata file contains peak and bam information
# be aware of the PeakCaller type (it determines the score column)
# meta.file = 'path/to/metadata'
# PeakMatrix(meta.file = meta.file, species = "Human",  seq.style = "UCSC",
#            up.dist = 20000, down.dist = 20000)
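
A tab-separated metadata file of the kind described for `meta.file` could be prepared as sketched below. The column names follow the DiffBind sample sheet convention (SampleID, Condition, Replicate, bamReads, Peaks, PeakCaller), which is an assumption here; check which columns your setup requires. All sample names and paths are placeholders.

```r
# A minimal sketch, assuming DiffBind-style sample sheet columns.
# Sample names and file paths below are placeholders, not real files.
samples <- c("ctrl_1", "ctrl_2", "treat_1", "treat_2")

meta <- data.frame(
  SampleID = samples,   # SampleID is placed first, as `used.cols` expects
  Condition = c("Control", "Control", "Treatment", "Treatment"),
  Replicate = c(1L, 2L, 1L, 2L),
  bamReads = file.path("path/to/bam", paste0(samples, ".bam")),
  Peaks = file.path("path/to/peaks", paste0(samples, "_peaks.narrowPeak")),
  PeakCaller = "narrow",   # determines which column is read as the score
  stringsAsFactors = FALSE
)

# Write the sheet as a tab-separated file (the format `meta.file` requires).
meta.file <- file.path(tempdir(), "peak_metadata_example.txt")
write.table(meta, file = meta.file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to confirm the layout before passing it on.
check <- read.delim(meta.file, stringsAsFactors = FALSE)
colnames(check)
```

The resulting path can then be supplied as `meta.file` in a call like the one above, once the placeholder paths are replaced with real BAM and peak files.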