
Download BAM files.

Usage

DownloadBam(
  gsm.df,
  out.folder = NULL,
  download.method = c("prefetch", "download.file", "ascp"),
  bam.type = c("10x", "other"),
  prefetch.path = NULL,
  prefetch.paras = "-X 100G",
  samdump.path = NULL,
  samdump.paras = "",
  quiet = FALSE,
  timeout = 3600,
  ascp.path = NULL,
  max.rate = "300m",
  rename = TRUE,
  parallel = TRUE,
  use.cores = NULL
)

Arguments

gsm.df

Data frame containing GSM and Run numbers, obtained from ExtractRun.

out.folder

Output folder. Default: NULL (current working directory).

download.method

Method to download sra files, chosen from "prefetch", "download.file", "ascp". Default: "prefetch".

bam.type

The source of the BAM files to download, chosen from "10x" (e.g. CellRanger) or "other". Used when download.method is "prefetch". Default: "10x".

prefetch.path

Path to prefetch. Default: NULL (detected automatically).

prefetch.paras

Parameters for prefetch. This should not contain --type or -T values. Default: "-X 100G".
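
Because the type option is reserved, it can help to check a custom parameter string before passing it on. The sketch below is only an illustration: check_prefetch_paras is a hypothetical helper that is not part of the package, and it simply looks for --type or -T tokens in the string.

```r
# Hypothetical helper (not part of the package): reject parameter strings
# that set the prefetch type option, which the documentation reserves.
check_prefetch_paras <- function(paras) {
  tokens <- strsplit(trimws(paras), "\\s+")[[1]]
  # Flag "--type", "--type=..." and "-T", "-T..." style tokens
  bad <- grepl("^(--type(=.*)?|-T.*)$", tokens)
  if (any(bad)) {
    stop(
      "prefetch.paras must not contain --type or -T: ",
      paste(tokens[bad], collapse = " ")
    )
  }
  invisible(paras)
}

check_prefetch_paras("-X 100G") # passes silently
```

A string such as "-X 100G -T sra" would stop with an error under this check.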

samdump.path

Path to sam-dump, used when bam.type is "other". Default: NULL (detected automatically).

samdump.paras

Parameters for sam-dump. Default: "".

quiet

Logical value, whether to suppress the download progress output. Used when download.method is "download.file". Default: FALSE (show progress).

timeout

Maximum request time, in seconds. Used when download.method is "download.file". Default: 3600.

ascp.path

Path to ascp (/path/bin/ascp). Please ensure that the asperaweb_id_dsa.openssh file is present at the corresponding relative path (/path/bin/ascp/../etc/asperaweb_id_dsa.openssh). Default: NULL (detected automatically).

max.rate

Maximum transfer rate. Used when download.method is "ascp". Default: "300m".

rename

Logical value, whether to rename the downloaded SRA files. Recommended when download.method is "ascp". Default: TRUE.

parallel

Logical value, whether to download in parallel. Used when download.method is "ascp" or "download.file". Default: TRUE.

use.cores

The number of cores used. Used when download.method is "ascp" or "download.file". Default: NULL (the minimum value of nrow(gsm.df) and parallel::detectCores()).
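
The documented default can be reproduced by hand, for instance to see how many workers a given gsm.df would use. This is a minimal sketch, not the package's internal code: the toy data frame, its column names, and the accession values are hypothetical, and the NA fallback is an added precaution.

```r
# Toy stand-in for the ExtractRun output; column names and accessions
# are hypothetical placeholders.
gsm.df <- data.frame(
  gsm_name = c("GSM0000001", "GSM0000002", "GSM0000003"),
  run = c("SRR0000001", "SRR0000002", "SRR0000003")
)

available <- parallel::detectCores()
# detectCores() can return NA on some platforms; fall back to one core
if (is.na(available)) available <- 1L

# Documented default: the smaller of the number of rows and detected cores
use.cores <- min(nrow(gsm.df), available)
use.cores
```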

Value

Data frame containing the failed runs, or NULL.
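
Since the result is either a data frame or NULL, callers usually need to branch on it. The sketch below shows one way to do that; report_failed is a hypothetical helper, not part of the package, and the suggestion that failed runs can be passed back as gsm.df is an assumption based on the shape of the return value.

```r
# Hypothetical helper: summarise the value returned by DownloadBam().
report_failed <- function(res) {
  if (is.null(res) || nrow(res) == 0) {
    message("No failed runs.")
    return(invisible(0L))
  }
  # Assumption: the failed runs keep the layout of gsm.df, so they could
  # be supplied to another DownloadBam() call for a retry.
  message(nrow(res), " run(s) failed.")
  invisible(nrow(res))
}

report_failed(NULL)
```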

Examples

if (FALSE) {
GSE138266.runs <- ExtractRun(acce = "GSE138266", platform = "GPL18573")
# prefetch
GSE138266.down <- DownloadBam(
  gsm.df = GSE138266.runs, bam.type = "10x",
  prefetch.path = "/path/to/prefetch", out.folder = "/path/to/output"
)
# download.file
GSE138266.down <- DownloadBam(
  gsm.df = GSE138266.runs, download.method = "download.file",
  timeout = 3600, out.folder = "/path/to/output",
  parallel = TRUE, use.cores = 2
)
# ascp
GSE138266.down <- DownloadBam(
  gsm.df = GSE138266.runs, download.method = "ascp",
  ascp.path = "/path/to/ascp", max.rate = "300m",
  rename = TRUE, out.folder = "/path/to/output",
  parallel = TRUE, use.cores = 2
)
}