Download FASTQ Files.
Usage
DownloadFastq(
  gsm.df,
  out.folder = NULL,
  download.method = c("download.file", "ascp"),
  quiet = FALSE,
  timeout = 3600,
  ascp.path = NULL,
  max.rate = "300m",
  parallel = TRUE,
  use.cores = NULL,
  format.10x = TRUE,
  remove.raw = FALSE
)
Arguments
- gsm.df
Data frame containing GSM and Run numbers, obtained from ExtractRun.
- out.folder
Output folder. Default: NULL (current working directory).
- download.method
Method to download fastq files, chosen from "download.file" and "ascp". Default: "download.file".
- quiet
Logical value, whether to suppress download progress output. Used when download.method is "download.file". Default: FALSE (progress is shown).
- timeout
Maximum request time. Used when download.method is "download.file". Default: 3600.
- ascp.path
Path to ascp (/path/bin/ascp). Please ensure that the asperaweb_id_dsa.openssh file is at the expected relative path (/path/bin/ascp/../etc/asperaweb_id_dsa.openssh). Default: NULL (automatic detection).
- max.rate
Maximum transfer rate. Used when download.method is "ascp". Default: "300m".
- parallel
Logical value, whether to download in parallel. Default: TRUE.
- use.cores
The number of cores used. Default: NULL (the minimum of nrow(gsm.df) and parallel::detectCores()).
- format.10x
Logical value, whether to format split fastqs to 10x standard format. Default: TRUE.
- remove.raw
Logical value, whether to remove the original (unformatted) split fastqs. Used when format.10x is TRUE. Default: FALSE.
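The documented default for use.cores can be illustrated with a short sketch. The helper resolve_cores below is hypothetical and not part of the package; it only mirrors the rule stated above (the smaller of the number of rows in gsm.df and the detected core count), and the toy data frame stands in for real ExtractRun output.

```r
# Hypothetical helper (not part of the package): mirrors the documented
# default for use.cores, i.e. min(nrow(gsm.df), parallel::detectCores()).
resolve_cores <- function(gsm.df, use.cores = NULL) {
  if (is.null(use.cores)) {
    detected <- parallel::detectCores()
    # detectCores() may return NA on some platforms; fall back to one core.
    if (is.na(detected)) detected <- 1L
    use.cores <- min(nrow(gsm.df), detected)
  }
  as.integer(use.cores)
}

# Toy data frame standing in for ExtractRun output; only the run column is used here.
toy.runs <- data.frame(run = c("SRR9004325", "SRR9004326"))

default.cores <- resolve_cores(toy.runs)                 # at most 2, one per run
explicit.cores <- resolve_cores(toy.runs, use.cores = 1) # user-supplied value wins
```

With two runs, the default never exceeds two cores, so requesting more workers than runs brings no benefit.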
Examples
GSE130636.runs <- ExtractRun(acce = "GSE130636", platform = "GPL20301")
#> Extract all GSM with acce: GSE130636 and platform: GPL20301
#> Found 1 file(s)
#> GSE130636_series_matrix.txt.gz
#> Rows: 0 Columns: 7
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (7): ID_REF, GSM3745992, GSM3745993, GSM3745994, GSM3745995, GSM3745996,...
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> File stored at:
#> /var/folders/_4/k4qmvf7s2gx_6789px8n_sxh0000gn/T//Rtmp1TqNsD/GPL20301.soft
#> 6 GSMs to process
# a small test
GSE130636.runs <- GSE130636.runs[GSE130636.runs$run %in% c("SRR9004325", "SRR9004326"), ]
# use download.file
download.file.res <- DownloadFastq(
  gsm.df = GSE130636.runs, out.folder = "/path/to/output",
  download.method = "download.file", parallel = TRUE, use.cores = 2
)
# use ascp
ascp.res <- DownloadFastq(
  gsm.df = GSE130636.runs, out.folder = "/home/songyabing/data/projects/tmp/GEfetch2R",
  download.method = "ascp", ascp.path = "~/.aspera/connect/bin/ascp", parallel = TRUE, use.cores = 2
)
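For the download.file method, base R's download.file() honours the session-wide timeout option, which is measured in seconds and defaults to 60. Whether DownloadFastq applies its timeout argument through this option is an assumption here, not something the documentation states. The sketch below uses base R only and shows how such a limit can be raised and then restored by hand.

```r
# Base R only; this does not call DownloadFastq.
# Assumption: the timeout argument corresponds to R's "timeout" option.
old.opts <- options(timeout = 3600)   # raise the limit; options() returns the previous value
current.timeout <- getOption("timeout")
# ... a long-running download.file() call would go here ...
options(old.opts)                     # restore the previous setting
```

Restoring the previous value keeps the longer limit from leaking into unrelated downloads in the same session.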