DownloadObjects
2023-07-23
Introduction
scfetch provides functions for users to download processed single-cell RNA-seq data from Zenodo, CELLxGENE and Human Cell Atlas, including RDS, RData, h5ad, h5 and loom objects.
The public resources currently supported, and the results returned for each, are:
Resources | URL | Download Type | Returned results |
---|---|---|---|
Zenodo | https://zenodo.org/ | count matrix, rds, rdata, h5ad, etc. | NULL or failed datasets |
CELLxGENE | https://cellxgene.cziscience.com/ | rds, h5ad | NULL or failed datasets |
Human Cell Atlas | https://www.humancellatlas.org/ | rds, rdata, h5, h5ad, loom | NULL or failed projects |
Zenodo
Zenodo contains various types of processed objects, such as a SeuratObject that has been clustered and annotated, or an AnnData object that contains processed results generated by scanpy.
Extract metadata
scfetch provides ExtractZenodoMeta to extract dataset metadata, including the dataset title, description, available files and their corresponding md5 checksums. Please note that when access to a dataset is restricted, the returned dataframe will be empty.
# library
library(scfetch)
## Setting options('download.file.method.GEOquery'='auto')
## Setting options('GEOquery.inmemory.gpl'=FALSE)
## Registered S3 method overwritten by 'SeuratDisk':
## method from
## as.sparse.H5Group Seurat
# single doi
zebrafish.df <- ExtractZenodoMeta(doi = "10.5281/zenodo.7243603")
head(zebrafish.df)
## title
## 1 zebrafish scRNA data set objects
## 2 zebrafish scRNA data set objects
## description
## 1 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Farrel et al. 2018, Wagner et al. 2018 and Qiu et al. 2022.</p>
## 2 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Farrel et al. 2018, Wagner et al. 2018 and Qiu et al. 2022.</p>
## url
## 1 https://zenodo.org/api/files/c343bdeb-0ed8-4d34-bcb2-7a3fe53a27ef/zebrafish_data.h5ad
## 2 https://zenodo.org/api/files/c343bdeb-0ed8-4d34-bcb2-7a3fe53a27ef/zebrafish_data.RData
## filename md5 license
## 1 zebrafish_data.h5ad 124f2229128918b411a7dc7931558f97 CC-BY-4.0
## 2 zebrafish_data.RData a08c3ebd285b370fcf34cf2f8f9bdb59 CC-BY-4.0
# vector dois
multi.dois <- ExtractZenodoMeta(doi = c("1111", "10.5281/zenodo.7243603", "10.5281/zenodo.7244441"))
## 1111 are not valid dois, please check!
head(multi.dois)
## title
## 1 zebrafish scRNA data set objects
## 2 zebrafish scRNA data set objects
## 3 frog scRNA data set objects
## 4 frog scRNA data set objects
## description
## 1 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Farrel et al. 2018, Wagner et al. 2018 and Qiu et al. 2022.</p>
## 2 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Farrel et al. 2018, Wagner et al. 2018 and Qiu et al. 2022.</p>
## 3 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Briggs et al. 2018 and Qiu et al. 2022.</p>
## 4 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Briggs et al. 2018 and Qiu et al. 2022.</p>
## url
## 1 https://zenodo.org/api/files/c343bdeb-0ed8-4d34-bcb2-7a3fe53a27ef/zebrafish_data.h5ad
## 2 https://zenodo.org/api/files/c343bdeb-0ed8-4d34-bcb2-7a3fe53a27ef/zebrafish_data.RData
## 3 https://zenodo.org/api/files/76d3460d-c2dc-4034-ab94-079db9542fe4/frog_data.h5ad
## 4 https://zenodo.org/api/files/76d3460d-c2dc-4034-ab94-079db9542fe4/frog_data.RData
## filename md5 license
## 1 zebrafish_data.h5ad 124f2229128918b411a7dc7931558f97 CC-BY-4.0
## 2 zebrafish_data.RData a08c3ebd285b370fcf34cf2f8f9bdb59 CC-BY-4.0
## 3 frog_data.h5ad 7be7d6ff024ab2c8579b4d0edb2428e3 CC-BY-4.0
## 4 frog_data.RData c80f46320c0cff9e341bed195f12c3b1 CC-BY-4.0
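The description column shown above still carries HTML tags and newline escapes, which makes it hard to scan. As a minimal sketch (strip_html is a hypothetical helper written for this illustration and is not part of scfetch), the text could be cleaned with base R regular expressions. The toy strings stand in for the real column:

```r
# Toy stand-in for the `description` column returned by ExtractZenodoMeta().
desc <- c(
  "<p>Combined and converted scRNA data.</p>\n\n<p>Please cite Qiu et al. 2022.</p>",
  "<p>Single paragraph.</p>"
)

# strip_html() is a hypothetical helper, not part of scfetch.
strip_html <- function(x) {
  x <- gsub("<[^>]+>", " ", x)        # replace every HTML tag with a space
  x <- gsub("[[:space:]]+", " ", x)   # collapse runs of whitespace, including newlines
  trimws(x)                           # drop leading and trailing spaces
}

clean.desc <- strip_html(desc)
print(clean.desc)
```

With real metadata, the same helper could be applied as zebrafish.df$description <- strip_html(zebrafish.df$description). A regular expression is only adequate for simple markup like this; heavily nested HTML would call for a proper parser.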
Download object
After manually checking the extracted metadata, users can download the specified objects with ParseZenodo. The objects to download are selected with file.ext, and the object formats must be given in lower case (e.g. rds/rdata/h5ad).
The returned result is a dataframe containing the objects that failed to download. If it is not NULL, users can re-run ParseZenodo with doi.df set to the returned result.
multi.dois.parse <- ParseZenodo(
  doi = c("1111", "10.5281/zenodo.7243603", "10.5281/zenodo.7244441"),
  file.ext = c("rdata", "rds"),
  out.folder = "/Users/soyabean/Desktop/tmp/scdown/download_zenodo"
)
The structure of downloaded objects:
tree /Users/soyabean/Desktop/tmp/scdown/download_zenodo
## /Users/soyabean/Desktop/tmp/scdown/download_zenodo
## ├── frog_data.RData
## └── zebrafish_data.RData
##
## 1 directory, 2 files
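Because the extracted metadata includes an md5 value for every file, a download can be checked for corruption afterwards. The following is a minimal sketch, assuming a metadata dataframe with filename and md5 columns as in the ExtractZenodoMeta output. verify_md5 is a hypothetical helper, not part of scfetch, and the demonstration uses a temporary file rather than a real download:

```r
# verify_md5() is a hypothetical helper, not part of scfetch.
# `meta` needs `filename` and `md5` columns, as in the ExtractZenodoMeta() output.
verify_md5 <- function(meta, folder) {
  paths <- file.path(folder, meta$filename)
  observed <- unname(tools::md5sum(paths))  # NA for files that do not exist
  data.frame(
    filename = meta$filename,
    expected = meta$md5,
    observed = observed,
    ok = !is.na(observed) & observed == meta$md5,
    stringsAsFactors = FALSE
  )
}

# Self-contained demonstration with a temporary file instead of a real download.
demo.dir <- tempfile("md5_demo")
dir.create(demo.dir)
writeLines("toy content", file.path(demo.dir, "toy.txt"))
demo.meta <- data.frame(
  filename = c("toy.txt", "missing.txt"),
  md5 = c(unname(tools::md5sum(file.path(demo.dir, "toy.txt"))), "0000"),
  stringsAsFactors = FALSE
)
md5.check <- verify_md5(demo.meta, demo.dir)
print(md5.check[, c("filename", "ok")])
```

For the Zenodo example, meta would be the rows of multi.dois whose files were requested, and folder the out.folder passed to ParseZenodo. Rows with ok equal to FALSE are candidates for another download attempt.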
CELLxGENE
CELLxGENE is a web server that hosts 910 single-cell datasets; users can explore and download them, and upload their own datasets. Datasets downloaded from CELLxGENE come in two formats: h5ad (AnnData v0.8) and rds (Seurat v4).
Show available datasets
scfetch provides ShowCELLxGENEDatasets to extract dataset metadata, including dataset title, description, contact, organism, ethnicity, sex, tissue, disease, assay, suspension type, cell type, etc.
# all available datasets
all.cellxgene.datasets <- ShowCELLxGENEDatasets()
Summary attributes
scfetch provides StatDBAttribute to summarize the attributes of CELLxGENE datasets:
StatDBAttribute(df = all.cellxgene.datasets, filter = c("organism", "sex"), database = "CELLxGENE")
## $organism
## Value Num Key
## 1 homo sapiens 706 organism
## 2 mus musculus 188 organism
## 3 callithrix jacchus 26 organism
## 4 sus scrofa domesticus 3 organism
## 5 macaca mulatta 2 organism
##
## $sex
## Value Num Key
## 1 male 774 sex
## 2 female 557 sex
## 3 unknown 71 sex
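The summary above lists lower-cased single values, while the dataset metadata stores several comma-separated values per dataset (for example "Callithrix jacchus, Homo sapiens, Mus musculus"). This suggests that multi-valued fields are split before counting. The sketch below illustrates that idea in base R on toy data. count_attribute is a hypothetical helper and is not the scfetch implementation, whose internals are not shown here:

```r
# Toy stand-in for two columns of the ShowCELLxGENEDatasets() output.
toy.datasets <- data.frame(
  organism = c(
    "Homo sapiens",
    "Callithrix jacchus, Homo sapiens, Mus musculus",
    "Mus musculus"
  ),
  sex = c("female, male", "male", "unknown"),
  stringsAsFactors = FALSE
)

# count_attribute() is a hypothetical helper, not the scfetch implementation.
count_attribute <- function(df, key) {
  values <- unlist(strsplit(df[[key]], ","))   # one element per listed value
  values <- tolower(trimws(values))            # normalise case and spacing
  counts <- sort(table(values), decreasing = TRUE)
  data.frame(
    Value = names(counts),
    Num = as.integer(counts),
    Key = key,
    stringsAsFactors = FALSE
  )
}

organism.stat <- count_attribute(toy.datasets, "organism")
print(organism.stat)
```

Counting this way explains why the per-value totals can add up to more than the number of datasets: a dataset covering three organisms contributes to three rows.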
Extract metadata
scfetch provides ExtractCELLxGENEMeta to filter dataset metadata. The available values of each attribute, except cell number, can be obtained with StatDBAttribute:
# human 10x v2 and v3 datasets
human.10x.cellxgene.meta <- ExtractCELLxGENEMeta(
  all.samples.df = all.cellxgene.datasets,
  assay = c("10x 3' v2", "10x 3' v3"),
  organism = "Homo sapiens"
)
## Use all self_reported_ethnicity as input!
## Use all sex as input!
## Use all tissue as input!
## Use all disease as input!
## Use all suspension_type as input!
## Use all cell_type as input!
head(human.10x.cellxgene.meta)
## title
## 1 Evolution of cellular diversity in primary motor cortex of human, marmoset monkey, and mouse
## 2 Evolution of cellular diversity in primary motor cortex of human, marmoset monkey, and mouse
## 3 Evolution of cellular diversity in primary motor cortex of human, marmoset monkey, and mouse
## 4 Evolution of cellular diversity in primary motor cortex of human, marmoset monkey, and mouse
## 5 Single-cell transcriptomic atlas for adult human retina
## 6 Single-cell transcriptomic atlas for adult human retina
## description
## 1 High-throughput transcriptomic and epigenomic profiling of over 450,000 single nuclei in human, marmoset monkey, and mouse primary motor cortex (M1)
## 2 High-throughput transcriptomic and epigenomic profiling of over 450,000 single nuclei in human, marmoset monkey, and mouse primary motor cortex (M1)
## 3 High-throughput transcriptomic and epigenomic profiling of over 450,000 single nuclei in human, marmoset monkey, and mouse primary motor cortex (M1)
## 4 High-throughput transcriptomic and epigenomic profiling of over 450,000 single nuclei in human, marmoset monkey, and mouse primary motor cortex (M1)
## 5 The retina is the innermost tissue of the eyes of human and most other vertebrates. It receives the information of the visual images like the film of a camera and then translates the images into neural signals. As the neural signals get transduced to and processed by the brain, the visual perception is created. Different cell types in the retina functions in various ways to accomplish the whole procedure.\n\nIn the human retina, the major cell types could be further classified into subtypes based on their morphology, physiology features or functions, and molecular markers. To date, it is estimated that there are over 70 cell types in the human retina according to a previous primate study. Reliable classification of those neurons is required to manage the complexity of such level. Single-nuclei RNA-seq were carried out to profile well-characterized healthy human retina from six individual donors using the 10x Genomics technologies. Each donor retina was dissected into three geographic regions: the fovea, macula, and peripheral retina and flash frozen afterwards. A fractionation protocol was developed to enrich nuclei from rare neuron cell types, including bipolar cells, amacrine cells, and retinal ganglion cells. In total, over 60 cell types are identified in our dataset, making the currently most comprehensive single-cell profiling of adult human retina.
## 6 The retina is the innermost tissue of the eyes of human and most other vertebrates. It receives the information of the visual images like the film of a camera and then translates the images into neural signals. As the neural signals get transduced to and processed by the brain, the visual perception is created. Different cell types in the retina functions in various ways to accomplish the whole procedure.\n\nIn the human retina, the major cell types could be further classified into subtypes based on their morphology, physiology features or functions, and molecular markers. To date, it is estimated that there are over 70 cell types in the human retina according to a previous primate study. Reliable classification of those neurons is required to manage the complexity of such level. Single-nuclei RNA-seq were carried out to profile well-characterized healthy human retina from six individual donors using the 10x Genomics technologies. Each donor retina was dissected into three geographic regions: the fovea, macula, and peripheral retina and flash frozen afterwards. A fractionation protocol was developed to enrich nuclei from rare neuron cell types, including bipolar cells, amacrine cells, and retinal ganglion cells. In total, over 60 cell types are identified in our dataset, making the currently most comprehensive single-cell profiling of adult human retina.
## contact contact_email collection_created_at
## 1 Ed S. Lein edl@alleninstitute.org 1692673168
## 2 Ed S. Lein edl@alleninstitute.org 1692673168
## 3 Ed S. Lein edl@alleninstitute.org 1692673168
## 4 Ed S. Lein edl@alleninstitute.org 1692673168
## 5 Rui Chen ruichen@bcm.edu 1692663268
## 6 Rui Chen ruichen@bcm.edu 1692663268
## collection_id collection_owner
## 1 367d95c0-0eb0-4dae-8276-9407239421ee google-oauth2|104470766926663023506
## 2 367d95c0-0eb0-4dae-8276-9407239421ee google-oauth2|104470766926663023506
## 3 367d95c0-0eb0-4dae-8276-9407239421ee google-oauth2|104470766926663023506
## 4 367d95c0-0eb0-4dae-8276-9407239421ee google-oauth2|104470766926663023506
## 5 af893e86-8e9f-41f1-a474-ef05359b1fb7 google-oauth2|104470766926663023506
## 6 af893e86-8e9f-41f1-a474-ef05359b1fb7 google-oauth2|104470766926663023506
## collection_visibility assay cell_count
## 1 PUBLIC 10x 3' v3 24213
## 2 PUBLIC 10x 3' v3 29486
## 3 PUBLIC 10x 3' v3 10739
## 4 PUBLIC 10x 3' v3 29050
## 5 PUBLIC 10x 3' v3 18011
## 6 PUBLIC 10x 3' v3 11617
## cell_type
## 1 glutamatergic neuron
## 2 GABAergic neuron
## 3 astrocyte, endothelial cell, leptomeningeal cell, microglial cell, oligodendrocyte, oligodendrocyte precursor cell, pericyte, smooth muscle cell
## 4 glutamatergic neuron
## 5 Mueller cell, astrocyte, microglial cell, retinal pigment epithelial cell
## 6 retinal ganglion cell
## collection_id created_at
## 1 367d95c0-0eb0-4dae-8276-9407239421ee 1692674902
## 2 367d95c0-0eb0-4dae-8276-9407239421ee 1692674901
## 3 367d95c0-0eb0-4dae-8276-9407239421ee 1692674843
## 4 367d95c0-0eb0-4dae-8276-9407239421ee 1692674911
## 5 af893e86-8e9f-41f1-a474-ef05359b1fb7 1692663529
## 6 af893e86-8e9f-41f1-a474-ef05359b1fb7 1692663437
## dataset_assets
## 1 0, 0, 0, 0, 28d23a79-3db8-4ada-8256-33f2c38b704d, 28d23a79-3db8-4ada-8256-33f2c38b704d, 28d23a79-3db8-4ada-8256-33f2c38b704d, 28d23a79-3db8-4ada-8256-33f2c38b704d, raw.h5ad, local.h5ad, , local.rds, RAW_H5AD, H5AD, CXG, RDS, 650868ac-0224-48a6-a254-645918c4e435, a04658c4-5650-462e-ba3b-3465ebad88cf, 6ca97026-bd3f-47fe-9c8d-1c6b86ab87fe, 4b139da9-9d44-4d60-9d61-62b50cf6fb1f, s3://corpora-data-prod/28d23a79-3db8-4ada-8256-33f2c38b704d/raw.h5ad, s3://corpora-data-prod/28d23a79-3db8-4ada-8256-33f2c38b704d/local.h5ad, s3://hosted-cellxgene-prod/28d23a79-3db8-4ada-8256-33f2c38b704d.cxg/, s3://corpora-data-prod/28d23a79-3db8-4ada-8256-33f2c38b704d/local.rds, 0, 0, 0, 0, TRUE, TRUE, TRUE, TRUE
## 2 0, 0, 0, 0, 5da06c8c-536a-4c43-99e5-ca06a703b86a, 5da06c8c-536a-4c43-99e5-ca06a703b86a, 5da06c8c-536a-4c43-99e5-ca06a703b86a, 5da06c8c-536a-4c43-99e5-ca06a703b86a, raw.h5ad, local.h5ad, local.rds, , RAW_H5AD, H5AD, RDS, CXG, 2c9ac929-7dbe-42b9-8889-9af49a0477bf, 2a37533d-0cb5-4f98-8050-9ef24ea89a59, e6a8d7d3-60d3-4c64-bdb1-4c678a69f635, e4befd79-a7d4-43aa-807b-508e189276f8, s3://corpora-data-prod/5da06c8c-536a-4c43-99e5-ca06a703b86a/raw.h5ad, s3://corpora-data-prod/5da06c8c-536a-4c43-99e5-ca06a703b86a/local.h5ad, s3://corpora-data-prod/5da06c8c-536a-4c43-99e5-ca06a703b86a/local.rds, s3://hosted-cellxgene-prod/5da06c8c-536a-4c43-99e5-ca06a703b86a.cxg/, 0, 0, 0, 0, TRUE, TRUE, TRUE, TRUE
## 3 0, 0, 0, 0, 94a4c4e0-e4b2-444f-ab6f-2240a72d959a, 94a4c4e0-e4b2-444f-ab6f-2240a72d959a, 94a4c4e0-e4b2-444f-ab6f-2240a72d959a, 94a4c4e0-e4b2-444f-ab6f-2240a72d959a, raw.h5ad, local.h5ad, local.rds, , RAW_H5AD, H5AD, RDS, CXG, 17a00e46-6918-44da-8d96-79bc68c8624e, b029a3dc-82ac-4c9b-b302-2b8f80b57ec2, 0cbfb895-8497-4538-afbe-afa956c01e58, b1c679f1-0688-41c1-bfe8-f79565861efc, s3://corpora-data-prod/94a4c4e0-e4b2-444f-ab6f-2240a72d959a/raw.h5ad, s3://corpora-data-prod/94a4c4e0-e4b2-444f-ab6f-2240a72d959a/local.h5ad, s3://corpora-data-prod/94a4c4e0-e4b2-444f-ab6f-2240a72d959a/local.rds, s3://hosted-cellxgene-prod/94a4c4e0-e4b2-444f-ab6f-2240a72d959a.cxg/, 0, 0, 0, 0, TRUE, TRUE, TRUE, TRUE
## 4 0, 0, 0, 0, 42bbd836-bd07-426f-96d7-051760533d05, 42bbd836-bd07-426f-96d7-051760533d05, 42bbd836-bd07-426f-96d7-051760533d05, 42bbd836-bd07-426f-96d7-051760533d05, raw.h5ad, local.h5ad, , local.rds, RAW_H5AD, H5AD, CXG, RDS, 644c098c-6bf3-4179-aab0-8f5756342e96, f491d392-9b9e-4ee3-b3c6-bbd4a2fd914a, 4afd36c4-6663-4be4-af63-cb284d9f40d7, a6d48f1e-4eb5-4215-97df-74ab3da676b9, s3://corpora-data-prod/42bbd836-bd07-426f-96d7-051760533d05/raw.h5ad, s3://corpora-data-prod/42bbd836-bd07-426f-96d7-051760533d05/local.h5ad, s3://hosted-cellxgene-prod/42bbd836-bd07-426f-96d7-051760533d05.cxg/, s3://corpora-data-prod/42bbd836-bd07-426f-96d7-051760533d05/local.rds, 0, 0, 0, 0, TRUE, TRUE, TRUE, TRUE
## 5 0, 0, 0, 0, c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a, c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a, c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a, c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a, raw.h5ad, local.h5ad, , local.rds, RAW_H5AD, H5AD, CXG, RDS, 3336d4aa-863b-45c5-9af7-50f7a13943eb, 0438b121-f3c4-4104-b00c-06e498d99b7d, 31aaf6a5-d93f-4cb9-a392-97e4b27f3394, bd9d0982-f25b-4f7a-b640-f088c8f38ff3, s3://corpora-data-prod/c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a/raw.h5ad, s3://corpora-data-prod/c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a/local.h5ad, s3://hosted-cellxgene-prod/c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a.cxg/, s3://corpora-data-prod/c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a/local.rds, 0, 0, 0, 0, TRUE, TRUE, TRUE, TRUE
## 6 0, 0, 0, 0, 01e64822-234b-4af2-9e03-652152e89c7c, 01e64822-234b-4af2-9e03-652152e89c7c, 01e64822-234b-4af2-9e03-652152e89c7c, 01e64822-234b-4af2-9e03-652152e89c7c, raw.h5ad, local.h5ad, , local.rds, RAW_H5AD, H5AD, CXG, RDS, 9c23fa04-eab2-4b51-8faa-9107432e010d, 3c2920b5-3490-4eb4-ba54-84489c293da3, 5d222a3c-a268-4083-86f5-38ea42604632, 00dbdd7a-a461-4160-af80-80d914334f86, s3://corpora-data-prod/01e64822-234b-4af2-9e03-652152e89c7c/raw.h5ad, s3://corpora-data-prod/01e64822-234b-4af2-9e03-652152e89c7c/local.h5ad, s3://hosted-cellxgene-prod/01e64822-234b-4af2-9e03-652152e89c7c.cxg/, s3://corpora-data-prod/01e64822-234b-4af2-9e03-652152e89c7c/local.rds, 0, 0, 0, 0, TRUE, TRUE, TRUE, TRUE
## dataset_deployments
## 1 https://cellxgene.cziscience.com/e/b6203114-e133-458a-aed5-eed1028378b4.cxg/
## 2 https://cellxgene.cziscience.com/e/f7a068f1-0fdb-48e8-8029-db870ff11d9e.cxg/
## 3 https://cellxgene.cziscience.com/e/9b686bb6-1427-4e13-b451-7ee961115cf9.cxg/
## 4 https://cellxgene.cziscience.com/e/6acb6637-ac08-4a65-b2d1-581e51dc7ccf.cxg/
## 5 https://cellxgene.cziscience.com/e/ed419b4e-db9b-40f1-8593-68fdf8dfb076.cxg/
## 6 https://cellxgene.cziscience.com/e/aad97cb5-f375-45ef-ae9d-178e7f5d5180.cxg/
## development_stage
## 1 50-year-old human stage, 60-year-old human stage, early adult stage, post-juvenile adult stage
## 2 50-year-old human stage, 60-year-old human stage, early adult stage, post-juvenile adult stage
## 3 50-year-old human stage, 60-year-old human stage, early adult stage, post-juvenile adult stage
## 4 50-year-old human stage, 60-year-old human stage, early adult stage, post-juvenile adult stage
## 5 65-year-old human stage, 73-year-old human stage, 78-year-old human stage, 83-year-old human stage, 84-year-old human stage
## 6 65-year-old human stage, 73-year-old human stage, 78-year-old human stage, 83-year-old human stage, 84-year-old human stage
## disease
## 1 normal
## 2 normal
## 3 normal
## 4 normal
## 5 normal
## 6 normal
## donor_id
## 1 H18.30.001, H18.30.002, bi005, bi006, F003, F004, F005, F007, F008, M002, M003, M007, M008, F006, M004, M006
## 2 H18.30.001, H18.30.002, bi005, bi006, F003, F004, F005, F007, F008, M002, M003, M007, M008, F006, M004, M006
## 3 H18.30.001, H18.30.002, bi005, bi006, F003, F004, F005, F007, F008, M002, M003, M007, M008, F006, M004, M006
## 4 H18.30.001, H18.30.002, Q19.26.002, Q19.26.003, Q19.26.008, bi005, bi006, F003, F004, F005, F007, F008, M002, M003, M007, M008, F006, M004, M006
## 5 19D014, 19D013, 19D015, 19D016, D001-12, 17D013
## 6 19D014, 19D013, 19D015, 19D016, D001-12, 17D013
## id is_primary_data is_valid
## 1 28d23a79-3db8-4ada-8256-33f2c38b704d SECONDARY TRUE
## 2 5da06c8c-536a-4c43-99e5-ca06a703b86a BOTH TRUE
## 3 94a4c4e0-e4b2-444f-ab6f-2240a72d959a BOTH TRUE
## 4 42bbd836-bd07-426f-96d7-051760533d05 BOTH TRUE
## 5 c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a SECONDARY TRUE
## 6 01e64822-234b-4af2-9e03-652152e89c7c PRIMARY TRUE
## mean_genes_per_cell
## 1 4394.937
## 2 3577.922
## 3 2000.006
## 4 4332.335
## 5 2028.297
## 6 4518.387
## name
## 1 Evolution of cellular diversity in primary motor cortex of human, marmoset monkey, and mouse 3-species integration excitory neurons
## 2 Evolution of cellular diversity in primary motor cortex of human, marmoset monkey, and mouse 3-species integration inhibitory neurons
## 3 Evolution of cellular diversity in primary motor cortex of human, marmoset monkey, and mouse 3-species integration non-nuerons
## 4 Evolution of cellular diversity in primary motor cortex of human, marmoset monkey, and mouse 4-species integration excitory neurons
## 5 Non-neuronal cells in human retina
## 6 Retinal ganglion cells in human retina
## organism published
## 1 Callithrix jacchus, Homo sapiens, Mus musculus TRUE
## 2 Callithrix jacchus, Homo sapiens, Mus musculus TRUE
## 3 Callithrix jacchus, Homo sapiens, Mus musculus TRUE
## 4 Callithrix jacchus, Homo sapiens, Macaca mulatta, Mus musculus TRUE
## 5 Homo sapiens TRUE
## 6 Homo sapiens TRUE
## published_at revision schema_version self_reported_ethnicity
## 1 1613586676 0 3.1.0 European, na
## 2 1613586727 0 3.1.0 European, na
## 3 1613586706 0 3.1.0 European, na
## 4 1613586654 0 3.1.0 European, na
## 5 1635545664 0 3.1.0 European
## 6 1635545664 0 3.1.0 European
## sex suspension_type
## 1 female, male nucleus
## 2 female, male nucleus
## 3 female, male nucleus
## 4 female, male, unknown nucleus
## 5 female, male nucleus
## 6 female, male nucleus
## tissue tombstone
## 1 primary motor cortex FALSE
## 2 primary motor cortex FALSE
## 3 primary motor cortex FALSE
## 4 primary motor cortex FALSE
## 5 fovea centralis, macula lutea proper, peripheral region of retina FALSE
## 6 fovea centralis, macula lutea proper, peripheral region of retina FALSE
## updated_at processing_status.created_at processing_status.cxg_status
## 1 1692674902 0 UPLOADED
## 2 1692674901 0 UPLOADED
## 3 1692674843 0 UPLOADED
## 4 1692674911 0 UPLOADED
## 5 1692663529 0 UPLOADED
## 6 1692663437 0 UPLOADED
## processing_status.dataset_id processing_status.h5ad_status
## 1 28d23a79-3db8-4ada-8256-33f2c38b704d UPLOADED
## 2 5da06c8c-536a-4c43-99e5-ca06a703b86a UPLOADED
## 3 94a4c4e0-e4b2-444f-ab6f-2240a72d959a UPLOADED
## 4 42bbd836-bd07-426f-96d7-051760533d05 UPLOADED
## 5 c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a UPLOADED
## 6 01e64822-234b-4af2-9e03-652152e89c7c UPLOADED
## processing_status.id processing_status.processing_status
## 1 NA SUCCESS
## 2 NA SUCCESS
## 3 NA SUCCESS
## 4 NA SUCCESS
## 5 NA SUCCESS
## 6 NA SUCCESS
## processing_status.rds_status processing_status.updated_at
## 1 UPLOADED 0
## 2 UPLOADED 0
## 3 UPLOADED 0
## 4 UPLOADED 0
## 5 UPLOADED 0
## 6 UPLOADED 0
## processing_status.upload_progress processing_status.upload_status
## 1 1 UPLOADED
## 2 1 UPLOADED
## 3 1 UPLOADED
## 4 1 UPLOADED
## 5 1 UPLOADED
## 6 1 UPLOADED
## processing_status.validation_status batch_condition
## 1 VALID
## 2 VALID
## 3 VALID
## 4 VALID
## 5 VALID
## 6 VALID
## x_approximate_distribution dataset_id
## 1 <NA> 28d23a79-3db8-4ada-8256-33f2c38b704d
## 2 <NA> 5da06c8c-536a-4c43-99e5-ca06a703b86a
## 3 <NA> 94a4c4e0-e4b2-444f-ab6f-2240a72d959a
## 4 <NA> 42bbd836-bd07-426f-96d7-051760533d05
## 5 <NA> c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a
## 6 <NA> 01e64822-234b-4af2-9e03-652152e89c7c
## rds_id
## 1 4b139da9-9d44-4d60-9d61-62b50cf6fb1f
## 2 e6a8d7d3-60d3-4c64-bdb1-4c678a69f635
## 3 0cbfb895-8497-4538-afbe-afa956c01e58
## 4 a6d48f1e-4eb5-4215-97df-74ab3da676b9
## 5 bd9d0982-f25b-4f7a-b640-f088c8f38ff3
## 6 00dbdd7a-a461-4160-af80-80d914334f86
## rds_s3_uri
## 1 s3://corpora-data-prod/28d23a79-3db8-4ada-8256-33f2c38b704d/local.rds
## 2 s3://corpora-data-prod/5da06c8c-536a-4c43-99e5-ca06a703b86a/local.rds
## 3 s3://corpora-data-prod/94a4c4e0-e4b2-444f-ab6f-2240a72d959a/local.rds
## 4 s3://corpora-data-prod/42bbd836-bd07-426f-96d7-051760533d05/local.rds
## 5 s3://corpora-data-prod/c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a/local.rds
## 6 s3://corpora-data-prod/01e64822-234b-4af2-9e03-652152e89c7c/local.rds
## rds_user_submitted h5ad_id
## 1 TRUE a04658c4-5650-462e-ba3b-3465ebad88cf
## 2 TRUE 2a37533d-0cb5-4f98-8050-9ef24ea89a59
## 3 TRUE b029a3dc-82ac-4c9b-b302-2b8f80b57ec2
## 4 TRUE f491d392-9b9e-4ee3-b3c6-bbd4a2fd914a
## 5 TRUE 0438b121-f3c4-4104-b00c-06e498d99b7d
## 6 TRUE 3c2920b5-3490-4eb4-ba54-84489c293da3
## h5ad_s3_uri
## 1 s3://corpora-data-prod/28d23a79-3db8-4ada-8256-33f2c38b704d/local.h5ad
## 2 s3://corpora-data-prod/5da06c8c-536a-4c43-99e5-ca06a703b86a/local.h5ad
## 3 s3://corpora-data-prod/94a4c4e0-e4b2-444f-ab6f-2240a72d959a/local.h5ad
## 4 s3://corpora-data-prod/42bbd836-bd07-426f-96d7-051760533d05/local.h5ad
## 5 s3://corpora-data-prod/c23e69c2-b4ae-44dd-8fac-78ca71ff4e7a/local.h5ad
## 6 s3://corpora-data-prod/01e64822-234b-4af2-9e03-652152e89c7c/local.h5ad
## h5ad_user_submitted
## 1 TRUE
## 2 TRUE
## 3 TRUE
## 4 TRUE
## 5 TRUE
## 6 TRUE
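Since the result is an ordinary dataframe, it can be narrowed further with base R before downloading. The sketch below works on a toy dataframe that mimics a few of the columns shown above (cell_count, cell_type, created_at); the dataset names are made up for the illustration. It also converts created_at to a date, assuming these values are Unix epoch seconds, which the output suggests but does not state:

```r
# Toy stand-in for a few columns of the ExtractCELLxGENEMeta() output.
toy.meta <- data.frame(
  name = c("excitatory neurons", "inhibitory neurons", "non-neurons"),
  cell_count = c(24213, 29486, 10739),
  cell_type = c("glutamatergic neuron", "GABAergic neuron", "astrocyte, microglial cell"),
  created_at = c(1692674902, 1692674901, 1692674843),
  stringsAsFactors = FALSE
)

# Keep datasets with at least 20,000 cells.
large.meta <- toy.meta[toy.meta$cell_count >= 20000, ]

# Match a substring inside the comma-separated cell_type field.
neuron.meta <- toy.meta[grepl("neuron", toy.meta$cell_type, fixed = TRUE), ]

# Assumption: created_at holds Unix epoch seconds (UTC).
large.meta$created_date <- as.Date(
  as.POSIXct(large.meta$created_at, origin = "1970-01-01", tz = "UTC")
)
print(large.meta[, c("name", "cell_count", "created_date")])
```

The same subsetting expressions apply unchanged to human.10x.cellxgene.meta, and the filtered dataframe can then be passed on as the metadata to download.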
Download object
After manually checking the extracted metadata, users can download the specified objects with ParseCELLxGENE. The objects to download are selected with file.ext (choose from "rds" and "h5ad").
The returned result is a dataframe containing the datasets that failed to download. If it is not NULL, users can re-run ParseCELLxGENE with meta set to the returned result.
ParseCELLxGENE(
  meta = human.10x.cellxgene.meta[1:5, ],
  file.ext = "rds",
  out.folder = "/Users/soyabean/Desktop/tmp/scdown/download_cellxgene"
)
The structure of downloaded objects:
tree /Users/soyabean/Desktop/tmp/scdown/download_cellxgene
## /Users/soyabean/Desktop/tmp/scdown/download_cellxgene
## ├── DCM.ACM.heart.cell.atlas..Adipocytes.rds
## ├── DCM.ACM.heart.cell.atlas..Cardiomyocytes.rds
## ├── DCM.ACM.heart.cell.atlas..Endothelial.cells.rds
## ├── DCM.ACM.heart.cell.atlas..Fibroblasts.rds
## └── DCM.ACM.heart.cell.atlas..Lymphoids.rds
##
## 1 directory, 5 files
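The re-run pattern described for the download functions (call again with the returned failures until the result is NULL) can be wrapped in a small loop. This is a minimal sketch under stated assumptions: retry_failed and mock_download are hypothetical names introduced here, and the only contract assumed is that the download function takes a dataframe and returns either NULL or a dataframe of the rows that failed. A mock downloader keeps the example runnable without network access:

```r
# retry_failed() is a hypothetical helper, not part of scfetch.
# `download_fun` must take a data frame and return NULL on success
# or a data frame containing the rows that failed.
retry_failed <- function(meta, download_fun, max.tries = 3) {
  failed <- meta
  for (i in seq_len(max.tries)) {
    failed <- download_fun(failed)
    if (is.null(failed) || nrow(failed) == 0) {
      return(NULL)  # everything was downloaded
    }
    message("Attempt ", i, ": ", nrow(failed), " dataset(s) failed")
  }
  failed  # rows that still failed after max.tries attempts
}

# Mock downloader for demonstration: each call succeeds for the first row only.
mock_download <- function(meta) {
  remaining <- meta[-1, , drop = FALSE]
  if (nrow(remaining) == 0) NULL else remaining
}

toy.meta <- data.frame(id = c("a", "b", "c"), stringsAsFactors = FALSE)
all.done <- retry_failed(toy.meta, mock_download, max.tries = 3)
still.failed <- retry_failed(toy.meta, mock_download, max.tries = 2)
```

With scfetch, download_fun could be a wrapper such as function(df) ParseCELLxGENE(meta = df, file.ext = "rds", out.folder = "downloads"), where the folder name is a placeholder. Capping the number of attempts avoids looping forever on datasets that are permanently unavailable.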
Human Cell Atlas
The Human Cell Atlas aims to map every cell type in the human body. It contains 397 projects, most of which are from Homo sapiens (it also includes projects from Mus musculus, Macaca mulatta and Canis lupus familiaris).
Show available datasets
scfetch provides ShowHCAProjects to extract detailed project metadata, including project title, description, organism, sex, organ/organPart, disease, assay, preservation method, sample type, suspension type, cell type, development stage, etc.
There are 397 unique projects under five different catalogs (dcp29, dcp30, dcp1, lm2, lm3):
all.hca.projects <- ShowHCAProjects()
head(all.hca.projects)
## projectTitle
## 1 1.3 Million Brain Cells from E18 Mice
## 2 A Cellular Anatomy of the Normal Adult Human Prostate and Prostatic Urethra
## 3 A Cellular Atlas of Pitx2-Dependent Cardiac Development.
## 4 A Human Liver Cell Atlas reveals Heterogeneity and Epithelial Progenitors
## 5 A Partial Picture of the Single-Cell Transcriptomics of Human IgA Nephropathy
## 6 A Protocol for Revealing Oral Neutrophil Heterogeneity by Single-Cell Immune Profiling in Human Saliva
## projectId projectShortname
## 1 74b6d569-3b11-42ef-b6b1-a0454522b4a0 1M Neurons
## 2 53c53cd4-8127-4e12-bc7f-8fe1610a715c ProstateCellAtlas
## 3 7027adc6-c9c9-46f3-84ee-9badc3a4f53b Pitx2DevelopingHeart
## 4 94e4ee09-9b4b-410a-84dc-a751ad36d0df LiverCellAtlasHeterogeneity
## 5 c5b475f2-76b3-4a8e-8465-f3b69828fec3 Tang-Human-IgAN-GEXSCOPE
## 6 60ea42e1-af49-42f5-8164-d641fdb696bc PRJNA640427_human_neutrophils
## projectDescription
## 1 Cortex, hippocampus, and subventricular zone were purchased from BrainBits (C57EHCV). They were from 2 E18 C57BL/6 mice dissected on the same day, shipped overnight on ice, and stored at 4C until being prepared for scRNA-Seq. Brain tissues were dissociated following the Demonstrated Protocol for Mouse Embryonic Neural Tissue (https://support.10xgenomics.com/single-cell/sample-prep/doc/demonstrated-protocol-dissociation-of-mouse-embryonic-neural-tissue-for-single-cell-rna-sequencing). 69 scRNA-Seq libraries were made from first mouse brain 2 days after the dissection. Another 64 scRNA-Seq libraries were made from second mouse brain 6 days after the dissection.
## 2 A comprehensive cellular anatomy of normal human prostate is essential for solving the cellular origins of benign prostatic hyperplasia and prostate cancer. The tools used to analyze the contribution of individual cell types are not robust. We provide a cellular atlas of the young adult human prostate and prostatic urethra using an iterative process of single-cell RNA sequencing (scRNA-seq) and flow cytometry on ∼98,000 cells taken from different anatomical regions. Immunohistochemistry with newly derived cell type-specific markers revealed the distribution of each epithelial and stromal cell type on whole mounts, revising our understanding of zonal anatomy. Based on discovered cell surface markers, flow cytometry antibody panels were designed to improve the purification of each cell type, with each gate confirmed by scRNA-seq. The molecular classification, anatomical distribution, and purification tools for each cell type in the human prostate create a powerful resource for experimental design in human prostate disease.
## 3 Single-cell RNA sequencing was applied to study the role of Pitx2 expression in heart development in mice. Over 75,000 single cardiac cell transcriptomes between two key developmental timepoints in control and Pitx2-null embryos were amplified and sequenced.
## 4 We perfomed single-cell RNA-sequnecing of around 10,000 cells from normal human liver tissue to construct a human liver cell atlas. We reveal previously unknown subtypes in different cell type compartments. Overall design: Single cells were isolated from human liver resection specimens and then sorted by FACS into 384 well plates in a unbiased way and on the basis of cell surface markers for distinct cell types. ScRNA-seq was done using the mCelSeq2 protocol.
## 5 Aa comprehensive scRNA-seq analysis of human renal biopsies from IgAN. We showed for the first time that IgAN mesangial cells displayed increased expression of several novel genes including MALAT1, GADD45B, SOX4, and EDIL3, which were related to cell proliferation and matrix accumulation. The overexpressed genes in tubule cells of IgAN were mainly enriched in inflammatory pathways including TNF signaling, IL-17 signaling, and NOD-like receptor signaling. Furthermore, we compared the results of 4 IgAN patients with the published scRNA-Seq data of healthy kidney tissues of three human donors in order to further validate the findings in our study. The results also verified that the overexpressed genes in tubule cells from IgAN patients were mainly enriched in inflammatory pathways including TNF signaling, IL-17 signaling, and NOD-like receptor signaling. The receptor-ligand crosstalk analysis revealed potential interactions between mesangial cells and other cells in IgAN. IgAN patients with overt proteinuria displayed elevated genes participating in several signaling pathways compared with microproteinuria group. It needs to be mentioned that based on number of mesangial cells and other kidney cells analyzed in this study, the results of our study are preliminary and needs to be confirmed on larger number of cells from larger number of patients and controls in future studies. Therefore, these results offer new insight into pathogenesis and identify new therapeutic targets for IgAN.
## 6 Neutrophils are the most abundant white blood cells in the human body responsible for fighting viral, bacterial and fungi infections. Out of the 100 billion neutrophils produced daily, it is estimated that 10 % of these cells end up in oral biofluids. Because saliva is a fluid accessible through non-invasive techniques, it is an optimal source of cells and molecule surveillance in health and disease. While neutrophils are abundant in saliva, scientific advancements in neutrophil biology have been hampered likely due to their short life span, inability to divide once terminally differentiated, sensitivity to physical stress, and low RNA content. Here, we devise a protocol aiming to understand neutrophil heterogeneity by improving isolation methods, single-cell RNA extraction, sequencing and bioinformatic pipelines. Advanced flow cytometry 3D analysis, and machine learning validated our gating system model, by including positive neutrophil markers and excluding other immune cells and uncovered neutrophil heterogeneity. Considering specific cell markers, unique mitochondrial content, stringent and less stringent filtering strategies, our transcriptome single cell findings unraveled novel neutrophil subpopulations. Collectively, this methodology accelerates the discovery of salivary immune landscapes, with the promise of improving the understanding of diversification mechanisms, clinical diagnostics in health and disease, and guide targeted therapies.
## publications
## 1 Massively parallel digital transcriptional profiling of single cells
## 2 A Cellular Anatomy of the Normal Adult Human Prostate and Prostatic Urethra
## 3 A cellular atlas of Pit2x-dependent cardiac development.
## 4 A human liver cell atlas reveals heterogeneity and epithelial progenitors.
## 5 A Partial Picture of the Single-Cell Transcriptomics of Human IgA Nephropathy
## 6 A Protocol for Revealing Oral Neutrophil Heterogeneity by Single-Cell Immune Profiling in Human Saliva
## laboratory
## 1 Human Cell Atlas Data Coordination Platform
## 2 Human Cell Atlas Data Coordination Platform|Strand Lab
## 3 Department of Molecular Physiology and Biophysics|Human Cell Atlas|Program in Developmental Biology
## 4 NA
## 5 Centre for Inflammatory Diseases|Department of Hematology|Department of Medical Records & Information|Department of Nephrology|Department of Nephrology;Centre for Inflammatory Diseases|Department of Organ Transplantation|Department of Pathology|Department of Ultrasound|Human Cell Atlas Data Coordination Platform
## 6 Human Cell Atlas
## accessions accessible estimatedCellCount
## 1 GSE93421, SRP096558, PRJNA360949 TRUE 1330000
## 2 GSE117403 TRUE 108700
## 3 SRP198380, GSE131181, PRJNA542873 TRUE 75000
## 4 SRP174502, GSE124395, PRJNA511895 TRUE 10000
## 5 GSE171314, SRP313266, PRJNA719108 TRUE 20570
## 6 SRP271375, PRJNA640427 TRUE 1145
## sampleEntityType organ
## 1 specimens brain
## 2 specimens prostate gland
## 3 specimens heart
## 4 organoids, specimens liver, liver
## 5 specimens kidney
## 6 specimens oral cavity
## organPart
## 1 cortex
## 2 peripheral zone of prostate|transition zone of prostate
## 3 NA
## 4 , NA
## 5 NA
## 6 saliva
## sampleID
## 1 E18_20160930_Brain|E18_20161004_Brain
## 2 D17PrPz|D17PrTz|D27PrPz|D27PrTz|D35PrPz|D35PrTz
## 3 SAMN11640779_het|SAMN11640779_wt|SAMN11640780|SAMN11640781_het|SAMN11640781_wt|SAMN11640782_het|SAMN11640782_wt|SAMN11640783|SAMN11640784|SAMN11640785_het|SAMN11640785_wt|SAMN11640786_het|SAMN11640786_wt|SAMN11640787|SAMN11640788|SAMN11640789|SAMN11640790
## 4 SAMN10645790|SAMN10645791|SAMN10645843|SAMN10645847|SAMN10645915|SAMN10645916|SAMN10645917|SAMN10645918|SAMN10645919|SAMN10645920|SAMN10645921|SAMN10645922|SAMN10645923|SAMN10645924, SAMN10645729|SAMN10645734|SAMN10645735|SAMN10645736|SAMN10645737|SAMN10645738|SAMN10645739|SAMN10645741|SAMN10645744|SAMN10645745|SAMN10645746|SAMN10645747|SAMN10645748|SAMN10645749|SAMN10645750|SAMN10645751|SAMN10645756|SAMN10645758|SAMN10645761|SAMN10645773|SAMN10645782|SAMN10645810|SAMN10645830|SAMN10645831|SAMN10645832|SAMN10645833|SAMN10645834|SAMN10645835|SAMN10645836|SAMN10645837|SAMN10645838|SAMN10645839|SAMN10645840|SAMN10645848|SAMN10645849|SAMN10645850|SAMN10645851|SAMN10645852|SAMN10645853|SAMN10645854|SAMN10645855|SAMN10645856|SAMN10645857|SAMN10645858|SAMN10645860|SAMN10645861|SAMN10645862|SAMN10645863|SAMN10645864|SAMN10645873|SAMN10645880|SAMN10645881|SAMN10645882|SAMN10645883|SAMN10645884|SAMN10645885|SAMN10645886|SAMN10645888|SAMN10645892|SAMN10645911|SAMN10645935|SAMN10645945|SAMN10645954|SAMN10645955|SAMN10645956|SAMN10645962|SAMN10645963|SAMN10645964|SAMN10645965|SAMN10645966|SAMN10645967|SAMN10645968|SAMN10645969|SAMN10645970|SAMN10645971|SAMN10645972|SAMN10645973|SAMN10645974|SAMN10645977|SAMN10645978|SAMN10645979|SAMN10645980|SAMN10645989|SAMN10646005|SAMN10646007|SAMN10646011|SAMN10646013|SAMN10646019|SAMN10646032|SAMN10646036|SAMN10646037|SAMN10646038|SAMN10646039|SAMN10646040|SAMN10646041|SAMN10646042|SAMN10646043|SAMN10646044|SAMN10646045|SAMN10646046
## 5 kidney_1_igan|kidney_2_igan|kidney_3_igan|kidney_4_igan|kidney_5_healthy
## 6 saliva_sample1|saliva_sample2|saliva_sample3
## disease preservationMethod donorCount
## 1 normal fresh 2
## 2 normal fresh 3
## 3 normal NA 17
## 4 , hepatocellular carcinoma|normal , NA 23
## 5 IgA glomerulonephritis|normal NA 5
## 6 normal NA 3
## developmentStage genusSpecies biologicalSex
## 1 mouse embryo stage Mus musculus unknown
## 2 human adult stage Homo sapiens male
## 3 Theiler stage 17|Theiler stage 21 Mus musculus unknown
## 4 human adult stage Homo sapiens unknown
## 5 human adult stage Homo sapiens female|male
## 6 human adult stage Homo sapiens male
## selectedCellType
## 1 neuron
## 2 basal cell of prostate epithelium|epithelial cell of prostate|fibroblast of connective tissue of prostate|luminal cell of prostate epithelium|prostate epithelial cell|prostate stromal cell|smooth muscle cell of prostate
## 3 NA
## 4 NA
## 5 NA
## 6 neutrophil
## catalog entryId
## 1 dcp29 74b6d569-3b11-42ef-b6b1-a0454522b4a0
## 2 dcp29 53c53cd4-8127-4e12-bc7f-8fe1610a715c
## 3 dcp29 7027adc6-c9c9-46f3-84ee-9badc3a4f53b
## 4 dcp29 94e4ee09-9b4b-410a-84dc-a751ad36d0df
## 5 dcp29 c5b475f2-76b3-4a8e-8465-f3b69828fec3
## 6 dcp29 60ea42e1-af49-42f5-8164-d641fdb696bc
## sourceId
## 1 53e00b4b-4351-4131-b873-7931f3d4037f
## 2 ccc0d8c3-ac34-44bb-9bb0-55e9ab67d858
## 3 0481ef14-8c6f-4e16-bc71-211423bc8d90
## 4 c95bf55d-7f4e-47b0-af97-198823f21aaa
## 5 be6cf599-d084-4c5e-a89f-95308dbac4cd
## 6 6abbe02d-b89d-49b2-b639-b45f21d5baa7
## sourceSpec
## 1 tdr:datarepo-aa6a9210:snapshot/hca_prod_74b6d5693b1142efb6b1a0454522b4a0__20220117_dcp2_20220307_dcp14:/1
## 2 tdr:datarepo-9e63ca34:snapshot/hca_prod_53c53cd481274e12bc7f8fe1610a715c__20220117_dcp2_20220307_dcp14:/0
## 3 tdr:datarepo-489f5a00:snapshot/hca_prod_7027adc6c9c946f384ee9badc3a4f53b__20220117_dcp2_20220120_dcp12:/0
## 4 tdr:datarepo-071fb08c:snapshot/hca_prod_94e4ee099b4b410a84dca751ad36d0df__20220519_dcp2_20220804_dcp19:/0
## 5 tdr:datarepo-b3b1e92f:snapshot/hca_prod_c5b475f276b34a8e8465f3b69828fec3__20230331_dcp2_20230331_dcp26:/0
## 6 tdr:datarepo-41cca7ce:snapshot/hca_prod_60ea42e1af4942f58164d641fdb696bc__20220117_dcp2_20230314_dcp25:/1
## workflow
## 1
## 2 optimus_post_processing_v1.0.0|optimus_v4.2.3, ,
## 3 optimus_post_processing_v1.0.0|optimus_v4.2.2, ,
## 4 analysis_protocol_normalization|analysis_protocol_quantification, ,
## 5 analysis_protocol, ,
## 6
## libraryConstructionApproach nucleicAcidSource instrumentManufacturerModel
## 1 10x 3' v2, single cell, , Illumina HiSeq 4000
## 2 , 10X v2 sequencing, , single cell, , , Illumina NextSeq 500
## 3 , 10X 3' v2 sequencing, , single cell, , , Illumina NextSeq 500
## 4 , CEL-seq2, , single cell, , , Illumina HiSeq 2500
## 5 , GEXSCOPE technology, , single cell, , , Illumina HiSeq X
## 6 Smart-seq2, single cell, , Illumina NovaSeq 6000
## pairedEnd cellLineID cellLineType cellLinemodelOrgan
## 1 , FALSE
## 2 , , FALSE
## 3 , , FALSE
## 4 , , FALSE
## 5 , , TRUE
## 6 , TRUE
## organoidsID
## 1
## 2
## 3
## 4 SAMN10645790|SAMN10645791|SAMN10645843|SAMN10645847|SAMN10645915|SAMN10645916|SAMN10645917|SAMN10645918|SAMN10645919|SAMN10645920|SAMN10645921|SAMN10645922|SAMN10645923|SAMN10645924
## 5
## 6
## organoidsmodelOrgan organoidsmodelOrganPart lastModifiedDate
## 1 2021-10-20T09:03:20.617000Z
## 2 2021-09-30T17:38:55.685000Z
## 3 2021-09-30T17:33:55.642000Z
## 4 liver NA 2022-04-21T15:03:35.422000Z
## 5 2023-03-20T11:06:31.139000Z
## 6 2021-09-30T17:38:34.675000Z
Summary attributes
scfetch provides StatDBAttribute to summarize the attributes of Human Cell Atlas projects:
StatDBAttribute(df = all.hca.projects, filter = c("organism", "sex"), database = "HCA")
## $organism
## Value Num Key
## 1 homo sapiens 371 organism
## 2 mus musculus 48 organism
## 3 canis lupus familiaris 1 organism
## 4 macaca mulatta 1 organism
##
## $sex
## Value Num Key
## 1 female 276 sex
## 2 male 268 sex
## 3 unknown 132 sex
## 4 mixed 4 sex
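Many metadata cells hold several values joined by `|` or `, `, which is why the counts above can add up to more than the number of projects. The sketch below shows one way such a tally could be produced in base R. It is not the implementation of StatDBAttribute: `count_attribute` and `toy.projects` are hypothetical names, and the toy data frame only imitates the column layout shown earlier.

```r
# Toy stand-in for a projects data frame (assumed layout, not real HCA data).
toy.projects <- data.frame(
  genusSpecies = c("Homo sapiens", "Mus musculus", "Homo sapiens"),
  biologicalSex = c("female|male", "unknown", "male"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split multi-valued cells on "|" or ", ",
# normalise case and whitespace, then count each distinct value.
count_attribute <- function(df, column) {
  values <- unlist(strsplit(df[[column]], split = "\\||, "))
  values <- tolower(trimws(values))
  values <- values[!is.na(values) & nzchar(values)]
  counts <- sort(table(values), decreasing = TRUE)
  data.frame(
    Value = names(counts),
    Num = as.integer(counts),
    Key = rep(column, length(counts)),
    stringsAsFactors = FALSE
  )
}

sex.summary <- count_attribute(toy.projects, "biologicalSex")
print(sex.summary)
```

Counting after splitting means a project annotated as `female|male` contributes to both values, which matches how the totals above behave.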
Extract metadata
scfetch provides ExtractHCAMeta to filter project metadata. The available values of each attribute, except cell number, can be obtained with StatDBAttribute:
# human 10x v2 and v3 datasets
hca.human.10x.projects <- ExtractHCAMeta(
  all.projects.df = all.hca.projects, organism = "Homo sapiens",
  protocol = c("10x 3' v2", "10x 3' v3")
)
## Use all biologicalSex as input!
## Use all organ as input!
## Use all organPart as input!
## Use all disease as input!
## Use all sampleEntityType as input!
## Use all preservationMethod as input!
## Use all nucleicAcidSource as input!
## Use all selectedCellType as input!
## Use all instrumentManufacturerModel as input!
head(hca.human.10x.projects)
## projectTitle
## 1 A Single-Cell Atlas of the Human Healthy Airways.
## 2 A Single-Cell Transcriptomic Atlas of Human Skin Aging.
## 3 A human breast atlas integrating single-cell proteomics and transcriptomics
## 4 A human fetal lung cell atlas uncovers proximal-distal gradients of differentiation and key regulators of epithelial fates
## 5 A molecular single-cell lung atlas of lethal COVID-19
## 6 A multi-omic single-cell landscape of human gynecologic malignancies
## projectId projectShortname
## 1 ef1e3497-515e-4bbe-8d4c-10161854b699 healthyAirwaysAtlas
## 2 923d3231-7295-4184-b3f6-c3082766a8c7 AgingSkinAtlas
## 3 9b876d31-0739-4e96-9846-f76e6a427279 breastTranscriptomeAtlas
## 4 2fe3c60b-ac1a-4c61-9b59-f6556c0fce63 FetalLungImmune
## 5 d7845650-f6b1-4b1c-b2fe-c0795416ba7b LethalCovidLungAtlas
## 6 9746f4e0-d3b2-4543-89b3-10288162851b regnerGynecologicMalignancies
## projectDescription
## 1 Rationale: The respiratory tract constitutes an elaborated line of defense that is based on a unique cellular ecosystem. Single-cell profiling methods enable the investigation of cell population distributions and transcriptional changes along the airways.Methods: We have explored the cellular heterogeneity of the human airway epithelium in 10 healthy living volunteers by single-cell RNA profiling. 77,969 cells were collected at 35 distinct locations, from the nose to the 12th division of the airway tree.Results: The resulting atlas is composed of a high percentage of epithelial cells (89.1%), but also immune (6.2%) and stromal (4.7%) cells with distinct cellular proportions in different regions of the airways. It reveals differential gene expression between identical cell types (suprabasal, secretory, and multiciliated cells) from the nose (MUC4, PI3, SIX3) and tracheobronchial (SCGB1A1, TFF3) airways. By contrast, cell-type specific gene expression is stable across all tracheobronchial samples. Our atlas improves the description of ionocytes, pulmonary neuro-endocrine (PNEC) and brush cells, and identifies a related population of NREP-positive cells. We also report the association of KRT13 with dividing cells that are reminiscent of previously described mouse “hillock” cells, and with squamous cells expressing SCEL, SPRR1A/B.Conclusions: Robust characterization of a single-cell cohort in healthy airways establishes a valuable resource for future investigations. The precise description of the continuum existing\nfrom the nasal epithelium to successive divisions of the airways and the stable gene expression profile of these regions better defines conditions under which relevant tracheobronchial proxies of human respiratory diseases can be developed.
## 2 Skin undergoes constant self-renewal, and its functional decline is a visible consequence of aging. Understanding human skin aging requires in-depth knowledge of the molecular and functional properties of various skin cell types. We performed single-cell RNA sequencing of human eyelid skin from healthy individuals across different ages and identified eleven canonical cell types, as well as six subpopulations of basal cells. Further analysis revealed progressive accumulation of photoaging-related changes and increased chronic inflammation with age. Transcriptional factors involved in the developmental process underwent early-onset decline during aging. Furthermore, inhibition of key transcription factors HES1 in fibroblasts and KLF6 in keratinocytes not only compromised cell proliferation, but also increased inflammation and cellular senescence during aging. Lastly, we found that genetic activation of HES1 or pharmacological treatment with quercetin alleviated cellular senescence of dermal fibroblasts. These findings provide a single-cell molecular framework of human skin aging, providing a rich resource for developing therapeutic strategies against aging-related skin disorders.
## 3 The breast is a dynamic organ whose response to physiological and pathophysiological conditions alters its disease susceptibility, yet the specific effects of these clinical variables on cell state remain poorly annotated. We present a unified, high-resolution breast atlas by integrating single-cell RNA-seq, mass cytometry, and cyclic immunofluorescence, encompassing a myriad of states. We define cell subtypes within the alveolar, hormone-sensing, and basal epithelial lineages, delineating associations of several subtypes with cancer risk factors, including age, parity, and BRCA2 germline mutation. Of particular interest is a subset of alveolar cells termed basal-luminal (BL) cells, which exhibit poor transcriptional lineage fidelity, accumulate with age, and carry a gene signature associated with basal-like breast cancer. We further utilize a medium-depletion approach to identify molecular factors regulating cell-subtype proportion in organoids. Together, these data are a rich resource to elucidate diverse mammary cell states. Overall design: A total of 16 breast samples were assayed (4 samples from reductive mammoplasties and 12 from prophylactic mastectomies).
## 4 We present a multiomic cell atlas of human lung development that combines single cell RNA and ATAC sequencing, high throughput spatial transcriptomics and single cell imaging. Coupling single cell methods with spatial analysis has allowed a comprehensive cellular survey of the epithelial, mesenchymal, endothelial and erythrocyte/leukocyte compartments from 5-22 post conception weeks. We identify new cell states in all compartments. These include developmental-specific secretory progenitors that resemble cells in adult fibrotic lungs and a new subtype of neuroendocrine cell related to human small cell lung cancer; observations which strengthen the connections between development and disease/regeneration. Our datasets are available for the community to download and interact with through our web interface ( https://fetal-lung.cellgeni.sanger.ac.uk ). Finally, to illustrate its general utility, we use our cell atlas to generate predictions about cell-cell signalling and transcription factor hierarchies which we test using organoid models. Highlights Spatiotemporal atlas of human lung development from 5-22 post conception weeks identifies 147 cell types/states. Tracking the developmental origins of multiple cell compartments, including new progenitor states. Functional diversity of fibroblasts in distinct anatomical signalling niches. Resource applied to interrogate and experimentally test the transcription factor code controlling neuroendocrine cell heterogeneity and the origins of small cell lung cancer.
## 5 Respiratory failure is the leading cause of death in patients with severe SARS-CoV-2 infection, but the host response at the lung tissue level is poorly understood. Here we performed single-nucleus RNA sequencing of about 116,000 nuclei from the lungs of nineteen individuals who died of COVID-19 and underwent rapid autopsy and seven control individuals. Integrated analyses identified substantial alterations in cellular composition, transcriptional cell states, and cell-to-cell interactions, thereby providing insight into the biology of lethal COVID-19. The lungs from individuals with COVID-19 were highly inflamed, with dense infiltration of aberrantly activated monocyte-derived macrophages and alveolar macrophages, but had impaired T cell responses. Monocyte/macrophage-derived interleukin-1β and epithelial cell-derived interleukin-6 were unique features of SARS-CoV-2 infection compared to other viral and bacterial causes of pneumonia. Alveolar type 2 cells adopted an inflammation-associated transient progenitor cell state and failed to undergo full transition into alveolar type 1 cells, resulting in impaired lung regeneration. Furthermore, we identified expansion of recently described CTHRC1+ pathological fibroblasts3 contributing to rapidly ensuing pulmonary fibrosis in COVID-19. Inference of protein activity and ligand–receptor interactions identified putative drug targets to disrupt deleterious circuits. This atlas enables the dissection of lethal COVID-19, may inform our understanding of long-term complications of COVID-19 survivors, and provides an important resource for therapeutic development.
## 6 Deconvolution of regulatory mechanisms that drive transcriptional programs in cancer cells is key to understanding tumor biology. Herein, we present matched transcriptome (scRNA-seq) and chromatin accessibility (scATAC-seq) profiles at single-cell resolution from human ovarian and endometrial tumors processed immediately following surgical resection. This dataset reveals the complex cellular heterogeneity of these tumors and enabled us to quantitatively link variation in chromatin accessibility to gene expression. We show that malignant cells acquire previously unannotated regulatory elements to drive hallmark cancer pathways. Moreover, malignant cells from within the same patients show substantial variation in chromatin accessibility linked to transcriptional output, highlighting the importance of intratumoral heterogeneity. Finally, we infer the malignant cell type-specific activity of transcription factors. By defining the regulatory logic of cancer cells, this work reveals an important reliance on oncogenic regulatory elements and highlights the ability of matched scRNA-seq/scATAC-seq to uncover clinically relevant mechanisms of tumorigenesis in gynecologic cancers.
## publications
## 1 A Single-Cell Atlas of the Human Healthy Airways.
## 2 A Single-Cell Transcriptomic Atlas of Human Skin Aging.
## 3 A human breast atlas integrating single-cell proteomics and transcriptomics
## 4 A human fetal lung cell atlas uncovers proximal-distal gradients of differentiation and key regulators of epithelial fates
## 5 A molecular single-cell lung atlas of lethal COVID-19
## 6 A multi-omic single-cell landscape of human gynecologic malignancies.
## laboratory
## 1 Memorial Sloan Kettering Cancer Center, New York, New York|Université Côte d'Azur, CNRS, Institut Pharmacologie Moléculaire et Cellulaire, Sophia-Antipolis, France.|Université Côte d'Azur, Centre Hospitalier Universitaire de Nice, Fédération Hospitalo-Universitaire OncoAge, CNRS, Inserm, Institute for Research on Cancer and Aging Nice Team 3, Pulmonology Department, Nice, France.
## 2 NA
## 3 NA
## 4 NA
## 5 NA
## 6 NA
## accessions accessible
## 1 EGAS00001004082 TRUE
## 2 TRUE
## 3 SRP329970, GSE180878, PRJNA749859 TRUE
## 4 E-MTAB-11265, E-MTAB-11278, E-MTAB-11267, E-MTAB-11266 TRUE
## 5 GSE171524, PRJNA719842 TRUE
## 6 SRP309991, GSE173682, PRJNA699347 TRUE
## estimatedCellCount sampleEntityType organ
## 1 77969 specimens bronchus|nose|trachea
## 2 35678 specimens skin of body
## 3 52682 specimens breast
## 4 NA organoids, specimens lung, heart|lung
## 5 116000 specimens pair of lungs
## 6 150000 specimens intestine|ovary|uterus
## organPart
## 1 epithelium of bronchus|epithelium of trachea|inferior nasal concha|terminal bronchus epithelium
## 2 NA
## 3 NA
## 4 , heart atrium|lung|lung epithelium
## 5 NA
## 6 endometrium|NA
## sampleID
## 1 D322_Biop_Int1|D322_Biop_Nas1|D322_Biop_Pro1|D326_Biop_Int1|D326_Biop_Pro1|D326_Brus_Dis1|D337_Brus_Dis1|D339_Biop_Int1|D339_Biop_Nas1|D339_Biop_Pro1|D339_Brus_Dis1|D344_Biop_Int1|D344_Biop_Nas1|D344_Biop_Pro1|D344_Brus_Dis1|D353_Biop_Int2|D353_Biop_Pro1|D353_Brus_Dis1|D353_Brus_Nas1|D354_Biop_Int2|D354_Biop_Pro1|D354_Brus_Dis1|D363_Biop_Int2|D363_Biop_Pro1|D363_Brus_Dis1|D363_Brus_Nas1|D367_Biop_Int1|D367_Biop_Pro1|D367_Brus_Dis1|D367_Brus_Nas1|D372_Biop_Int1|D372_Biop_Int2|D372_Biop_Pro1|D372_Brus_Dis1|D372_Brus_Nas1
## 2 HRS118996|HRS118997|HRS118998|HRS118999|HRS119000|HRS119001|HRS119002|HRS119003|HRS119004
## 3 SAMN20422990|SAMN20422991|SAMN20422992|SAMN20422993|SAMN20422994|SAMN20422995|SAMN20422996|SAMN20422997|SAMN20422998|SAMN20422999|SAMN20423000|SAMN20423001|SAMN20423002|SAMN20423003|SAMN20423004|SAMN20423005
## 4 HDBR-N13393_ASCL1_OE|HDBR-N13393_Ctrl_OE|HDBR-N13393_NEUROD1_OE|HDBR-N13393_NEUROG3_OE|HDBR-N13393_PAX9_OE|HDBR-N13393_RFX6_OE|HDBR-N13393_TFAP2A_OE|HDBR-N13393_deltaNp63alpha_OE|HDBR1915_ASCL1_OE|HDBR1915_Ctrl_OE|HDBR1915_NEUROD1_OE|HDBR1915_NEUROG3_OE|HDBR1915_PAX9_OE|HDBR1915_RFX6_OE|HDBR1915_TFAP2A_OE|HDBR1915_deltaNp63alpha_OE|HDBR2174_ASCL1_OE|HDBR2174_Ctrl_OE|HDBR2174_NEUROD1_OE|HDBR2174_NEUROG3_OE|HDBR2174_PAX9_OE|HDBR2174_RFX6_OE|HDBR2174_TFAP2A_OE|HDBR2174_deltaNp63alpha_OE, 15413-LNG--FO-3_specimen|15415-LNG--FO-2_specimen|15417-LNG-0-FO-4_specimen|15424-LNG-0-FO-3_specimen|15428-LNG-0-FO-2_specimen|15737-FLNG-1(distal)_specimen|15739-FLNG-1(distal)_specimen|15739-FLNG-3(proximal)_specimen|15773-FLNG-3(proximal)_specimen|5478STDY7698210specimen|5698STDY7839908specimen|5698STDY7839910specimen|5698STDY7839918specimen|5891STDY8062349specimen|5891STDY8062350specimen|5891STDY8062351specimen|5891STDY8062352specimen|5891STDY8062353specimen|5891STDY8062354specimen|5891STDY8062355specimen|5891STDY8062356specimen|5891STDY9030806_8specimen|5891STDY9030807_9specimen|Hst4-LNG--FO-1_specimen|Hst5-LNG--FO-1_specimen|Hst7-LNG--FO-2_specimen|SIGAA10specimen|SIGAB10specimen|SIGAC10specimen|SIGAE6specimen|SIGAF6specimen|SIGAG12specimen|SIGAG6specimen|SIGAH10specimen|SIGAH12specimen|SIGAH4specimen|WSSS8011222specimen|WSSS8012016specimen|WSSS_F_LNG8713176specimen|WSSS_F_LNG8713177specimen|WSSS_F_LNG8713178specimen_15168proximal|WSSS_F_LNG8713179specimen_15168distal|WSSS_F_LNG8713184specimen_15233proximal|WSSS_F_LNG8713185specimen_15233distal|WSSS_F_LNG8713186specimen_proximal|WSSS_F_LNG8713187specimen_distal
## 5 C51ctr_specimen|C52ctr_specimen|C53ctr_specimen|C54ctr_specimen|C55ctr_specimen|C56ctr_specimen|C57ctr_specimen|L01cov_specimen|L03cov_specimen|L04cov_specimen|L04cov_specimen2|L05cov_specimen|L06cov_specimen|L07cov_specimen|L08cov_specimen|L09cov_specimen|L10cov_specimen|L11cov_specimen|L12cov_specimen|L13cov_specimen|L15cov_specimen|L16cov_specimen|L17cov_specimen|L18cov_specimen|L19cov_specimen|L21cov_specimen|L22cov_specimen
## 6 specimen_Patient 10_3CCF1L|specimen_Patient _11_3E4D1L|specimen_Patient_1_3533EL|specimen_Patient_2_3571DL|specimen_Patient_3_36186L|specimen_Patient_4_36639L|specimen_Patient_5_366C5L|specimen_Patient_6_37EACL|specimen_Patient_7_38FE7L|specimen_Patient_8_3BAE2L|specimen_Patient_9_3E5CFL
## disease
## 1 normal
## 2 normal
## 3 normal
## 4 , normal
## 5 COVID-19|lung disease|NA
## 6 endometrioid tumor|metastatic malignant neoplasm|ovarian neoplasm
## preservationMethod donorCount
## 1 NA 10
## 2 NA 9
## 3 NA 16
## 4 , NA 38
## 5 NA 26
## 6 NA 11
## developmentStage
## 1 human adult stage
## 2 adolescent stage|human adult stage
## 3 human adult stage
## 4 11th week post-fertilization human stage|12th week post-fertilization human stage|13th week post-fertilization human stage|14th week post-fertilization human stage|15th week post-fertilization human stage|16th week post-fertilization human stage|17th week post-fertilization human stage|18th week post-fertilization human stage|19th week post-fertilization human stage|20th week post-fertilization human stage|21st week post-fertilization human stage|22nd week post-fertilization human stage|9th week post-fertilization human stage|Carnegie stage 14|Carnegie stage 17|Carnegie stage 22|Carnegie stage 23
## 5 human adult stage
## 6 human adult stage
## genusSpecies biologicalSex selectedCellType catalog
## 1 Homo sapiens female|male NA, NA, NA dcp29
## 2 Homo sapiens female NA dcp29
## 3 Homo sapiens female NA dcp29
## 4 Homo sapiens female|male|unknown NA, NA dcp29
## 5 Homo sapiens female|male NA dcp29
## 6 Homo sapiens female NA, NA, NA dcp29
## entryId sourceId
## 1 ef1e3497-515e-4bbe-8d4c-10161854b699 130bc665-c364-467c-a5d1-2e6dbdb6ad13
## 2 923d3231-7295-4184-b3f6-c3082766a8c7 445d8c04-718b-4367-9616-38d106e4431b
## 3 9b876d31-0739-4e96-9846-f76e6a427279 d5994701-3915-4357-a3f7-d1b3d8ebb0ee
## 4 2fe3c60b-ac1a-4c61-9b59-f6556c0fce63 f52e9f35-cbbb-43f9-bb87-94f9d2788bf6
## 5 d7845650-f6b1-4b1c-b2fe-c0795416ba7b ab92ba53-7fad-4398-b370-6cf18499e626
## 6 9746f4e0-d3b2-4543-89b3-10288162851b ac789345-a9a0-4a29-921b-2d552b4f53d5
## sourceSpec
## 1 tdr:datarepo-4cca88b5:snapshot/hca_prod_ef1e3497515e4bbe8d4c10161854b699__20220118_dcp2_20230314_dcp25:/0
## 2 tdr:datarepo-64e86c6c:snapshot/hca_prod_923d323172954184b3f6c3082766a8c7__20220906_dcp2_20230314_dcp25:/0
## 3 tdr:datarepo-9f5be9ac:snapshot/hca_prod_9b876d3107394e969846f76e6a427279__20220906_dcp2_20230314_dcp25:/0
## 4 tdr:datarepo-573f4ced:snapshot/hca_prod_2fe3c60bac1a4c619b59f6556c0fce63__20220606_dcp2_20230314_dcp25:/0
## 5 tdr:datarepo-94746fdf:snapshot/hca_prod_d7845650f6b14b1cb2fec0795416ba7b__20220119_dcp2_20230314_dcp25:/0
## 6 tdr:datarepo-f4518d09:snapshot/hca_prod_9746f4e0d3b2454389b310288162851b__20230526_dcp2_20230530_dcp28:/0
## workflow
## 1 ap_matrixNormalization|ap_rawMatrixGeneration_hg19|ap_rawMatrixGeneration_hg38, ,
## 2
## 3 analysis_protocol_1|analysis_protocol_2, ,
## 4 processed_matrix_ap|raw_matrix_ap|visium_matrix_ap, , ,
## 5 analysis_file_generation|quality_control_filtering, ,
## 6 analysis_protocol_1|analysis_protocol_2, ,
## libraryConstructionApproach
## 1 , 10x 3' v2,
## 2 10x 3' v2,
## 3 , 10x 3' v2,
## 4 , , 10x 3' v2|10x 3' v3|10x 5' v1|Visium Spatial Gene Expression|scATAC-seq,
## 5 , 10x 3' v3,
## 6 , 10x 3' v3|10x scATAC-seq,
## nucleicAcidSource
## 1 , single cell,
## 2 single cell,
## 3 , single cell,
## 4 , , single cell,
## 5 , single nucleus,
## 6 , single cell|single nucleus,
## instrumentManufacturerModel pairedEnd cellLineID
## 1 , , Illumina NextSeq 500|Illumina NextSeq 550 , , FALSE
## 2 , Illumina NovaSeq 6000 , FALSE
## 3 , , Illumina HiSeq X , , FALSE
## 4 , , , Illumina HiSeq 4000|Illumina NovaSeq 6000 , , , FALSE
## 5 , , Illumina NovaSeq 6000 , , FALSE
## 6 , , Illumina NextSeq 500 , , FALSE
## cellLineType cellLinemodelOrgan
## 1
## 2
## 3
## 4
## 5
## 6
## organoidsID
## 1
## 2
## 3
## 4 HDBR-N13393_ASCL1_OE|HDBR-N13393_Ctrl_OE|HDBR-N13393_NEUROD1_OE|HDBR-N13393_NEUROG3_OE|HDBR-N13393_PAX9_OE|HDBR-N13393_RFX6_OE|HDBR-N13393_TFAP2A_OE|HDBR-N13393_deltaNp63alpha_OE|HDBR-N13393_organoid|HDBR1915_ASCL1_OE|HDBR1915_Ctrl_OE|HDBR1915_NEUROD1_OE|HDBR1915_NEUROG3_OE|HDBR1915_PAX9_OE|HDBR1915_RFX6_OE|HDBR1915_TFAP2A_OE|HDBR1915_deltaNp63alpha_OE|HDBR1915_organoid|HDBR2174_ASCL1_OE|HDBR2174_Ctrl_OE|HDBR2174_NEUROD1_OE|HDBR2174_NEUROG3_OE|HDBR2174_PAX9_OE|HDBR2174_RFX6_OE|HDBR2174_TFAP2A_OE|HDBR2174_deltaNp63alpha_OE|HDBR2174_organoid
## 5
## 6
## organoidsmodelOrgan organoidsmodelOrganPart lastModifiedDate
## 1 2021-12-01T16:14:19.802000Z
## 2 2022-08-08T11:11:06.854000Z
## 3 2022-08-23T12:48:58.239000Z
## 4 lung epithelial cell of lung 2022-05-20T09:34:37.057000Z
## 5 2021-09-30T17:49:31.807000Z
## 6 2023-05-19T21:22:40.516000Z
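The filtered data frame still contains multi-valued cells (for example `lung, heart|lung` in organ), so a further manual check often means testing whether a value occurs anywhere inside a cell. The following is a minimal base R sketch of such a post-filter. `has_value` and `toy.meta` are hypothetical; the toy rows copy a few values from the output above and do not come from a live query.

```r
# Toy rows imitating the layout of the filtered metadata (assumed, abbreviated).
toy.meta <- data.frame(
  projectShortname = c("healthyAirwaysAtlas", "AgingSkinAtlas", "FetalLungImmune"),
  organ = c("bronchus|nose|trachea", "skin of body", "lung, heart|lung"),
  libraryConstructionApproach = c(
    ", 10x 3' v2, ",
    "10x 3' v2, ",
    ", , 10x 3' v2|10x 3' v3|10x 5' v1, "
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each cell that contains at least one wanted
# value after splitting on "|" or "," and trimming whitespace.
has_value <- function(x, wanted) {
  vapply(strsplit(x, split = "\\||,"), function(parts) {
    any(tolower(trimws(parts)) %in% tolower(wanted))
  }, logical(1))
}

# Keep projects that sampled lung and used the 10x 3' v3 chemistry.
lung.v3 <- toy.meta[
  has_value(toy.meta$organ, "lung") &
    has_value(toy.meta$libraryConstructionApproach, "10x 3' v3"),
]
print(lung.v3$projectShortname)
```

Exact matching on split values avoids the false hits a plain substring search would give, such as `lung` matching `pair of lungs`.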
Download object
After manually checking the extracted metadata, users can download the specified objects with ParseHCA. The file.ext argument controls which object types are downloaded (choose from "rds", "rdata", "h5", "h5ad" and "loom").
The returned result is a data frame containing the failed projects. If it is not NULL, users can re-run ParseHCA with meta set to the returned result.
# no-run
ParseHCA(
  meta = hca.human.10x.projects,
  out.folder = "/Users/soyabean/Desktop/tmp/scdown/download_hca"
)
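The re-run step described above can be wrapped in a small loop. This is only a sketch: `retry_failed` is a hypothetical helper, not part of scfetch, and it assumes the downloader takes a `meta` argument and returns either NULL or a data frame of failed projects, as stated for ParseHCA. A stub downloader stands in for the real function so the example runs offline.

```r
# Hypothetical helper: keep calling `download_fun` on whatever it reports as
# failed, until nothing fails or `max.tries` attempts have been made.
retry_failed <- function(meta, download_fun, max.tries = 3, ...) {
  failed <- meta
  for (i in seq_len(max.tries)) {
    failed <- download_fun(meta = failed, ...)
    if (is.null(failed) || nrow(failed) == 0) {
      return(NULL)
    }
    message("Attempt ", i, ": ", nrow(failed), " project(s) failed")
  }
  failed
}

# Stub downloader for illustration: reports the first row as failed on the
# first call and succeeds (returns NULL) afterwards.
make_stub <- function() {
  calls <- 0
  function(meta, ...) {
    calls <<- calls + 1
    if (calls == 1) meta[1, , drop = FALSE] else NULL
  }
}

toy.meta <- data.frame(projectShortname = c("A", "B"), stringsAsFactors = FALSE)
still.failed <- retry_failed(toy.meta, make_stub())

# With scfetch the call might look like this (not run, path is a placeholder):
# retry_failed(hca.human.10x.projects, ParseHCA, out.folder = "path/to/download_hca")
```

Capping the number of attempts keeps a permanently unavailable project from looping forever. Whatever remains in the return value can be inspected by hand.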
The example structure of downloaded objects:
tree /Users/soyabean/Desktop/tmp/scdown/download_hca
## /Users/soyabean/Desktop/tmp/scdown/download_hca
## ├── CountAdded_PIP_global_object_for_cellxgene.h5ad
## ├── TICA_B_BCR.h5ad
## ├── adata_TILC_TCR_onlyseq.h5ad
## ├── adata_TILC_TCRgd_onlyseq.h5ad
## └── myeloid.h5ad
##
## 1 directory, 5 files
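The same check can be done from within R instead of `tree`. The sketch below counts files per supported extension in a folder. It builds a throwaway folder under `tempdir()` with empty placeholder files, so `demo.dir` and the file `notes.txt` are assumptions for illustration; point `list.files` at the real out.folder in practice.

```r
# Create a throwaway folder with empty placeholder files (illustration only).
demo.dir <- file.path(tempdir(), "download_hca_demo")
dir.create(demo.dir, showWarnings = FALSE)
invisible(file.create(
  file.path(demo.dir, c("myeloid.h5ad", "TICA_B_BCR.h5ad", "notes.txt"))
))

# List the files and tally those with a supported object extension.
supported <- c("rds", "rdata", "h5", "h5ad", "loom")
downloaded <- list.files(demo.dir, full.names = TRUE)
ext <- tolower(tools::file_ext(downloaded))
ext.counts <- table(ext[ext %in% supported])
print(ext.counts)
```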