
Introduction

GEfetch2R provides functions to download processed single-cell RNA-seq data from Zenodo, CELLxGENE, and the Human Cell Atlas, in formats including RDS, RData, h5ad, h5, and loom.

## Warning: replacing previous import 'LoomExperiment::import' by
## 'reticulate::import' when loading 'GEfetch2R'

The public resources currently supported and their returned values are:

| Resources | URL | Download Type | Returned values |
|---|---|---|---|
| Zenodo | https://zenodo.org/ | count matrix, rds, rdata, h5ad, etc. | SeuratObject (rds) or failed datasets |
| CELLxGENE | https://cellxgene.cziscience.com/ | rds, h5ad | SeuratObject (rds) or failed datasets |
| Human Cell Atlas | https://www.humancellatlas.org/ | rds, rdata, h5, h5ad, loom | SeuratObject (rds) or failed projects |

Zenodo

Zenodo hosts various types of processed objects, such as SeuratObjects that have been clustered and annotated and AnnData objects that contain processed results generated by scanpy.

Extract metadata

GEfetch2R provides ExtractZenodoMeta to extract dataset metadata, including the dataset title, description, available files, and their corresponding md5 checksums. Note that if access to a dataset is restricted, the returned data frame will be empty.

# library
library(GEfetch2R)

# single doi
zebrafish.df <- ExtractZenodoMeta(doi = "10.5281/zenodo.7243603")
# vector dois
multi.dois <- ExtractZenodoMeta(doi = c("1111", "10.5281/zenodo.7243603", "10.5281/zenodo.7244441"))
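
The vector above includes "1111", which is not a Zenodo DOI. A minimal sketch of screening a DOI vector before passing it on is given here; it assumes that the DOIs of interest follow the usual Zenodo form `10.5281/zenodo.<record id>`. The pattern and the variable names are illustrative and are not part of GEfetch2R.

```r
# Candidate DOIs, copied from the call above
dois <- c("1111", "10.5281/zenodo.7243603", "10.5281/zenodo.7244441")

# Assumption: a Zenodo DOI is the prefix 10.5281/zenodo. followed by digits
is.zenodo.doi <- grepl("^10\\.5281/zenodo\\.[0-9]+$", dois)

# Split the input into entries that match the pattern and entries that do not
valid.dois <- dois[is.zenodo.doi]
invalid.dois <- dois[!is.zenodo.doi]

print(valid.dois)
print(invalid.dois)
```

Reporting the rejected entries makes it easier to spot typos in a long DOI list before any request is sent.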

Show the metadata:

# single doi
zebrafish.df
##                              title
## 1 zebrafish scRNA data set objects
## 2 zebrafish scRNA data set objects
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                description
## 1 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Farrel et al. 2018, Wagner et al. 2018 and Qiu et al. 2022.</p>
## 2 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Farrel et al. 2018, Wagner et al. 2018 and Qiu et al. 2022.</p>
##                                                                         url
## 1  https://zenodo.org/api/records/7243603/files/zebrafish_data.h5ad/content
## 2 https://zenodo.org/api/records/7243603/files/zebrafish_data.RData/content
##               filename                              md5   license
## 1  zebrafish_data.h5ad 124f2229128918b411a7dc7931558f97 cc-by-4.0
## 2 zebrafish_data.RData a08c3ebd285b370fcf34cf2f8f9bdb59 cc-by-4.0
# vector dois
multi.dois
##                              title
## 1 zebrafish scRNA data set objects
## 2 zebrafish scRNA data set objects
## 3      frog scRNA data set objects
## 4      frog scRNA data set objects
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                description
## 1 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Farrel et al. 2018, Wagner et al. 2018 and Qiu et al. 2022.</p>
## 2 <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Farrel et al. 2018, Wagner et al. 2018 and Qiu et al. 2022.</p>
## 3                     <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Briggs et al. 2018 and Qiu et al. 2022.</p>
## 4                     <p>Combined and converted scRNA data from http://tome.gs.washington.edu/ (Qiu et al. 2022), see a detailed description of the study here: https://www.nature.com/articles/s41588-022-01018-x</p>\n\n<p>Data were downloaded from http://tome.gs.washington.edu/ as R rds files, combined into a single Seurat object and converted into loom and AnnData (h5ad) files to be able to analyse with e.g. python scanpy package.</p>\n\n<p>If you use this data, please cite Briggs et al. 2018 and Qiu et al. 2022.</p>
##                                                                         url
## 1  https://zenodo.org/api/records/7243603/files/zebrafish_data.h5ad/content
## 2 https://zenodo.org/api/records/7243603/files/zebrafish_data.RData/content
## 3       https://zenodo.org/api/records/7244441/files/frog_data.h5ad/content
## 4      https://zenodo.org/api/records/7244441/files/frog_data.RData/content
##               filename                              md5   license
## 1  zebrafish_data.h5ad 124f2229128918b411a7dc7931558f97 cc-by-4.0
## 2 zebrafish_data.RData a08c3ebd285b370fcf34cf2f8f9bdb59 cc-by-4.0
## 3       frog_data.h5ad 7be7d6ff024ab2c8579b4d0edb2428e3 cc-by-4.0
## 4      frog_data.RData c80f46320c0cff9e341bed195f12c3b1 cc-by-4.0
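
Because the metadata is an ordinary data frame, it can be inspected with base R before anything is downloaded. The sketch below rebuilds a small copy of the `filename` and `md5` columns from the output above (so that it runs without network access) and derives a lower-case extension for each file. The `ext` column and the variable names are illustrative additions, not something ExtractZenodoMeta returns.

```r
# Toy copy of two columns from the printed metadata above
meta <- data.frame(
  filename = c("zebrafish_data.h5ad", "zebrafish_data.RData",
               "frog_data.h5ad", "frog_data.RData"),
  md5 = c("124f2229128918b411a7dc7931558f97", "a08c3ebd285b370fcf34cf2f8f9bdb59",
          "7be7d6ff024ab2c8579b4d0edb2428e3", "c80f46320c0cff9e341bed195f12c3b1"),
  stringsAsFactors = FALSE
)

# Derive the file extension and lower-case it (e.g. "RData" becomes "rdata")
meta$ext <- tolower(tools::file_ext(meta$filename))

# Count how many files are available per format
ext.counts <- table(meta$ext)
print(ext.counts)

# Keep only the RData files
rdata.files <- meta[meta$ext == "rdata", ]
print(rdata.files$filename)
```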

Download object and load to R

After manually checking the extracted metadata, users can download the specified objects with ParseZenodo. The file.ext argument controls which objects are downloaded; formats must be given in lower case (e.g. rds/rdata/h5ad).

ParseZenodo returns either a data frame listing the objects that failed to download or, when file.ext is rds and return.seu = TRUE (as in the second call below), a SeuratObject. If a data frame is returned, users can re-run ParseZenodo with doi.df set to that value.

# download objects
multi.dois.parse <- ParseZenodo(
  doi = c("1111", "10.5281/zenodo.7243603", "10.5281/zenodo.7244441"),
  file.ext = c("rdata"),
  out.folder = "/Volumes/soyabean/GEfetch2R/download_zenodo"
)

# return SeuratObject
single.doi.parse.seu <- ParseZenodo(
  doi = "10.5281/zenodo.8011282",
  file.ext = c("rds"), return.seu = TRUE,
  out.folder = "/Volumes/soyabean/GEfetch2R/download_zenodo"
)

Show the returned SeuratObject:

single.doi.parse.seu
## An object of class Seurat 
## 19594 features across 9219 samples within 2 assays 
## Active assay: RNA (17594 features, 0 variable features)
##  1 other assay present: integrated
##  2 dimensional reductions calculated: pca, umap

The structure of downloaded objects:

tree /Volumes/soyabean/GEfetch2R/download_zenodo
## /Volumes/soyabean/GEfetch2R/download_zenodo
## ├── PyMTM_immune_scRNA.rds
## ├── frog_data.RData
## └── zebrafish_data.RData
## 
## 1 directory, 3 files
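
Since the extracted metadata includes an md5 checksum for every file, downloaded files can be checked against it. The following is a minimal sketch using `tools::md5sum` from base R; `VerifyMd5` is a hypothetical helper written for this example (it is not a GEfetch2R function), and a temporary file stands in for a real download so the code runs anywhere.

```r
# Hypothetical helper: TRUE where a local file's md5 equals the expected md5.
# Missing or unreadable files yield NA from tools::md5sum and are reported as FALSE.
VerifyMd5 <- function(paths, expected.md5) {
  actual <- unname(tools::md5sum(paths))
  !is.na(actual) & tolower(actual) == tolower(expected.md5)
}

# A temporary file stands in for a downloaded object
tmp.file <- tempfile(fileext = ".RData")
writeLines("placeholder content", tmp.file)
good.md5 <- unname(tools::md5sum(tmp.file))

# Matching checksum
print(VerifyMd5(tmp.file, good.md5))
# Deliberately wrong checksum
print(VerifyMd5(tmp.file, "00000000000000000000000000000000"))
```

With real data, the same helper could be applied to the paths of the downloaded files and the `md5` column of the metadata data frame, assuming the rows and files are in the same order.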

CELLxGENE

CELLxGENE is a web server hosting 1598 single-cell datasets, where users can explore, download, and upload their own datasets. Datasets downloaded from CELLxGENE come in two formats: h5ad (AnnData v0.8) and rds (Seurat v4).

CELLxGENE provides an R package (cellxgene.census) for accessing the data, but it is not always updated promptly. GEfetch2R can also access CELLxGENE through cellxgene.census (use.census = TRUE).

Show available datasets

GEfetch2R provides ShowCELLxGENEDatasets to extract dataset metadata, including dataset title, description, contact, organism, ethnicity, sex, tissue, disease, assay, suspension type, cell type, etc.

# all available datasets
all.cellxgene.datasets <- ShowCELLxGENEDatasets()
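
Several columns of this data frame (for example `disease` and `tissue`) store multiple values in one comma-separated string, so an exact comparison with `==` would miss datasets that list more than one term. The sketch below shows one way to match a single term, using a three-row toy data frame whose values are copied from the output in this section. `HasTerm` is a hypothetical helper, and the approach assumes the terms themselves contain no commas, which may not hold for every column (some cell type names include commas).

```r
# Toy subset of the metadata; values copied from the printed output in this section
toy <- data.frame(
  title = c("Plaque Atlas",
            "Single-cell reconstruction of follicular remodeling in the human adult ovary",
            "Molecular characterization of selectively vulnerable neurons in Alzheimer's Disease"),
  disease = c("atherosclerosis", "normal", "Alzheimer disease, normal"),
  tissue = c("carotid artery segment, coronary artery, femoral artery",
             "ovary", "entorhinal cortex"),
  cell_count = c(184623, 20676, 8168),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where `term` is one of the comma-separated entries
HasTerm <- function(x, term) {
  vapply(strsplit(x, ", ", fixed = TRUE),
         function(terms) term %in% terms,
         logical(1))
}

# Datasets that include normal samples
normal.sets <- toy[HasTerm(toy$disease, "normal"), ]
print(normal.sets$title)

# Datasets that include coronary artery tissue
print(HasTerm(toy$tissue, "coronary artery"))
```

Splitting into terms, rather than searching for a substring, avoids accidental matches where one term is contained in another.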

Show the metadata:

head(all.cellxgene.datasets)
##                                                                                 title
## 1                                                                        Plaque Atlas
## 2        Single-cell reconstruction of follicular remodeling in the human adult ovary
## 3 Molecular characterization of selectively vulnerable neurons in Alzheimer's Disease
## 4 Molecular characterization of selectively vulnerable neurons in Alzheimer's Disease
## 5 Molecular characterization of selectively vulnerable neurons in Alzheimer's Disease
## 6 Molecular characterization of selectively vulnerable neurons in Alzheimer's Disease
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                description
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           A curation of publicly available atherosclerotic plaque scRNAseq datasets including annotation and validation.
## 2 The ovary is perhaps the most dynamic organ in the human body, only rivaled by the uterus. The molecular mechanisms that regulate follicular growth and regression, ensuring ovarian tissue homeostasis, remain elusive. We have performed single-cell RNA-sequencing using human adult ovaries to provide a map of the molecular signature of growing and regressing follicular populations. We have identified different types of granulosa and theca cells and detected local production of components of the complement system by (atretic) theca cells and stromal cells. We also have detected a mixture of adaptive and innate immune cells, as well as several types of endothelial and smooth muscle cells to aid the remodeling process. Our results highlight the relevance of mapping whole adult organs at the single-cell level and reflect ongoing efforts to map the human body. The association between complement system and follicular remodeling may provide key insights in reproductive biology and (in)fertility.
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Single-nuclei RNA sequencing of caudal entorhinal cortex and superior frontal gyrus from individuals spanning the neuropathological progression of AD
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Single-nuclei RNA sequencing of caudal entorhinal cortex and superior frontal gyrus from individuals spanning the neuropathological progression of AD
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Single-nuclei RNA sequencing of caudal entorhinal cortex and superior frontal gyrus from individuals spanning the neuropathological progression of AD
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Single-nuclei RNA sequencing of caudal entorhinal cortex and superior frontal gyrus from individuals spanning the neuropathological progression of AD
##                          doi                    contact
## 1  10.1101/2024.09.11.612431          Korbinian Träuble
## 2 10.1038/s41467-019-11036-9 S. M. Chuva de Sousa Lopes
## 3 10.1038/s41593-020-00764-7            Martin Kampmann
## 4 10.1038/s41593-020-00764-7            Martin Kampmann
## 5 10.1038/s41593-020-00764-7            Martin Kampmann
## 6 10.1038/s41593-020-00764-7            Martin Kampmann
##                            contact_email                        collection_id
## 1 korbinian.traeuble@helmholtz-munich.de db70986c-7d91-49fe-a399-a4730be394ac
## 2                          lopes@lumc.nl 2902f08c-f83c-470e-a541-e463e25e5058
## 3               Martin.Kampmann@ucsf.edu 180bff9c-c8a5-4539-b13b-ddbc00d643e6
## 4               Martin.Kampmann@ucsf.edu 180bff9c-c8a5-4539-b13b-ddbc00d643e6
## 5               Martin.Kampmann@ucsf.edu 180bff9c-c8a5-4539-b13b-ddbc00d643e6
## 6               Martin.Kampmann@ucsf.edu 180bff9c-c8a5-4539-b13b-ddbc00d643e6
##                                                                      collection_url
## 1 https://cellxgene.cziscience.com/collections/db70986c-7d91-49fe-a399-a4730be394ac
## 2 https://cellxgene.cziscience.com/collections/2902f08c-f83c-470e-a541-e463e25e5058
## 3 https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
## 4 https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
## 5 https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
## 6 https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
##   consortia            curator_name visibility                assay
## 1                     James Chaffer     PUBLIC 10x 3' v2, 10x 3' v3
## 2           Jennifer Yu-Sheng Chien     PUBLIC            10x 3' v2
## 3           Jennifer Yu-Sheng Chien     PUBLIC            10x 3' v2
## 4           Jennifer Yu-Sheng Chien     PUBLIC            10x 3' v2
## 5           Jennifer Yu-Sheng Chien     PUBLIC            10x 3' v2
## 6           Jennifer Yu-Sheng Chien     PUBLIC            10x 3' v2
##                                                                                                                                                                                                       assets
## 1 4221495897, 1216838411, H5AD, RDS, https://datasets.cellxgene.cziscience.com/999a6b92-46ca-498e-b1ee-5fc43b6988ef.h5ad, https://datasets.cellxgene.cziscience.com/999a6b92-46ca-498e-b1ee-5fc43b6988ef.rds
## 2    105862325, 96479066, H5AD, RDS, https://datasets.cellxgene.cziscience.com/2afef4bd-99af-41f4-b507-f80718b6a8ef.h5ad, https://datasets.cellxgene.cziscience.com/2afef4bd-99af-41f4-b507-f80718b6a8ef.rds
## 3     17724244, 14805677, H5AD, RDS, https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.h5ad, https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.rds
## 4     47491494, 46270555, H5AD, RDS, https://datasets.cellxgene.cziscience.com/000198ac-27c7-4b9f-9fd4-6362a5b91596.h5ad, https://datasets.cellxgene.cziscience.com/000198ac-27c7-4b9f-9fd4-6362a5b91596.rds
## 5       9460591, 7134383, H5AD, RDS, https://datasets.cellxgene.cziscience.com/017837df-b8be-4a4f-bc08-c5f14bd8815a.h5ad, https://datasets.cellxgene.cziscience.com/017837df-b8be-4a4f-bc08-c5f14bd8815a.rds
## 6     16025334, 13487453, H5AD, RDS, https://datasets.cellxgene.cziscience.com/3512034b-9a5c-4dba-9dfb-cead9f9840dc.h5ad, https://datasets.cellxgene.cziscience.com/3512034b-9a5c-4dba-9dfb-cead9f9840dc.rds
##   cell_count
## 1     184623
## 2      20676
## 3       8168
## 4       8362
## 5       3799
## 6       5970
##                                                                                                                                             cell_type
## 1 B cell, T cell, dendritic cell, endothelial cell, fibroblast, macrophage, mast cell, monocyte, natural killer cell, plasma cell, smooth muscle cell
## 2  B cell, T cell, endothelial cell, granulosa cell, innate lymphoid cell, natural killer cell, smooth muscle cell, stromal cell of ovary, theca cell
## 3                                                                                                                                     oligodendrocyte
## 4                                                                                                                                glutamatergic neuron
## 5                                                                                                                              mature microglial cell
## 6                                                                                                                                    mature astrocyte
##                                                                                                                                                                                                                                                                                                         citation
## 1  Publication: https://doi.org/10.1101/2024.09.11.612431 Dataset Version: https://datasets.cellxgene.cziscience.com/999a6b92-46ca-498e-b1ee-5fc43b6988ef.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/db70986c-7d91-49fe-a399-a4730be394ac
## 2 Publication: https://doi.org/10.1038/s41467-019-11036-9 Dataset Version: https://datasets.cellxgene.cziscience.com/2afef4bd-99af-41f4-b507-f80718b6a8ef.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/2902f08c-f83c-470e-a541-e463e25e5058
## 3 Publication: https://doi.org/10.1038/s41593-020-00764-7 Dataset Version: https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
## 4 Publication: https://doi.org/10.1038/s41593-020-00764-7 Dataset Version: https://datasets.cellxgene.cziscience.com/000198ac-27c7-4b9f-9fd4-6362a5b91596.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
## 5 Publication: https://doi.org/10.1038/s41593-020-00764-7 Dataset Version: https://datasets.cellxgene.cziscience.com/017837df-b8be-4a4f-bc08-c5f14bd8815a.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
## 6 Publication: https://doi.org/10.1038/s41593-020-00764-7 Dataset Version: https://datasets.cellxgene.cziscience.com/3512034b-9a5c-4dba-9dfb-cead9f9840dc.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
##                             dataset_id                   dataset_version_id
## 1 72955cdb-bd92-4135-aa52-21f33f9640db 999a6b92-46ca-498e-b1ee-5fc43b6988ef
## 2 1f1c5c14-5949-4c81-b28e-b272e271b672 2afef4bd-99af-41f4-b507-f80718b6a8ef
## 3 f9ad5649-f372-43e1-a3a8-423383e5a8a2 d9db936c-41c6-4398-a8c4-a9dca34a225e
## 4 cd77258f-b08b-4c89-b93f-6e6f146b1a4d 000198ac-27c7-4b9f-9fd4-6362a5b91596
## 5 bdacc907-7c26-419f-8808-969eab3ca2e8 017837df-b8be-4a4f-bc08-c5f14bd8815a
## 6 b94e3bdf-a385-49cc-b312-7a63cc28b77a 3512034b-9a5c-4dba-9dfb-cead9f9840dc
##                                                                                                                                                                                       development_stage
## 1 54-year-old stage, 58-year-old stage, 65-year-old stage, 67-year-old stage, 74-year-old stage, 76-year-old stage, 77-year-old stage, 82-year-old stage, 83-year-old stage, 87-year-old stage, unknown
## 2                                                                                                                                                                                           adult stage
## 3                                       50-year-old stage, 60-year-old stage, 71-year-old stage, 72-year-old stage, 77-year-old stage, 80 year-old and over stage, 82-year-old stage, 87-year-old stage
## 4                                       50-year-old stage, 60-year-old stage, 71-year-old stage, 72-year-old stage, 77-year-old stage, 80 year-old and over stage, 82-year-old stage, 87-year-old stage
## 5                                       50-year-old stage, 60-year-old stage, 71-year-old stage, 72-year-old stage, 77-year-old stage, 80 year-old and over stage, 82-year-old stage, 87-year-old stage
## 6                                       50-year-old stage, 60-year-old stage, 71-year-old stage, 72-year-old stage, 77-year-old stage, 80 year-old and over stage, 82-year-old stage, 87-year-old stage
##                     disease
## 1           atherosclerosis
## 2                    normal
## 3 Alzheimer disease, normal
## 4 Alzheimer disease, normal
## 5 Alzheimer disease, normal
## 6 Alzheimer disease, normal
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        donor_id
## 1 2_Pauli, 3_Pauli, 4_Pauli, 5_Pauli, 6_Pauli, 7_Pauli, 8_Pauli, 9_Pauli, 1_Alsaigh, 2_Alsaigh, 3_Alsaigh, 1_Fernandez, 2_Fernandez, 3_Fernandez, 3A_Fernandez, 4_Fernandez, 5_Fernandez, 6_Fernandez, 1_Wirka, 2_Wirka, 3_Wirka, 4_Wirka, 1_Emoto_ACS, 2_Emoto_SAP, 3_Chowdhury, 6_Chowdhury, 10_Chowdhury, 4_Chowdhury, 5_Chowdhury, 12_Chowdhury, 11_Chowdhury, 1_Chowdhury, 2_Chowdhury, 7_Chowdhury, 9_Chowdhury, 8_Chowdhury, 1_Pan, 2_Pan, 3_Pan, 1_Dib, 2_Dib, 3_Dib, 4_Dib, 5_Dib, 6_Dib, 1_2_Slysz, 3_Slysz, 4_Slysz, 1_2_3_Slysz_femoral, 4_Slysz_femoral, 5_Slysz_femoral, 6_Slysz_femoral, 7_Slysz_femoral, 8_Slysz_femoral, 9_Slysz_femoral, 1_Jaiswal, 2_Jaiswal
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  pat9, pat7, pat0, pat3, pat2
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 3, 1, 2, 5, 6, 7, 4, 8, 9, 10
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 3, 1, 2, 5, 6, 7, 4, 8, 9, 10
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 3, 1, 2, 5, 6, 7, 4, 8, 9, 10
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 3, 1, 2, 5, 6, 7, 4, 8, 9, 10
##                             embeddings
## 1                     X_scpoli, X_umap
## 2 X_mnn.correct, X_pca, X_tsne, X_umap
## 3         X_cca, X_cca.aligned, X_tsne
## 4         X_cca, X_cca.aligned, X_tsne
## 5         X_cca, X_cca.aligned, X_tsne
## 6         X_cca, X_cca.aligned, X_tsne
##                                                                   explorer_url
## 1 https://cellxgene.cziscience.com/e/72955cdb-bd92-4135-aa52-21f33f9640db.cxg/
## 2 https://cellxgene.cziscience.com/e/1f1c5c14-5949-4c81-b28e-b272e271b672.cxg/
## 3 https://cellxgene.cziscience.com/e/f9ad5649-f372-43e1-a3a8-423383e5a8a2.cxg/
## 4 https://cellxgene.cziscience.com/e/cd77258f-b08b-4c89-b93f-6e6f146b1a4d.cxg/
## 5 https://cellxgene.cziscience.com/e/bdacc907-7c26-419f-8808-969eab3ca2e8.cxg/
## 6 https://cellxgene.cziscience.com/e/b94e3bdf-a385-49cc-b312-7a63cc28b77a.cxg/
##   feature_biotype feature_count feature_reference is_primary_data
## 1            gene         28033    NCBITaxon:9606            TRUE
## 2            gene         32839    NCBITaxon:9606            TRUE
## 3            gene         32743    NCBITaxon:9606           FALSE
## 4            gene         32743    NCBITaxon:9606           FALSE
## 5            gene         32743    NCBITaxon:9606           FALSE
## 6            gene         32743    NCBITaxon:9606           FALSE
##   mean_genes_per_cell     organism primary_cell_count              published_at
## 1           1390.0576 Homo sapiens             184623 2024-09-24T15:53:59+00:00
## 2            942.2772 Homo sapiens              20676 2022-06-20T09:32:24+00:00
## 3            599.5956 Homo sapiens                  0 2020-11-20T13:39:38+00:00
## 4           2165.4574 Homo sapiens                  0 2020-11-20T13:39:40+00:00
## 5            585.7452 Homo sapiens                  0 2020-11-20T13:39:43+00:00
## 6            777.2630 Homo sapiens                  0 2020-11-20T13:39:41+00:00
##   raw_data_location                revised_at schema_version
## 1             raw.X 2024-10-10T20:20:32+00:00          5.2.0
## 2             raw.X 2024-10-10T20:20:32+00:00          5.2.0
## 3                 X 2024-10-10T20:20:33+00:00          5.2.0
## 4                 X 2024-10-10T20:20:33+00:00          5.2.0
## 5                 X 2024-10-10T20:20:33+00:00          5.2.0
## 6                 X 2024-10-10T20:20:33+00:00          5.2.0
##   self_reported_ethnicity                   sex spatial suspension_type
## 1       European, unknown female, male, unknown      NA            cell
## 2                 unknown                female      NA            cell
## 3                 unknown                  male      NA         nucleus
## 4                 unknown                  male      NA         nucleus
## 5                 unknown                  male      NA         nucleus
## 6                 unknown                  male      NA         nucleus
##                                                    tissue
## 1 carotid artery segment, coronary artery, femoral artery
## 2                                                   ovary
## 3                                       entorhinal cortex
## 4                                       entorhinal cortex
## 5                                  superior frontal gyrus
## 6                                  superior frontal gyrus
##                                                                                          dataset_description
## 1                                                                                               plaque atlas
## 2                               Single-cell reconstruction of follicular remodeling in the human adult ovary
## 3    Molecular characterization of selectively vulnerable neurons in Alzheimer’s Disease: EC oligodendrocyte
## 4 Molecular characterization of selectively vulnerable neurons in Alzheimer’s Disease: EC excitatory neurons
## 5         Molecular characterization of selectively vulnerable neurons in Alzheimer’s Disease: SFG microglia
## 6        Molecular characterization of selectively vulnerable neurons in Alzheimer’s Disease: SFG astrocytes
##   tombstone x_approximate_distribution spatial.has_fullres spatial.is_single
## 1     FALSE                       <NA>                  NA                NA
## 2     FALSE                       <NA>                  NA                NA
## 3     FALSE                       <NA>                  NA                NA
## 4     FALSE                       <NA>                  NA                NA
## 5     FALSE                       <NA>                  NA                NA
## 6     FALSE                       <NA>                  NA                NA
##   batch_condition default_embedding
## 1                              <NA>
## 2                              <NA>
## 3                              <NA>
## 4                              <NA>
## 5                              <NA>
## 6                              <NA>
##                                                                               rds_id
## 1 https://datasets.cellxgene.cziscience.com/999a6b92-46ca-498e-b1ee-5fc43b6988ef.rds
## 2 https://datasets.cellxgene.cziscience.com/2afef4bd-99af-41f4-b507-f80718b6a8ef.rds
## 3 https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.rds
## 4 https://datasets.cellxgene.cziscience.com/000198ac-27c7-4b9f-9fd4-6362a5b91596.rds
## 5 https://datasets.cellxgene.cziscience.com/017837df-b8be-4a4f-bc08-c5f14bd8815a.rds
## 6 https://datasets.cellxgene.cziscience.com/3512034b-9a5c-4dba-9dfb-cead9f9840dc.rds
##                                                                               h5ad_id
## 1 https://datasets.cellxgene.cziscience.com/999a6b92-46ca-498e-b1ee-5fc43b6988ef.h5ad
## 2 https://datasets.cellxgene.cziscience.com/2afef4bd-99af-41f4-b507-f80718b6a8ef.h5ad
## 3 https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.h5ad
## 4 https://datasets.cellxgene.cziscience.com/000198ac-27c7-4b9f-9fd4-6362a5b91596.h5ad
## 5 https://datasets.cellxgene.cziscience.com/017837df-b8be-4a4f-bc08-c5f14bd8815a.h5ad
## 6 https://datasets.cellxgene.cziscience.com/3512034b-9a5c-4dba-9dfb-cead9f9840dc.h5ad

Summary attributes

GEfetch2R provides StatDBAttribute to summarize the attributes of CELLxGENE datasets:

StatDBAttribute(
  df = all.cellxgene.datasets, filter = c("organism", "sex", "disease"),
  database = "CELLxGENE", combine = TRUE
)
## # A tibble: 343 × 4
## # Groups:   organism, sex [18]
##    organism     sex     disease    Num
##    <chr>        <chr>   <chr>    <int>
##  1 homo sapiens male    normal     856
##  2 homo sapiens female  normal     691
##  3 mus musculus male    normal     191
##  4 homo sapiens unknown normal     146
##  5 mus musculus female  normal     143
##  6 mus musculus unknown normal     123
##  7 homo sapiens female  covid-19    50
##  8 homo sapiens female  dementia    50
##  9 homo sapiens male    dementia    50
## 10 homo sapiens male    covid-19    42
## # ℹ 333 more rows
# # use cellxgene.census
# StatDBAttribute(filter = c("disease", "tissue", "cell_type"), database = "CELLxGENE",
#                 use.census = TRUE, organism = "homo_sapiens")
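The counts above come from metadata fields that can hold several comma-separated values per dataset (for example `female, male, unknown`). The following is a minimal base R sketch of one plausible way to tally such combinations on a made-up data frame. It is not StatDBAttribute's implementation; the helper names `split_values` and `count_combinations` and the toy rows are assumptions introduced for illustration.

```r
# Toy metadata with comma-separated, multi-valued fields (made-up rows)
toy.meta <- data.frame(
  organism = c("Homo sapiens", "Homo sapiens", "Mus musculus"),
  sex = c("female, male", "male", "female"),
  disease = c("normal", "Alzheimer disease, normal", "normal"),
  stringsAsFactors = FALSE
)

# Split one comma-separated field into trimmed, lower-case values
split_values <- function(x) {
  tolower(trimws(strsplit(x, ",", fixed = TRUE)[[1]]))
}

# Expand every row into all combinations of its values, then count them
count_combinations <- function(df, cols) {
  expanded <- lapply(seq_len(nrow(df)), function(i) {
    expand.grid(lapply(df[i, cols, drop = FALSE], split_values),
      stringsAsFactors = FALSE
    )
  })
  expanded <- do.call(rbind, expanded)
  counts <- aggregate(Num ~ ., data = cbind(expanded, Num = 1L), FUN = sum)
  # most frequent combinations first
  counts[order(-counts$Num), ]
}

toy.counts <- count_combinations(toy.meta, c("organism", "sex", "disease"))
toy.counts
```

In this sketch a dataset with two sexes and one disease contributes to two rows of the summary, so the `Num` column can add up to more than the number of datasets.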

Extract metadata

GEfetch2R provides ExtractCELLxGENEMeta to filter dataset metadata. The available values of each attribute (except cell number) can be obtained with StatDBAttribute:

# human 10x v2 and v3 datasets
human.10x.cellxgene.meta <- ExtractCELLxGENEMeta(
  all.samples.df = all.cellxgene.datasets,
  assay = c("10x 3' v2", "10x 3' v3"), organism = "Homo sapiens"
)
# subset
cellxgene.down.meta <- human.10x.cellxgene.meta[human.10x.cellxgene.meta$cell_type == "oligodendrocyte" &
  human.10x.cellxgene.meta$tissue == "entorhinal cortex", ]

Show the metadata:

head(cellxgene.down.meta)
##                                                                                 title
## 3 Molecular characterization of selectively vulnerable neurons in Alzheimer's Disease
##                                                                                                                                             description
## 3 Single-nuclei RNA sequencing of caudal entorhinal cortex and superior frontal gyrus from individuals spanning the neuropathological progression of AD
##                          doi         contact            contact_email
## 3 10.1038/s41593-020-00764-7 Martin Kampmann Martin.Kampmann@ucsf.edu
##                          collection_id
## 3 180bff9c-c8a5-4539-b13b-ddbc00d643e6
##                                                                      collection_url
## 3 https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
##   consortia            curator_name visibility     assay
## 3           Jennifer Yu-Sheng Chien     PUBLIC 10x 3' v2
##                                                                                                                                                                                                   assets
## 3 17724244, 14805677, H5AD, RDS, https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.h5ad, https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.rds
##   cell_count       cell_type
## 3       8168 oligodendrocyte
##                                                                                                                                                                                                                                                                                                         citation
## 3 Publication: https://doi.org/10.1038/s41593-020-00764-7 Dataset Version: https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.h5ad curated and distributed by CZ CELLxGENE Discover in Collection: https://cellxgene.cziscience.com/collections/180bff9c-c8a5-4539-b13b-ddbc00d643e6
##                             dataset_id                   dataset_version_id
## 3 f9ad5649-f372-43e1-a3a8-423383e5a8a2 d9db936c-41c6-4398-a8c4-a9dca34a225e
##                                                                                                                                                 development_stage
## 3 50-year-old stage, 60-year-old stage, 71-year-old stage, 72-year-old stage, 77-year-old stage, 80 year-old and over stage, 82-year-old stage, 87-year-old stage
##                     disease                      donor_id
## 3 Alzheimer disease, normal 3, 1, 2, 5, 6, 7, 4, 8, 9, 10
##                     embeddings
## 3 X_cca, X_cca.aligned, X_tsne
##                                                                   explorer_url
## 3 https://cellxgene.cziscience.com/e/f9ad5649-f372-43e1-a3a8-423383e5a8a2.cxg/
##   feature_biotype feature_count feature_reference is_primary_data
## 3            gene         32743    NCBITaxon:9606           FALSE
##   mean_genes_per_cell     organism primary_cell_count              published_at
## 3            599.5956 Homo sapiens                  0 2020-11-20T13:39:38+00:00
##   raw_data_location                revised_at schema_version
## 3                 X 2024-10-10T20:20:33+00:00          5.2.0
##   self_reported_ethnicity  sex spatial suspension_type            tissue
## 3                 unknown male      NA         nucleus entorhinal cortex
##                                                                                       dataset_description
## 3 Molecular characterization of selectively vulnerable neurons in Alzheimer’s Disease: EC oligodendrocyte
##   tombstone x_approximate_distribution spatial.has_fullres spatial.is_single
## 3     FALSE                       <NA>                  NA                NA
##   batch_condition default_embedding
## 3                              <NA>
##                                                                               rds_id
## 3 https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.rds
##                                                                               h5ad_id
## 3 https://datasets.cellxgene.cziscience.com/d9db936c-41c6-4398-a8c4-a9dca34a225e.h5ad
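The subset above uses `==`, which only keeps rows whose field holds exactly one value. Because several columns in this metadata are comma-separated (see `disease` or `donor_id`), a looser match may sometimes be wanted. The sketch below shows the difference on a made-up data frame; the helper `has_value` is hypothetical and not part of GEfetch2R.

```r
# Toy metadata (made-up rows); the second row lists two cell types
toy.meta <- data.frame(
  cell_type = c("oligodendrocyte", "astrocyte, oligodendrocyte", "microglial cell"),
  tissue = c("entorhinal cortex", "entorhinal cortex", "superior frontal gyrus"),
  stringsAsFactors = FALSE
)

# TRUE where `value` is one of the comma-separated entries of `field`
has_value <- function(field, value) {
  vapply(
    strsplit(field, ",", fixed = TRUE),
    function(v) value %in% trimws(v),
    logical(1)
  )
}

# exact match: the field must equal the value
exact <- toy.meta[toy.meta$cell_type == "oligodendrocyte" &
  toy.meta$tissue == "entorhinal cortex", ]

# membership match: the value may be one of several entries
loose <- toy.meta[has_value(toy.meta$cell_type, "oligodendrocyte") &
  has_value(toy.meta$tissue, "entorhinal cortex"), ]

nrow(exact)
nrow(loose)
```

The exact match keeps one row and the membership match keeps two. Which behaviour is appropriate depends on whether mixed datasets should be downloaded.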

Download object and load to R

After manually checking the extracted metadata, users can download the specified objects with ParseCELLxGENE. The type of the downloaded objects is controlled by file.ext (choose from "rds" and "h5ad").

The returned value is either a dataframe containing the objects that failed to download or a SeuratObject (if file.ext is rds). If a dataframe is returned, users can re-run ParseCELLxGENE with meta set to the returned value.

When using cellxgene.census, users can subset by cell metadata and by genes.

# download objects
cellxgene.down <- ParseCELLxGENE(
  meta = cellxgene.down.meta, file.ext = "rds",
  out.folder = "/Volumes/soyabean/GEfetch2R/download_cellxgene"
)

# return SeuratObject
cellxgene.down.seu <- ParseCELLxGENE(
  meta = cellxgene.down.meta, file.ext = "rds", return.seu = TRUE,
  obs.value.filter = "cell_type == 'oligodendrocyte' & disease == 'Alzheimer disease'",
  obs.keys = c("cell_type", "disease", "sex", "suspension_type", "development_stage"),
  out.folder = "/Volumes/soyabean/GEfetch2R/download_cellxgene"
)

# use cellxgene.census (supports subsetting, but its data may lag behind CELLxGENE)
cellxgene.down.census <- ParseCELLxGENE(
  use.census = TRUE, organism = "Homo sapiens",
  obs.value.filter = "cell_type == 'B cell' & tissue_general == 'lung' & disease == 'COVID-19'",
  obs.keys = c("cell_type", "tissue_general", "disease", "sex"),
  include.genes = c("ENSG00000161798", "ENSG00000188229")
)
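Re-running with the returned dataframe can be wrapped in a small retry loop. The sketch below is generic: `retry_failed` and `mock_fetch` are hypothetical names, and it assumes that a non-empty data frame signals failed objects while any other return value means the download finished. In real use, `fetch` could be a function that wraps ParseCELLxGENE and passes its argument as `meta`.

```r
# Re-run a download function until nothing fails or the attempts run out.
# `fetch` takes a metadata data frame and returns either a data frame of
# failed rows or something else (assumed here to mean success).
retry_failed <- function(fetch, meta, max.tries = 3) {
  for (i in seq_len(max.tries)) {
    res <- fetch(meta)
    # anything other than a non-empty data frame counts as "done"
    if (!is.data.frame(res) || nrow(res) == 0) {
      return(res)
    }
    message("Attempt ", i, ": ", nrow(res), " object(s) failed")
    # next attempt only retries the failed rows
    meta <- res
  }
  # still failing after max.tries: hand the remaining rows back
  meta
}

# Mock downloader for illustration: fails on the first row once, then succeeds
attempts <- 0
mock_fetch <- function(meta) {
  attempts <<- attempts + 1
  if (attempts < 2) meta[1, , drop = FALSE] else NULL
}

result <- retry_failed(mock_fetch, data.frame(id = c("a", "b")))
is.null(result)
```

Capping the number of attempts avoids looping forever on datasets that are permanently unavailable.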

Show the returned SeuratObject:

cellxgene.down.seu
## An object of class Seurat 
## 32743 features across 6873 samples within 1 assay 
## Active assay: RNA (32743 features, 0 variable features)
##  3 dimensional reductions calculated: cca, cca.aligned, tsne

Show the returned SeuratObject (cellxgene.census):

cellxgene.down.census
## An object of class Seurat 
## 2 features across 2729 samples within 1 assay 
## Active assay: RNA (2 features, 0 variable features)

The structure of downloaded objects:

tree /Volumes/soyabean/GEfetch2R/download_cellxgene
## /Volumes/soyabean/GEfetch2R/download_cellxgene
## └── Molecular.characterization.of.selectively.vulnerable.neurons.in.Alzheimer.s.Disease..EC.oligodendrocyte.rds
## 
## 1 directory, 1 file
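The file name in the tree above looks like the dataset description with every non-alphanumeric character replaced by a dot. Whether GEfetch2R applies exactly this rule is an assumption; the sketch below only reproduces the pattern so that an expected path can be checked before loading. It uses an ASCII apostrophe in the description and `tempdir()` as a stand-in for the output folder.

```r
# Dataset description (ASCII apostrophe used here for portability)
description <- "Molecular characterization of selectively vulnerable neurons in Alzheimer's Disease: EC oligodendrocyte"

# Assumed naming rule: replace every non-alphanumeric character with "."
file.stem <- gsub("[^A-Za-z0-9]", ".", description)
file.name <- paste0(file.stem, ".rds")
file.name

# Check whether the expected file exists in an output folder
out.folder <- tempdir() # stand-in for the real download folder
file.exists(file.path(out.folder, file.name))
```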

Human Cell Atlas

The Human Cell Atlas aims to map every cell type in the human body. It contains 484 projects, most of which are from Homo sapiens; it also includes projects from Mus musculus, Macaca mulatta and Canis lupus familiaris.

Show available datasets

GEfetch2R provides ShowHCAProjects to extract detailed project metadata, including project title, description, organism, sex, organ/organPart, disease, assay, preservation method, sample type, suspension type, cell type, development stage, etc.

There are 484 unique projects under five different catalogs (dcp29, dcp30, dcp1, lm2, lm3):

all.hca.projects <- ShowHCAProjects()

Show the metadata:

head(all.hca.projects)
##                                                                                             projectTitle
## 1                                                                  1.3 Million Brain Cells from E18 Mice
## 2                            A Cellular Anatomy of the Normal Adult Human Prostate and Prostatic Urethra
## 3                                               A Cellular Atlas of Pitx2-Dependent Cardiac Development.
## 4                              A Human Liver Cell Atlas reveals Heterogeneity and Epithelial Progenitors
## 5                          A Partial Picture of the Single-Cell Transcriptomics of Human IgA Nephropathy
## 6 A Protocol for Revealing Oral Neutrophil Heterogeneity by Single-Cell Immune Profiling in Human Saliva
##                              projectId              projectShortname
## 1 74b6d569-3b11-42ef-b6b1-a0454522b4a0                    1M Neurons
## 2 53c53cd4-8127-4e12-bc7f-8fe1610a715c             ProstateCellAtlas
## 3 7027adc6-c9c9-46f3-84ee-9badc3a4f53b          Pitx2DevelopingHeart
## 4 94e4ee09-9b4b-410a-84dc-a751ad36d0df   LiverCellAtlasHeterogeneity
## 5 c5b475f2-76b3-4a8e-8465-f3b69828fec3      Tang-Human-IgAN-GEXSCOPE
## 6 60ea42e1-af49-42f5-8164-d641fdb696bc PRJNA640427_human_neutrophils
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                projectDescription
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     Cortex, hippocampus, and subventricular zone were purchased from BrainBits (C57EHCV). They were from 2 E18 C57BL/6 mice dissected on the same day, shipped overnight on ice, and stored at 4C until being prepared for scRNA-Seq. Brain tissues were dissociated following the Demonstrated Protocol for Mouse Embryonic Neural Tissue (https://support.10xgenomics.com/single-cell/sample-prep/doc/demonstrated-protocol-dissociation-of-mouse-embryonic-neural-tissue-for-single-cell-rna-sequencing). 69 scRNA-Seq libraries were made from first mouse brain 2 days after the dissection. Another 64 scRNA-Seq libraries were made from second mouse brain 6 days after the dissection.
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    A comprehensive cellular anatomy of normal human prostate is essential for solving the cellular origins of benign prostatic hyperplasia and prostate cancer. The tools used to analyze the contribution of individual cell types are not robust. We provide a cellular atlas of the young adult human prostate and prostatic urethra using an iterative process of single-cell RNA sequencing (scRNA-seq) and flow cytometry on ∼98,000 cells taken from different anatomical regions. Immunohistochemistry with newly derived cell type-specific markers revealed the distribution of each epithelial and stromal cell type on whole mounts, revising our understanding of zonal anatomy. Based on discovered cell surface markers, flow cytometry antibody panels were designed to improve the purification of each cell type, with each gate confirmed by scRNA-seq. The molecular classification, anatomical distribution, and purification tools for each cell type in the human prostate create a powerful resource for experimental design in human prostate disease.
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             Single-cell RNA sequencing was applied to study the role of Pitx2 expression in heart development in mice.  Over 75,000 single cardiac cell transcriptomes between two key developmental timepoints in control and Pitx2-null embryos were amplified and sequenced.
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 We perfomed single-cell RNA-sequnecing of around 10,000 cells from normal human liver tissue to construct a human liver cell atlas. We reveal previously unknown subtypes in different cell type compartments. Overall design: Single cells were isolated from human liver resection specimens and then sorted by FACS into 384 well plates in a unbiased way and on the basis of cell surface markers for distinct cell types. ScRNA-seq was done using the mCelSeq2 protocol.
## 5 Aa comprehensive scRNA-seq analysis of human renal biopsies from IgAN. We showed for the first time that IgAN mesangial cells displayed increased expression of several novel genes including MALAT1, GADD45B, SOX4, and EDIL3, which were related to cell proliferation and matrix accumulation. The overexpressed genes in tubule cells of IgAN were mainly enriched in inflammatory pathways including TNF signaling, IL-17 signaling, and NOD-like receptor signaling. Furthermore, we compared the results of 4 IgAN patients with the published scRNA-Seq data of healthy kidney tissues of three human donors in order to further validate the findings in our study. The results also verified that the overexpressed genes in tubule cells from IgAN patients were mainly enriched in inflammatory pathways including TNF signaling, IL-17 signaling, and NOD-like receptor signaling. The receptor-ligand crosstalk analysis revealed potential interactions between mesangial cells and other cells in IgAN. IgAN patients with overt proteinuria displayed elevated genes participating in several signaling pathways compared with microproteinuria group. It needs to be mentioned that based on number of mesangial cells and other kidney cells analyzed in this study, the results of our study are preliminary and needs to be confirmed on larger number of cells from larger number of patients and controls in future studies. Therefore, these results offer new insight into pathogenesis and identify new therapeutic targets for IgAN.
## 6                                 Neutrophils are the most abundant white blood cells in the human body responsible for fighting viral, bacterial and fungi infections. Out of the 100 billion neutrophils produced daily, it is estimated that 10 % of these cells end up in oral biofluids. Because saliva is a fluid accessible through non-invasive techniques, it is an optimal source of cells and molecule surveillance in health and disease. While neutrophils are abundant in saliva, scientific advancements in neutrophil biology have been hampered likely due to their short life span, inability to divide once terminally differentiated, sensitivity to physical stress, and low RNA content. Here, we devise a protocol aiming to understand neutrophil heterogeneity by improving isolation methods, single-cell RNA extraction, sequencing and bioinformatic pipelines. Advanced flow cytometry 3D analysis, and machine learning validated our gating system model, by including positive neutrophil markers and excluding other immune cells and uncovered neutrophil heterogeneity. Considering specific cell markers, unique mitochondrial content, stringent and less stringent filtering strategies, our transcriptome single cell findings unraveled novel neutrophil subpopulations. Collectively, this methodology accelerates the discovery of salivary immune landscapes, with the promise of improving the understanding of diversification mechanisms, clinical diagnostics in health and disease, and guide targeted therapies.
##                                                                                             publications
## 1                                   Massively parallel digital transcriptional profiling of single cells
## 2                            A Cellular Anatomy of the Normal Adult Human Prostate and Prostatic Urethra
## 3                                               A cellular atlas of Pit2x-dependent cardiac development.
## 4                             A human liver cell atlas reveals heterogeneity and epithelial progenitors.
## 5                          A Partial Picture of the Single-Cell Transcriptomics of Human IgA Nephropathy
## 6 A Protocol for Revealing Oral Neutrophil Heterogeneity by Single-Cell Immune Profiling in Human Saliva
##                                                                                                                                                                                                                                                                                                                  laboratory
## 1                                                                                                                                                                                                                                                                               Human Cell Atlas Data Coordination Platform
## 2                                                                                                                                                                                                                                                                    Human Cell Atlas Data Coordination Platform|Strand Lab
## 3                                                                                                                                                                                                                       Department of Molecular Physiology and Biophysics|Human Cell Atlas|Program in Developmental Biology
## 4                                                                                                                                                                                                                                                                                                                        NA
## 5 Centre for Inflammatory Diseases|Department of Hematology|Department of Medical Records & Information|Department of Nephrology|Department of Nephrology;Centre for Inflammatory Diseases|Department of Organ Transplantation|Department of Pathology|Department of Ultrasound|Human Cell Atlas Data Coordination Platform
## 6                                                                                                                                                                                                                                                                                                          Human Cell Atlas
##                          accessions accessible estimatedCellCount
## 1  GSE93421, SRP096558, PRJNA360949       TRUE            1330000
## 2                         GSE117403       TRUE             108700
## 3 SRP198380, GSE131181, PRJNA542873       TRUE              75000
## 4 SRP174502, GSE124395, PRJNA511895       TRUE              10000
## 5 GSE171314, SRP313266, PRJNA719108       TRUE              20570
## 6            SRP271375, PRJNA640427       TRUE               1145
##       sampleEntityType          organ
## 1            specimens          brain
## 2            specimens prostate gland
## 3            specimens          heart
## 4 organoids, specimens   liver, liver
## 5            specimens         kidney
## 6            specimens    oral cavity
##                                                 organPart
## 1                                                  cortex
## 2 peripheral zone of prostate|transition zone of prostate
## 3                                                      NA
## 4                                                    , NA
## 5                                                      NA
## 6                                                  saliva
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     sampleID
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      E18_20160930_Brain|E18_20161004_Brain
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            D17PrPz|D17PrTz|D27PrPz|D27PrTz|D35PrPz|D35PrTz
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            SAMN11640779_het|SAMN11640779_wt|SAMN11640780|SAMN11640781_het|SAMN11640781_wt|SAMN11640782_het|SAMN11640782_wt|SAMN11640783|SAMN11640784|SAMN11640785_het|SAMN11640785_wt|SAMN11640786_het|SAMN11640786_wt|SAMN11640787|SAMN11640788|SAMN11640789|SAMN11640790
## 4 SAMN10645790|SAMN10645791|SAMN10645843|SAMN10645847|SAMN10645915|SAMN10645916|SAMN10645917|SAMN10645918|SAMN10645919|SAMN10645920|SAMN10645921|SAMN10645922|SAMN10645923|SAMN10645924, SAMN10645729|SAMN10645734|SAMN10645735|SAMN10645736|SAMN10645737|SAMN10645738|SAMN10645739|SAMN10645741|SAMN10645744|SAMN10645745|SAMN10645746|SAMN10645747|SAMN10645748|SAMN10645749|SAMN10645750|SAMN10645751|SAMN10645756|SAMN10645758|SAMN10645761|SAMN10645773|SAMN10645782|SAMN10645810|SAMN10645830|SAMN10645831|SAMN10645832|SAMN10645833|SAMN10645834|SAMN10645835|SAMN10645836|SAMN10645837|SAMN10645838|SAMN10645839|SAMN10645840|SAMN10645848|SAMN10645849|SAMN10645850|SAMN10645851|SAMN10645852|SAMN10645853|SAMN10645854|SAMN10645855|SAMN10645856|SAMN10645857|SAMN10645858|SAMN10645860|SAMN10645861|SAMN10645862|SAMN10645863|SAMN10645864|SAMN10645873|SAMN10645880|SAMN10645881|SAMN10645882|SAMN10645883|SAMN10645884|SAMN10645885|SAMN10645886|SAMN10645888|SAMN10645892|SAMN10645911|SAMN10645935|SAMN10645945|SAMN10645954|SAMN10645955|SAMN10645956|SAMN10645962|SAMN10645963|SAMN10645964|SAMN10645965|SAMN10645966|SAMN10645967|SAMN10645968|SAMN10645969|SAMN10645970|SAMN10645971|SAMN10645972|SAMN10645973|SAMN10645974|SAMN10645977|SAMN10645978|SAMN10645979|SAMN10645980|SAMN10645989|SAMN10646005|SAMN10646007|SAMN10646011|SAMN10646013|SAMN10646019|SAMN10646032|SAMN10646036|SAMN10646037|SAMN10646038|SAMN10646039|SAMN10646040|SAMN10646041|SAMN10646042|SAMN10646043|SAMN10646044|SAMN10646045|SAMN10646046
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   kidney_1_igan|kidney_2_igan|kidney_3_igan|kidney_4_igan|kidney_5_healthy
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               saliva_sample1|saliva_sample2|saliva_sample3
##                             disease preservationMethod donorCount
## 1                            normal              fresh          2
## 2                            normal              fresh          3
## 3                            normal                 NA         17
## 4 , hepatocellular carcinoma|normal               , NA         23
## 5     IgA glomerulonephritis|normal                 NA          5
## 6                            normal                 NA          3
##                    developmentStage genusSpecies biologicalSex
## 1                mouse embryo stage Mus musculus       unknown
## 2                 human adult stage Homo sapiens          male
## 3 Theiler stage 17|Theiler stage 21 Mus musculus       unknown
## 4                 human adult stage Homo sapiens       unknown
## 5                 human adult stage Homo sapiens   female|male
## 6                 human adult stage Homo sapiens          male
##                                                                                                                                                                                                              selectedCellType
## 1                                                                                                                                                                                                                      neuron
## 2 basal cell of prostate epithelium|epithelial cell of prostate|fibroblast of connective tissue of prostate|luminal cell of prostate epithelium|prostate epithelial cell|prostate stromal cell|smooth muscle cell of prostate
## 3                                                                                                                                                                                                                          NA
## 4                                                                                                                                                                                                                          NA
## 5                                                                                                                                                                                                                          NA
## 6                                                                                                                                                                                                                  neutrophil
##   catalog                              entryId
## 1   dcp43 74b6d569-3b11-42ef-b6b1-a0454522b4a0
## 2   dcp43 53c53cd4-8127-4e12-bc7f-8fe1610a715c
## 3   dcp43 7027adc6-c9c9-46f3-84ee-9badc3a4f53b
## 4   dcp43 94e4ee09-9b4b-410a-84dc-a751ad36d0df
## 5   dcp43 c5b475f2-76b3-4a8e-8465-f3b69828fec3
## 6   dcp43 60ea42e1-af49-42f5-8164-d641fdb696bc
##                               sourceId
## 1 53e00b4b-4351-4131-b873-7931f3d4037f
## 2 1d18ed01-dce0-4f9d-8ff0-61a56e63854b
## 3 aae2cbe1-2c21-4abe-a611-71d1c778657d
## 4 c95bf55d-7f4e-47b0-af97-198823f21aaa
## 5 be6cf599-d084-4c5e-a89f-95308dbac4cd
## 6 6abbe02d-b89d-49b2-b639-b45f21d5baa7
##                                                                                                      sourceSpec
## 1 tdr:bigquery:gcp:datarepo-aa6a9210:hca_prod_74b6d5693b1142efb6b1a0454522b4a0__20220117_dcp2_20220307_dcp14:/0
## 2 tdr:bigquery:gcp:datarepo-d13e36e7:hca_prod_53c53cd481274e12bc7f8fe1610a715c__20220117_dcp2_20231002_dcp32:/0
## 3 tdr:bigquery:gcp:datarepo-1840929b:hca_prod_7027adc6c9c946f384ee9badc3a4f53b__20220117_dcp2_20230815_dcp30:/0
## 4 tdr:bigquery:gcp:datarepo-071fb08c:hca_prod_94e4ee099b4b410a84dca751ad36d0df__20220519_dcp2_20220804_dcp19:/0
## 5 tdr:bigquery:gcp:datarepo-b3b1e92f:hca_prod_c5b475f276b34a8e8465f3b69828fec3__20230331_dcp2_20230331_dcp26:/0
## 6 tdr:bigquery:gcp:datarepo-41cca7ce:hca_prod_60ea42e1af4942f58164d641fdb696bc__20220117_dcp2_20230314_dcp25:/0
##                                                               workflow
## 1                                                                     
## 2                    optimus_post_processing_v1.0.0|optimus_v4.2.3, , 
## 3                    optimus_post_processing_v1.0.0|optimus_v4.2.2, , 
## 4 analysis_protocol_normalization|analysis_protocol_quantification, , 
## 5                                                analysis_protocol, , 
## 6                                                                     
##   libraryConstructionApproach nucleicAcidSource instrumentManufacturerModel
## 1                 10x 3' v2,      single cell,        , Illumina HiSeq 4000
## 2       , 10X v2 sequencing,    , single cell,     , , Illumina NextSeq 500
## 3    , 10X 3' v2 sequencing,    , single cell,     , , Illumina NextSeq 500
## 4                , CEL-seq2,    , single cell,      , , Illumina HiSeq 2500
## 5     , GEXSCOPE technology,    , single cell,         , , Illumina HiSeq X
## 6                Smart-seq2,      single cell,      , Illumina NovaSeq 6000
##   pairedEnd cellLineID cellLineType cellLinemodelOrgan
## 1   , FALSE                                           
## 2 , , FALSE                                           
## 3 , , FALSE                                           
## 4 , , FALSE                                           
## 5  , , TRUE                                           
## 6    , TRUE                                           
##                                                                                                                                                                             organoidsID
## 1                                                                                                                                                                                      
## 2                                                                                                                                                                                      
## 3                                                                                                                                                                                      
## 4 SAMN10645790|SAMN10645791|SAMN10645843|SAMN10645847|SAMN10645915|SAMN10645916|SAMN10645917|SAMN10645918|SAMN10645919|SAMN10645920|SAMN10645921|SAMN10645922|SAMN10645923|SAMN10645924
## 5                                                                                                                                                                                      
## 6                                                                                                                                                                                      
##   organoidsmodelOrgan organoidsmodelOrganPart            lastModifiedDate
## 1                                             2021-10-20T09:03:20.617000Z
## 2                                             2022-08-31T14:47:54.785000Z
## 3                                             2022-08-31T14:47:30.838000Z
## 4               liver                      NA 2022-04-21T15:03:35.422000Z
## 5                                             2023-03-20T11:06:31.139000Z
## 6                                             2021-09-30T17:38:34.675000Z

Summary attributes

GEfetch2R provides StatDBAttribute to summarize the attributes of Human Cell Atlas projects:

StatDBAttribute(df = all.hca.projects, filter = c("organism", "sex"), database = "HCA")
## $organism
##                    Value Num      Key
## 1           homo sapiens 464 organism
## 2           mus musculus  56 organism
## 3 canis lupus familiaris   1 organism
## 4         macaca mulatta   1 organism
## 
## $sex
##     Value Num Key
## 1  female 353 sex
## 2    male 348 sex
## 3 unknown 151 sex
## 4   mixed   5 sex
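
The counts above suggest that pipe-separated fields such as `female|male` are tallied as separate, lower-cased values. The following is a minimal base-R sketch of that kind of tally on a toy data frame. The helper `count_values` and the data frame `toy.projects` are hypothetical and are not part of GEfetch2R; StatDBAttribute itself may be implemented differently.

```r
# Toy stand-in for a projects data frame; values are copied from the output above.
toy.projects <- data.frame(
  genusSpecies = c("Mus musculus", "Homo sapiens", "Homo sapiens"),
  biologicalSex = c("unknown", "male", "female|male"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split pipe-separated entries, lower-case them,
# and count the occurrences of each distinct value.
count_values <- function(df, column, key) {
  values <- unlist(strsplit(tolower(df[[column]]), "|", fixed = TRUE))
  values <- trimws(values)
  values <- values[!is.na(values) & values != ""]
  counts <- sort(table(values), decreasing = TRUE)
  data.frame(
    Value = names(counts), Num = as.integer(counts), Key = key,
    stringsAsFactors = FALSE
  )
}

sex.summary <- count_values(toy.projects, column = "biologicalSex", key = "sex")
sex.summary
```

Because one project can list several values, the per-value counts in such a summary can add up to more than the number of projects.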

Extract metadata

GEfetch2R provides ExtractHCAMeta to filter project metadata. The available values of each attribute, except cell number, can be obtained with StatDBAttribute:

# human 10x v2 and v3 datasets
hca.human.10x.projects <- ExtractHCAMeta(
  all.projects.df = all.hca.projects, organism = "Homo sapiens",
  protocol = c("10x 3' v2", "10x 3' v3")
)
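
For intuition, a filter of this kind can be approximated in base R on a toy data frame. The helper `matches_any`, the data frame `toy.projects` and its project names are hypothetical; the sketch only illustrates case-insensitive matching against comma- or pipe-separated fields, and the actual matching rules of ExtractHCAMeta may differ.

```r
# Toy stand-in for the projects data frame; column names follow the output above,
# project names are made up for illustration.
toy.projects <- data.frame(
  projectShortname = c("projA", "projB", "projC"),
  genusSpecies = c("Mus musculus", "Homo sapiens", "Homo sapiens"),
  libraryConstructionApproach = c("10x 3' v2, ", ", CEL-seq2, ", "10x 3' v3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each entry of `field` in which at least one
# comma- or pipe-separated part equals one of the `wanted` values (ignoring case).
matches_any <- function(field, wanted) {
  wanted <- tolower(wanted)
  vapply(strsplit(tolower(field), "[|,]"), function(parts) {
    any(trimws(parts) %in% wanted)
  }, logical(1))
}

# Keep projects that match the organism AND at least one of the protocols.
keep <- matches_any(toy.projects$genusSpecies, "Homo sapiens") &
  matches_any(toy.projects$libraryConstructionApproach, c("10x 3' v2", "10x 3' v3"))
toy.projects[keep, "projectShortname"]
```

In this sketch, values within one attribute are combined with OR, while different attributes are combined with AND.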

Show the metadata:

head(hca.human.10x.projects)
##                                                                                                                 projectTitle
## 1                                                                           A Single-Cell Atlas of the Human Healthy Airways
## 2                                                                    A Single-Cell Transcriptomic Atlas of Human Skin Aging.
## 3                                                A human breast atlas integrating single-cell proteomics and transcriptomics
## 4                                                              A human cell atlas of the pressure-induced hypertrophic heart
## 5                                                               A human embryonic limb cell atlas resolved in space and time
## 6 A human fetal lung cell atlas uncovers proximal-distal gradients of differentiation and key regulators of epithelial fates
##                              projectId                 projectShortname
## 1 ef1e3497-515e-4bbe-8d4c-10161854b699              healthyAirwaysAtlas
## 2 923d3231-7295-4184-b3f6-c3082766a8c7                   AgingSkinAtlas
## 3 9b876d31-0739-4e96-9846-f76e6a427279         breastTranscriptomeAtlas
## 4 902dc043-7091-445c-9442-d72e163b9879 PressureInducedHypertrophicHeart
## 5 b176d756-62d8-4933-83a4-8b026380262f                EmbryonicHindlimb
## 6 2fe3c60b-ac1a-4c61-9b59-f6556c0fce63                  FetalLungImmune
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   projectDescription
## 1 Rationale: The respiratory tract constitutes an elaborated line of defense that is based on a unique cellular ecosystem. Single-cell profiling methods enable the investigation of cell population distributions and transcriptional changes along the airways.Methods: We have explored the cellular heterogeneity of the human airway epithelium in 10 healthy living volunteers by single-cell RNA profiling. 77,969 cells were collected at 35 distinct locations, from the nose to the 12th division of the airway tree.Results: The resulting atlas is composed of a high percentage of epithelial cells (89.1%), but also immune (6.2%) and stromal (4.7%) cells with distinct cellular proportions in different regions of the airways. It reveals differential gene expression between identical cell types (suprabasal, secretory, and multiciliated cells) from the nose (MUC4, PI3, SIX3) and tracheobronchial (SCGB1A1, TFF3) airways. By contrast, cell-type specific gene expression is stable across all tracheobronchial samples. Our atlas improves the description of ionocytes, pulmonary neuro-endocrine (PNEC) and brush cells, and identifies a related population of NREP-positive cells. We also report the association of KRT13 with dividing cells that are reminiscent of previously described mouse “hillock” cells, and with squamous cells expressing SCEL, SPRR1A/B.Conclusions: Robust characterization of a single-cell cohort in healthy airways establishes a valuable resource for future investigations. The precise description of the continuum existing\nfrom the nasal epithelium to successive divisions of the airways and the stable gene expression profile of these regions better defines conditions under which relevant tracheobronchial proxies of human respiratory diseases can be developed.
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       Skin undergoes constant self-renewal, and its functional decline is a visible consequence of aging. Understanding human skin aging requires in-depth knowledge of the molecular and functional properties of various skin cell types. We performed single-cell RNA sequencing of human eyelid skin from healthy individuals across different ages and identified eleven canonical cell types, as well as six subpopulations of basal cells. Further analysis revealed progressive accumulation of photoaging-related changes and increased chronic inflammation with age. Transcriptional factors involved in the developmental process underwent early-onset decline during aging. Furthermore, inhibition of key transcription factors HES1 in fibroblasts and KLF6 in keratinocytes not only compromised cell proliferation, but also increased inflammation and cellular senescence during aging. Lastly, we found that genetic activation of HES1 or pharmacological treatment with quercetin alleviated cellular senescence of dermal fibroblasts. These findings provide a single-cell molecular framework of human skin aging, providing a rich resource for developing therapeutic strategies against aging-related skin disorders.
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      The breast is a dynamic organ whose response to physiological and pathophysiological conditions alters its disease susceptibility, yet the specific effects of these clinical variables on cell state remain poorly annotated. We present a unified, high-resolution breast atlas by integrating single-cell RNA-seq, mass cytometry, and cyclic immunofluorescence, encompassing a myriad of states. We define cell subtypes within the alveolar, hormone-sensing, and basal epithelial lineages, delineating associations of several subtypes with cancer risk factors, including age, parity, and BRCA2 germline mutation. Of particular interest is a subset of alveolar cells termed basal-luminal (BL) cells, which exhibit poor transcriptional lineage fidelity, accumulate with age, and carry a gene signature associated with basal-like breast cancer. We further utilize a medium-depletion approach to identify molecular factors regulating cell-subtype proportion in organoids. Together, these data are a rich resource to elucidate diverse mammary cell states. Overall design: A total of 16 breast samples were assayed (4 samples from reductive mammoplasties and 12 from prophylactic mastectomies).
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               Pathological cardiac hypertrophy is a leading cause of heart failure, but knowledge of the full repertoire of cardiac cells and their gene expression profiles in the human hypertrophic heart is missing. Here, by using large-scale single-nucleus transcriptomics, we present the transcriptional response of human cardiomyocytes to pressure overload caused by aortic valve stenosis and describe major alterations in cardiac cellular crosstalk. Hypertrophied cardiomyocytes had reduced input from endothelial cells and fibroblasts. Genes encoding Eph receptor tyrosine kinases, particularly EPHB1, were significantly downregulated in cardiomyocytes of the hypertrophied heart. Consequently, EPHB1 activation by its ligand ephrin (EFN)B2, which is mainly expressed by endothelial cells, was reduced. EFNB2 inhibited cardiomyocyte hypertrophy in vitro, while silencing its expression in endothelial cells induced hypertrophy in co-cultured cardiomyocytes. Our human cell atlas of the hypertrophied heart highlights the importance of intercellular crosstalk in disease pathogenesis and provides a valuable resource.
## 5                                                                                                                                                                                                        Human limbs emerge during the fourth post-conception week as mesenchymal buds, which develop into fully formed limbs over the subsequent months. This process is orchestrated by numerous temporally and spatially restricted gene expression programmes, making congenital alterations in phenotype common. Decades of work with model organisms have defined the fundamental mechanisms underlying vertebrate limb development, but an in-depth characterization of this process in humans has yet to be performed. Here we detail human embryonic limb development across space and time using single-cell and spatial transcriptomics. We demonstrate extensive diversification of cells from a few multipotent progenitors to myriad differentiated cell states, including several novel cell populations. We uncover two waves of human muscle development, each characterized by different cell states regulated by separate gene expression programmes, and identify musculin (MSC) as a key transcriptional repressor maintaining muscle stem cell identity. Through assembly of multiple anatomically continuous spatial transcriptomic samples using VisiumStitcher, we map cells across a sagittal section of a whole fetal hindlimb. We reveal a clear anatomical segregation between genes linked to brachydactyly and polysyndactyly, and uncover transcriptionally and spatially distinct populations of the mesenchyme in the autopod. Finally, we perform single-cell RNA sequencing on mouse embryonic limbs to facilitate cross-species developmental comparison, finding substantial homology between the two species.
## 6                                                                                                                                                                                                                                                             We present a multiomic cell atlas of human lung development that combines single cell RNA and ATAC sequencing, high throughput spatial transcriptomics and single cell imaging. Coupling single cell methods with spatial analysis has allowed a comprehensive cellular survey of the epithelial, mesenchymal, endothelial and erythrocyte/leukocyte compartments from 5-22 post conception weeks. We identify new cell states in all compartments. These include developmental-specific secretory progenitors that resemble cells in adult fibrotic lungs and a new subtype of neuroendocrine cell related to human small cell lung cancer; observations which strengthen the connections between development and disease/regeneration. Our datasets are available for the community to download and interact with through our web interface ( https://fetal-lung.cellgeni.sanger.ac.uk ). Finally, to illustrate its general utility, we use our cell atlas to generate predictions about cell-cell signalling and transcription factor hierarchies which we test using organoid models. Highlights Spatiotemporal atlas of human lung development from 5-22 post conception weeks identifies 147 cell types/states. Tracking the developmental origins of multiple cell compartments, including new progenitor states. Functional diversity of fibroblasts in distinct anatomical signalling niches. Resource applied to interrogate and experimentally test the transcription factor code controlling neuroendocrine cell heterogeneity and the origins of small cell lung cancer.
##                                                                                                                 publications
## 1                                                                          A Single-Cell Atlas of the Human Healthy Airways.
## 2                                                                    A Single-Cell Transcriptomic Atlas of Human Skin Aging.
## 3                                                A human breast atlas integrating single-cell proteomics and transcriptomics
## 4                                                              A human cell atlas of the pressure-induced hypertrophic heart
## 5                                                               A human embryonic limb cell atlas resolved in space and time
## 6 A human fetal lung cell atlas uncovers proximal-distal gradients of differentiation and key regulators of epithelial fates
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            laboratory
## 1                                                                                                                                                      Memorial Sloan Kettering Cancer Center, New York, New York|Université Côte d'Azur, CNRS, Institut Pharmacologie Moléculaire et Cellulaire, Sophia-Antipolis, France.|Université Côte d'Azur, Centre Hospitalier Universitaire de Nice, Fédération Hospitalo-Universitaire OncoAge, CNRS, Inserm, Institute for Research on Cancer and Aging Nice Team 3, Pulmonology Department, Nice, France.
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  NA
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  NA
## 4                                                                                                                                                                                                                                                                                                                   Cardiopulmonary Institute; Cardiac Metabolism Group, Department of Cardiology|Department of Cardiovascular Surgery|Institute for Cardiovascular Regeneration|Institute for Cardiovascular Regeneration; Cardiopulmonary Institute
## 5 Department of Clinical Neurosciences|Division of Biology and Biological Engineering|INSERM, CNRS, Institut de la Vision|John van Geest Centre for Brain Repair, Department of Clinical Neurosciences; Wellcome-MRC Cambridge Stem Cell Institute|Key Laboratory of Tropical Disease Control of Ministry of Education, Zhongshan School of Medicine, Institute of Human Virology|MRC Human Genetics Unit, IGC, WGH|The Key Laboratory for Stem Cells and Tissue Engineering, Zhongshan School of Medicine|Wellcome-MRC Cambridge Stem Cell Institute
## 6                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  NA
##                                                                                                  accessions
## 1                                                                                           EGAS00001004082
## 2                                                                                                          
## 3                                                                         SRP329970, GSE180878, PRJNA749859
## 4                                                                                              E-MTAB-11268
## 5 ERP119958, ERP129376, ERP160032, E-MTAB-8813, E-MTAB-10514, E-MTAB-10367, PRJEB36736, PRJEB45293, S-SUBS5
## 6                                                    E-MTAB-11265, E-MTAB-11278, E-MTAB-11267, E-MTAB-11266
##   accessible estimatedCellCount     sampleEntityType                 organ
## 1       TRUE              77969            specimens bronchus|nose|trachea
## 2       TRUE              35678            specimens          skin of body
## 3       TRUE              52682            specimens                breast
## 4       TRUE              88536            specimens                 heart
## 5       TRUE             125955            specimens     forelimb|hindlimb
## 6       TRUE                 NA organoids, specimens      lung, heart|lung
##                                                                                                          organPart
## 1                  epithelium of bronchus|epithelium of trachea|inferior nasal concha|terminal bronchus epithelium
## 2                                                                                                               NA
## 3                                                                                                               NA
## 4                                                                            interventricular septum muscular part
## 5 forelimb stylopod|forelimb zeugopod|hindlimb bud|hindlimb stylopod|hindlimb zeugopod|manus|pes|right hindlimb|NA
## 6                                                                              , heart atrium|lung|lung epithelium
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    sampleID
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              D322_Biop_Int1|D322_Biop_Nas1|D322_Biop_Pro1|D326_Biop_Int1|D326_Biop_Pro1|D326_Brus_Dis1|D337_Brus_Dis1|D339_Biop_Int1|D339_Biop_Nas1|D339_Biop_Pro1|D339_Brus_Dis1|D344_Biop_Int1|D344_Biop_Nas1|D344_Biop_Pro1|D344_Brus_Dis1|D353_Biop_Int2|D353_Biop_Pro1|D353_Brus_Dis1|D353_Brus_Nas1|D354_Biop_Int2|D354_Biop_Pro1|D354_Brus_Dis1|D363_Biop_Int2|D363_Biop_Pro1|D363_Brus_Dis1|D363_Brus_Nas1|D367_Biop_Int1|D367_Biop_Pro1|D367_Brus_Dis1|D367_Brus_Nas1|D372_Biop_Int1|D372_Biop_Int2|D372_Biop_Pro1|D372_Brus_Dis1|D372_Brus_Nas1
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 HRS118996|HRS118997|HRS118998|HRS118999|HRS119000|HRS119001|HRS119002|HRS119003|HRS119004
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           SAMN20422990|SAMN20422991|SAMN20422992|SAMN20422993|SAMN20422994|SAMN20422995|SAMN20422996|SAMN20422997|SAMN20422998|SAMN20422999|SAMN20423000|SAMN20423001|SAMN20423002|SAMN20423003|SAMN20423004|SAMN20423005
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          AS1_specimen|AS2_specimen|AS3_specimen|AS4_specimen|AS5_specimen
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         10_e15_mid|11_e15_dist|12_e13|13_e14|1_e13_5|3_e11|4_e12|5_e13|6_e15|7_e10_5|8_e15_whole|9_e15_prox|C1-FSM-0-SC-1|C13-FSM-0-SC-1|C3-FSM-0-SC-1|C34-FSM-0-SC-1|C35-FSM-0-SC-1|C36-FSM--SC-1|C36-FSM--SC-2|C4-FSM-0-SC-1|C42-FLEG-0-FO-1|C5-FSM-0-SC-1|C58-FLEG-0-FO-2-S15-D1-ii|C58-FLEG-0-FO-2-S16-D1-iii|C59-FLEG-0-FO-1-S11-C1-ii|C59-FLEG-0-FO-1-S18-D1-i|C59-FLEG-0-FO-1-S5-C1-i|C68-FARM-1-SC-1|C68-FARM-2-SC-1|C68-FARM-3-SC-1|C68-FLEG-1-SC-1|C68-FLEG-2-SC-1|C68-FLEG-3-SC-1|C70-FLEG-0-FO-2-S10-A1-ii|C70-FLEG-0-FO-2-S11-B1-i|C70-FLEG-0-FO-2-S20-B1-ii|C70-FLEG-0-FO-2-S7-A1-i|C72-FARM-0-SC-1|C72-FARM-0-SC-2|C72-FARM-0-SC-3|C72-FLEG-0-SC-1|C72-FLEG-0-SC-2|C72-FLEG-0-SC-3|C9-FSM-0-SC-1|GSM4227224|GSM4227225|GSM4227226|GSM4227227|GSM4498677|GSM4498678|RFL-D_1|RFL-D_2|RFL-M_1|RFL-M_2|RFL-P_1|RFL-P_2|RHL-D_1|RHL-D_2|RHL-M_1|RHL-M_2|RHL-P_1|RHL-P_2|RHL-P_3
## 6 HDBR-N13393_ASCL1_OE|HDBR-N13393_Ctrl_OE|HDBR-N13393_NEUROD1_OE|HDBR-N13393_NEUROG3_OE|HDBR-N13393_PAX9_OE|HDBR-N13393_RFX6_OE|HDBR-N13393_TFAP2A_OE|HDBR-N13393_deltaNp63alpha_OE|HDBR1915_ASCL1_OE|HDBR1915_Ctrl_OE|HDBR1915_NEUROD1_OE|HDBR1915_NEUROG3_OE|HDBR1915_PAX9_OE|HDBR1915_RFX6_OE|HDBR1915_TFAP2A_OE|HDBR1915_deltaNp63alpha_OE|HDBR2174_ASCL1_OE|HDBR2174_Ctrl_OE|HDBR2174_NEUROD1_OE|HDBR2174_NEUROG3_OE|HDBR2174_PAX9_OE|HDBR2174_RFX6_OE|HDBR2174_TFAP2A_OE|HDBR2174_deltaNp63alpha_OE, 15413-LNG--FO-3_specimen|15415-LNG--FO-2_specimen|15417-LNG-0-FO-4_specimen|15424-LNG-0-FO-3_specimen|15428-LNG-0-FO-2_specimen|15737-FLNG-1(distal)_specimen|15739-FLNG-1(distal)_specimen|15739-FLNG-3(proximal)_specimen|15773-FLNG-3(proximal)_specimen|5478STDY7698210specimen|5698STDY7839908specimen|5698STDY7839910specimen|5698STDY7839918specimen|5891STDY8062349specimen|5891STDY8062350specimen|5891STDY8062351specimen|5891STDY8062352specimen|5891STDY8062353specimen|5891STDY8062354specimen|5891STDY8062355specimen|5891STDY8062356specimen|5891STDY9030806_8specimen|5891STDY9030807_9specimen|Hst4-LNG--FO-1_specimen|Hst5-LNG--FO-1_specimen|Hst7-LNG--FO-2_specimen|SIGAA10specimen|SIGAB10specimen|SIGAC10specimen|SIGAE6specimen|SIGAF6specimen|SIGAG12specimen|SIGAG6specimen|SIGAH10specimen|SIGAH12specimen|SIGAH4specimen|WSSS8011222specimen|WSSS8012016specimen|WSSS_F_LNG8713176specimen|WSSS_F_LNG8713177specimen|WSSS_F_LNG8713178specimen_15168proximal|WSSS_F_LNG8713179specimen_15168distal|WSSS_F_LNG8713184specimen_15233proximal|WSSS_F_LNG8713185specimen_15233distal|WSSS_F_LNG8713186specimen_proximal|WSSS_F_LNG8713187specimen_distal
##                 disease         preservationMethod donorCount
## 1                normal                         NA         10
## 2                normal                         NA          9
## 3                normal                         NA         16
## 4 aortic valve stenosis    cryopreservation, other          5
## 5                normal cryopreservation, other|NA         36
## 6              , normal                       , NA         38
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              developmentStage
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           human adult stage
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          adolescent stage|human adult stage
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           human adult stage
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           human adult stage
## 5                                                                                                                                                                                                                                                             10th week post-fertilization human stage|9th week post-fertilization human stage|Carnegie stage 21|Carnegie stage 22|Carnegie stage 23|embryo stage|embryonic day 10.5|embryonic day 11.5|embryonic day 12|embryonic day 12.5|embryonic day 13|embryonic day 13.5|embryonic day 14.5|embryonic day 15.5|embryonic day 16.5|embryonic day 18.5|embryonic day 9.5
## 6 11th week post-fertilization human stage|12th week post-fertilization human stage|13th week post-fertilization human stage|14th week post-fertilization human stage|15th week post-fertilization human stage|16th week post-fertilization human stage|17th week post-fertilization human stage|18th week post-fertilization human stage|19th week post-fertilization human stage|20th week post-fertilization human stage|21st week post-fertilization human stage|22nd week post-fertilization human stage|9th week post-fertilization human stage|Carnegie stage 14|Carnegie stage 17|Carnegie stage 22|Carnegie stage 23
##                genusSpecies       biologicalSex selectedCellType catalog
## 1              Homo sapiens         female|male       NA, NA, NA   dcp43
## 2              Homo sapiens              female               NA   dcp43
## 3              Homo sapiens              female               NA   dcp43
## 4              Homo sapiens         female|male               NA   dcp43
## 5 Homo sapiens|Mus musculus female|male|unknown           NA, NA   dcp43
## 6              Homo sapiens female|male|unknown           NA, NA   dcp43
##                                entryId                             sourceId
## 1 ef1e3497-515e-4bbe-8d4c-10161854b699 949febcd-69f0-4c61-aaf8-52252d0ae5cd
## 2 923d3231-7295-4184-b3f6-c3082766a8c7 445d8c04-718b-4367-9616-38d106e4431b
## 3 9b876d31-0739-4e96-9846-f76e6a427279 d5994701-3915-4357-a3f7-d1b3d8ebb0ee
## 4 902dc043-7091-445c-9442-d72e163b9879 52a17083-f001-4e0f-b207-6b1d8fa32579
## 5 b176d756-62d8-4933-83a4-8b026380262f 124b11a6-66a9-4e3e-b14a-255d21521b53
## 6 2fe3c60b-ac1a-4c61-9b59-f6556c0fce63 f52e9f35-cbbb-43f9-bb87-94f9d2788bf6
##                                                                                                      sourceSpec
## 1 tdr:bigquery:gcp:datarepo-145a904d:hca_prod_ef1e3497515e4bbe8d4c10161854b699__20220118_dcp2_20231213_dcp34:/0
## 2 tdr:bigquery:gcp:datarepo-64e86c6c:hca_prod_923d323172954184b3f6c3082766a8c7__20220906_dcp2_20230314_dcp25:/0
## 3 tdr:bigquery:gcp:datarepo-9f5be9ac:hca_prod_9b876d3107394e969846f76e6a427279__20220906_dcp2_20230314_dcp25:/0
## 4 tdr:bigquery:gcp:datarepo-f0498b78:hca_prod_902dc0437091445c9442d72e163b9879__20240201_dcp2_20240328_dcp37:/0
## 5 tdr:bigquery:gcp:datarepo-582bf509:hca_prod_b176d75662d8493383a48b026380262f__20240903_dcp2_20240904_dcp42:/0
## 6 tdr:bigquery:gcp:datarepo-573f4ced:hca_prod_2fe3c60bac1a4c619b59f6556c0fce63__20220606_dcp2_20230314_dcp25:/0
##                                                                             workflow
## 1 ap_matrixNormalization|ap_rawMatrixGeneration_hg19|ap_rawMatrixGeneration_hg38, , 
## 2                                                                                   
## 3                                        analysis_protocol_1|analysis_protocol_2, , 
## 4                                                          raw_matrix_generation, , 
## 5                                               scRNA_analysis|visium_analysis, , , 
## 6                           processed_matrix_ap|raw_matrix_ap|visium_matrix_ap, , , 
##                                                     libraryConstructionApproach
## 1                                                                 , 10x 3' v2, 
## 2                                                                   10x 3' v2, 
## 3                                                                 , 10x 3' v2, 
## 4                                                                 , 10x 3' v3, 
## 5            , , 10x 3' v2|10x 3' v3|10x 5' v1|Visium Spatial Gene Expression, 
## 6 , , 10x 3' v2|10x 3' v3|10x 5' v1|Visium Spatial Gene Expression|scATAC-seq, 
##             nucleicAcidSource                     instrumentManufacturerModel
## 1             , single cell,    , , Illumina NextSeq 500|Illumina NextSeq 550
## 2               single cell,                          , Illumina NovaSeq 6000
## 3             , single cell,                             , , Illumina HiSeq X
## 4          , single nucleus,                        , , Illumina NovaSeq 6000
## 5 , , bulk cell|single cell,  , , , Illumina HiSeq 4000|Illumina NovaSeq 6000
## 6           , , single cell,  , , , Illumina HiSeq 4000|Illumina NovaSeq 6000
##     pairedEnd cellLineID cellLineType cellLinemodelOrgan
## 1   , , FALSE                                           
## 2     , FALSE                                           
## 3   , , FALSE                                           
## 4   , , FALSE                                           
## 5 , , , FALSE                                           
## 6 , , , FALSE                                           
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         organoidsID
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
## 6 HDBR-N13393_ASCL1_OE|HDBR-N13393_Ctrl_OE|HDBR-N13393_NEUROD1_OE|HDBR-N13393_NEUROG3_OE|HDBR-N13393_PAX9_OE|HDBR-N13393_RFX6_OE|HDBR-N13393_TFAP2A_OE|HDBR-N13393_deltaNp63alpha_OE|HDBR-N13393_organoid|HDBR1915_ASCL1_OE|HDBR1915_Ctrl_OE|HDBR1915_NEUROD1_OE|HDBR1915_NEUROG3_OE|HDBR1915_PAX9_OE|HDBR1915_RFX6_OE|HDBR1915_TFAP2A_OE|HDBR1915_deltaNp63alpha_OE|HDBR1915_organoid|HDBR2174_ASCL1_OE|HDBR2174_Ctrl_OE|HDBR2174_NEUROD1_OE|HDBR2174_NEUROG3_OE|HDBR2174_PAX9_OE|HDBR2174_RFX6_OE|HDBR2174_TFAP2A_OE|HDBR2174_deltaNp63alpha_OE|HDBR2174_organoid
##   organoidsmodelOrgan organoidsmodelOrganPart            lastModifiedDate
## 1                                             2023-11-02T12:30:26.777000Z
## 2                                             2022-08-08T11:11:06.854000Z
## 3                                             2022-08-23T12:48:58.239000Z
## 4                                             2024-03-06T13:09:41.289000Z
## 5                                             2024-07-03T12:56:39.509000Z
## 6                lung epithelial cell of lung 2022-05-20T09:34:37.057000Z
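
The printed metadata is wide, and several fields (for example organ and sampleID) pack multiple values into one string separated by "|". A minimal sketch of splitting such a field and filtering on it with base R is given here. The toy data frame only copies three columns from the first rows shown above, and the names meta.toy, organ.list and has.trachea are hypothetical; other separators that appear in the output (such as ", ") are not handled.

```r
# Toy data frame copying three columns from the first rows printed above
meta.toy <- data.frame(
  organ = c("bronchus|nose|trachea", "skin of body", "breast"),
  estimatedCellCount = c(77969, 35678, 52682),
  donorCount = c(10, 9, 16),
  stringsAsFactors = FALSE
)

# Split the "|"-separated organ strings into one character vector per row
organ.list <- strsplit(meta.toy$organ, split = "|", fixed = TRUE)

# Flag rows whose organ field contains "trachea" as one of its values
has.trachea <- vapply(organ.list, function(x) "trachea" %in% x, logical(1))

# Show only the matching rows
meta.toy[has.trachea, c("organ", "estimatedCellCount", "donorCount")]
```

The same pattern could be applied to a real metadata data frame to narrow down the projects before downloading.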

Download object and load to R

After manually checking the extracted metadata, users can download the specified objects with ParseHCA. The file types to download are controlled by file.ext (choose from "rds", "rdata", "h5", "h5ad" and "loom").

The returned value is either a dataframe containing the failed objects or a SeuratObject (if file.ext is rds). If a dataframe is returned, users can re-run ParseHCA with meta set to the returned value.

# download objects
hca.down <- ParseHCA(
  meta = hca.human.10x.projects[1:4, ], out.folder = "/Volumes/soyabean/GEfetch2R/download_hca",
  file.ext = c("h5ad", "rds")
)

# return SeuratObject
hca.down.seu <- ParseHCA(
  meta = hca.human.10x.projects[1:4, ], out.folder = "/Volumes/soyabean/GEfetch2R/download_hca",
  file.ext = c("h5ad", "rds"), return.seu = TRUE
)

Show the returned SeuratObject:

hca.down.seu
## An object of class Seurat 
## 33555 features across 88536 samples within 1 assay 
## Active assay: RNA (33555 features, 2000 variable features)
##  3 dimensional reductions calculated: pca, harmony, umap

The example structure of downloaded objects:

tree /Volumes/soyabean/GEfetch2R/download_hca
## /Volumes/soyabean/GEfetch2R/download_hca
## └── seurat_object_hca_as_harmonized_AS_SP_nuc_refined_cells.rds
## 
## 1 directory, 1 file
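
Where tree is not available, the download folder can also be inspected from R. The sketch below uses only base R and the bundled tools package; the temporary folder and the placeholder file example_object.rds are stand-ins created for illustration, not real downloads.

```r
# Stand-in for out.folder: a temporary directory with one empty placeholder file
out.folder <- file.path(tempdir(), "download_hca_demo")
dir.create(out.folder, showWarnings = FALSE)
file.create(file.path(out.folder, "example_object.rds"))

# Build a regular expression matching the requested file.ext values
wanted.ext <- c("h5ad", "rds")
pattern <- paste0("\\.(", paste(wanted.ext, collapse = "|"), ")$")

# List matching files, including those in sub-folders
found <- list.files(out.folder, pattern = pattern, ignore.case = TRUE, recursive = TRUE)

# Count the files found per extension
table(tolower(tools::file_ext(found)))
```

A real .rds file found this way can then be loaded with readRDS().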