Researchers can access CRDC-hosted cancer research datasets either through CRDC repositories or through CRDC Cloud Resources.
Repositories / Key Datasets | Description | Link |
---|---|---|
Genomic Data Commons (GDC): TCGA, TARGET, CPTAC, FMI, CCLE, APOLLO | The GDC houses both open and controlled access datasets. The list of open and controlled access datasets can be viewed through the GDC portal. Access to controlled GDC datasets requires an NIH review process. | |
Proteomic Data Commons (PDC): CPTAC, ICPC, APOLLO | All PDC data are accessible to the public as open access datasets. | |
Imaging Data Commons (IDC): TCGA, CPTAC, and HTAN. The IDC pulls data from these datasets through the Imaging Data Archive and other NCI-approved imaging repositories. | Only de-identified, publicly available data are available through the IDC. | |
Integrated Canine Data Commons (ICDC): COTC007B, GLIOMA01, NCATS-COP-01 | ICDC data are accessible to the public as open access datasets. | |
Clinical Trials Data Commons (CTDC): ECOG-ACRIN and NCI-MATCH | CTDC datasets are available under restricted access. | |
Cancer Data Service (CDS): HTAN, CCDI, PLCO | The CDS hosts controlled and open access data. Controlled access data require approval through the NCI Data Access Committee (DAC) process. | |

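The access levels listed for the repositories above can be captured in a small lookup, which may help when deciding which sources can be used without an approval step. The following is a minimal sketch, not an official CRDC interface: the `Repository` class, the tier labels, and the helper functions are hypothetical names introduced for illustration, and treating the CTDC's "restricted access" as a non-open tier is an assumption based on the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repository:
    """Hypothetical record for one CRDC repository (illustration only)."""
    name: str
    abbreviation: str
    access: frozenset  # tiers taken from the descriptions above


# Tiers transcribed from the repository descriptions; "restricted" is the
# wording used for the CTDC and is kept as its own label (assumption).
REPOSITORIES = [
    Repository("Genomic Data Commons", "GDC", frozenset({"open", "controlled"})),
    Repository("Proteomic Data Commons", "PDC", frozenset({"open"})),
    Repository("Imaging Data Commons", "IDC", frozenset({"open"})),
    Repository("Integrated Canine Data Commons", "ICDC", frozenset({"open"})),
    Repository("Clinical Trials Data Commons", "CTDC", frozenset({"restricted"})),
    Repository("Cancer Data Service", "CDS", frozenset({"open", "controlled"})),
]


def fully_open(repos):
    """Return abbreviations of repositories that host only open access data."""
    return [r.abbreviation for r in repos if r.access == frozenset({"open"})]


def needs_approval(repos):
    """Return abbreviations of repositories with at least one non-open tier."""
    return [r.abbreviation for r in repos if r.access - {"open"}]


if __name__ == "__main__":
    print("Open access only:", fully_open(REPOSITORIES))
    print("Some data need approval:", needs_approval(REPOSITORIES))
```

Under these assumptions, the PDC, IDC, and ICDC fall in the open-only group, while the GDC, CTDC, and CDS each hold at least some data that require an access request.
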
The following cloud resources make datasets accessible for analysis in secure workspaces, and facilitate uploading of researchers’ own data for aggregated/federated analysis.
Cloud Resources / Key Datasets | Description | Link |
---|---|---|
Seven Bridges Cancer Genomics Cloud (SB-CGC): Includes data housed in several CRDC repositories, plus other NCI datasets and datasets from international or US sources not directly affiliated with the NCI. | Access to datasets follows guidance outlined with each dataset and/or repository. | |
ISB Cancer Gateway in the Cloud (ISB-CGC): Datasets include TARGET, TCGA, and CPTAC. ISB also hosts data from specialty databases such as the TP53 and Mitelman databases. | Access to datasets follows guidance outlined with each dataset and/or repository. | ISB-CGC |
Broad Institute’s FireCloud Powered by Terra: Datasets include TARGET and TCGA. | Access to datasets follows guidance outlined with each dataset and/or repository. | Broad Institute’s FireCloud |

Other Data Access Portals in Development
cBioPortal: cBioPortal is an open-access, open-source resource for exploratory and interactive visualization, analysis, and download of large-scale cancer genomics datasets. It is hosted by the Center for Molecular Oncology at Memorial Sloan Kettering Cancer Center (MSK). The CRDC-specific instance of cBioPortal integrates all public genomic datasets found within the CRDC. Learn more about the cBioPortal.
Galaxy: Galaxy is an open-source, web-based platform for computational biomedical research. It allows users without programming experience to specify parameters and run individual tools as well as larger workflows. This interactive analysis tool is being integrated into the CGC Data Studio by Seven Bridges.
For the full list of NCI datasets, go to the NCI data catalog.
More information on accessing open and controlled-access data can be found on the NCI Office of Data Sharing website.