Cancer Data Service (CDS)
The Cancer Data Service (CDS) is a data repository under the Cancer Research Data Commons (CRDC) infrastructure for storing cancer research data generated by NCI-funded programs. The CDS provides secure, authorized cloud storage and data sharing for studies that fall under any of the categories below:
- Studies with data that do not fit the current data type criteria for submission to a CRDC Data Commons, and/or do not meet its minimum metadata standards
- Studies with data that do not have a Data Commons set up for the data type
- Studies that are on a waiting list for storage on a specific CRDC Data Commons, such as the Genomic Data Commons
The CDS system is hosted on NCI's CloudOne Amazon instance. The clinical data, biospecimen data, and derived mutation files for these projects are stored in the Database of Genotypes and Phenotypes (dbGaP), provided by the National Center for Biotechnology Information (NCBI).
The CDS contains mostly genomic data but can accommodate multiple data types based on the accepted studies.
To help researchers analyze data in the cloud, CBIIT established three Cloud Resources that provide data access through a web-based user interface, programmatic access to analytic tools and workflows, and the capability of sharing results with collaborators. The Seven Bridges Cancer Genomics Cloud (SB-CGC), one of these NCI Cloud Resources, can be used as a primary resource for analyzing data. ISB-CGC and Broad FireCloud are the other two Cloud Resources, which could be used for analysis in the future. SB-CGC is built on Amazon Web Services (AWS), while ISB-CGC and Broad FireCloud are Google Cloud based.
The CDS hosts both controlled access and open access data. Access to controlled access data on the CDS is granted through whitelists that are approved by the NCI DAC and compiled by dbGaP. No authorization is required for open access studies, although the CDS does not currently contain any open access datasets.
Users can access the data for analysis through the Seven Bridges Cancer Genomics Cloud (SB-CGC), one of the NCI-funded Cloud Resources for compute-intensive analysis.
Links are provided below for the studies released on dbGaP; data access for these studies is available on SB-CGC:
- Detection of Colorectal Cancer Susceptibility Loci Using Genome-Wide Sequencing (phs001554.v1.p1)
- Development of A Tumor Molecular Analyses Program and Its Use to Support Treatment Decisions (phs001713.v1.p1)
- Pediatric Preclinical Testing Consortium (phs001437.v1.p1)
- The Genetic Basis of Aggressive Prostate Cancer, The Role of Rare Variation (phs001524.v1.p1)
- Discovery of colorectal cancer susceptibility genes in high-risk families (phs001787.v1.p1)
- Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (phs002011.v1.p1)
- Whole genome sequencing to discover familial myeloma risk genes (phs001819.v2.p1)
- Feasibility and Clinical Utility of Whole Genome Profiling in Pediatric and Young Adult Cancers (phs002620.v1.p1)
- Lung Cancer Genetic Study among Asian Never Smokers (phs002366.v1.p1)
- NCI CCSG CCDI Supplement Additional Genomic Submission (phs002599.v1.p1)
- Molecular Pathological Epidemiology of Colorectal Cancer (phs002050.v2.p1)
- Human Tumor Atlas Network (phs002371.v1.p1)
- NCI UCSF CCDI Database for the Advancement of JMML - Integration of Metadata with "Omic" Data (phs002504.v1.p1)
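Each study above is identified by a dbGaP accession of the form phsNNNNNN.vN.pN, where the numbered parts are the study identifier, the study version, and the participant set. When working with these identifiers programmatically, it can help to split them into their parts. The following is a minimal sketch in Python; the names `StudyAccession` and `parse_accession` are hypothetical helpers introduced here for illustration and are not part of any CDS, dbGaP, or SB-CGC API.

```python
import re
from typing import NamedTuple

# Pattern for a dbGaP study accession such as "phs001819.v2.p1":
# "phs" + six digits, then ".v<version>" and ".p<participant set>".
ACCESSION_RE = re.compile(r"^phs(\d{6})\.v(\d+)\.p(\d+)$")


class StudyAccession(NamedTuple):
    """Hypothetical container for the parts of a study accession."""
    study_id: str          # e.g. "phs001819"
    version: int           # e.g. 2
    participant_set: int   # e.g. 1


def parse_accession(accession: str) -> StudyAccession:
    """Split an accession string into study ID, version, and participant set."""
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        raise ValueError(f"not a dbGaP study accession: {accession!r}")
    number, version, participant_set = match.groups()
    return StudyAccession(f"phs{number}", int(version), int(participant_set))


if __name__ == "__main__":
    # Two accessions taken from the study list above.
    for text in ("phs001554.v1.p1", "phs001819.v2.p1"):
        print(parse_accession(text))
```

Keeping the version as a separate field makes it straightforward to notice when a study in the list, such as phs001819 or phs002050, is at a later version than v1.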
CDS has data on the following tumors: