Data

 

Key Datasets


The NCI CRDC provides access to a variety of open and controlled datasets from NCI programs and key external cancer programs. Key datasets include: 

| Dataset Name | Description | Available Resources |
| --- | --- | --- |
| The Cancer Genome Atlas (TCGA) | A collaboration between NCI and the National Human Genome Research Institute (NHGRI) that has characterized tumor and normal tissues from 11,000 patients, covering 33 cancer types. | GDC, Broad, SB, ISB, IDC* |
| Therapeutically Applicable Research to Generate Effective Treatments (TARGET) | A consortium of extramural and NCI investigators working to characterize and understand hard-to-treat childhood cancers and translate findings into the clinic. | GDC, Broad, SB, ISB |
| Clinical Proteomic Tumor Analysis Consortium (CPTAC) | A national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics. | GDC, PDC, Broad, ISB, SB, IDC* |
| Human Cancer Models Initiative (HCMI) | An international consortium that is generating novel, next-generation, tumor-derived culture models complete with genomic and clinical data. | GDC, SB |
| Cancer Genome Characterization Initiatives (CGCI) | An initiative examining genomes, exomes, and transcriptomes of various types of adult and pediatric cancers. | GDC, SB |
| Foundation Medicine (FM) | Targeted sequencing data from ~18,000 adult patients, generated by Foundation Medicine Inc., a molecular information company seeking to match patients with personalized treatment plans. | GDC, SB |
| Multiple Myeloma Research Foundation (MMRF) | Data from nearly 1,000 patients with extensive molecular and clinical data, including longitudinal information collected over the course of disease for many patients. | GDC, SB |
| Genomics Evidence Neoplasia Information Exchange (GENIE) | Over 44,000 cases from an international pan-cancer registry that the American Association for Cancer Research (AACR) initiative continues to collect. | GDC, SB |
| International Cancer Proteogenomic Consortium (ICPC) | An international consortium that brings together more than a dozen countries to study the application of proteogenomic analysis in predicting cancer treatment success and to share data and results with researchers worldwide, hastening progress for patients. | PDC, SB |
| Children's Brain Tumor Tissue Consortium (CBTTC) | A collaborative research consortium focused on identifying therapies for children with brain tumors. | PDC, SB |
| Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO) | A research collaboration to detect colorectal cancer susceptibility loci using genome-wide sequencing. | CDS, SB |
| Comparative molecular life history of spontaneous canine and human gliomas (GLIOMA01) | Characterization of the genomic and transcriptomic landscape of canine glioma to enable cross-species comparative genomic analysis of sporadic glioma. | ICDC*, SB |
* coming soon
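
The "Available Resources" column can also be read as a lookup from dataset to hosting resource. The following is a minimal Python sketch of that lookup, not part of any CRDC tool: the abbreviations are copied from the table above, while the names `AVAILABILITY` and `datasets_in` are hypothetical and chosen only for illustration. Entries marked with `*` are treated as "coming soon" and excluded unless requested.

```python
# Dataset -> resources, copied from the table above.
# A trailing "*" means the resource is listed as "coming soon".
AVAILABILITY = {
    "TCGA": ["GDC", "Broad", "SB", "ISB", "IDC*"],
    "TARGET": ["GDC", "Broad", "SB", "ISB"],
    "CPTAC": ["GDC", "PDC", "Broad", "ISB", "SB", "IDC*"],
    "HCMI": ["GDC", "SB"],
    "CGCI": ["GDC", "SB"],
    "FM": ["GDC", "SB"],
    "MMRF": ["GDC", "SB"],
    "GENIE": ["GDC", "SB"],
    "ICPC": ["PDC", "SB"],
    "CBTTC": ["PDC", "SB"],
    "GECCO": ["CDS", "SB"],
    "GLIOMA01": ["ICDC*", "SB"],
}


def datasets_in(resource, include_coming_soon=False):
    """Return the dataset abbreviations listed for `resource`, in table order.

    Hypothetical helper for illustration only.
    """
    found = []
    for dataset, resources in AVAILABILITY.items():
        for entry in resources:
            name = entry.rstrip("*")
            coming_soon = entry.endswith("*")
            if name == resource and (include_coming_soon or not coming_soon):
                found.append(dataset)
                break
    return found


if __name__ == "__main__":
    print(datasets_in("PDC"))
    print(datasets_in("IDC", include_coming_soon=True))
```

Under these assumptions, `datasets_in("PDC")` lists CPTAC, ICPC, and CBTTC, and `datasets_in("IDC")` is empty until the "coming soon" entries are included.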

 

Through the NCI Cloud Resources, users can combine their own data with the existing datasets to perform novel analyses.

 

How to Submit Data


The NCI CRDC provides data repositories to store diverse data types.  Currently, data submission requests are made by completing an application at each target repository’s website or by contacting the helpdesk.  For more information, please visit: https://datascience.cancer.gov/data-sharing/submitting-data

 

How to Access Data


The mission of the NCI CRDC is to enhance data sharing by making research data Findable, Accessible, Interoperable, and Reusable (FAIR) for the broad cancer research community.  More information on accessing open and controlled access data can be found on the NCI Office of Data Sharing website: https://datascience.cancer.gov/data-sharing/accessing-data 

 

CRDC Data Use Policy Statement


The Cancer Research Data Commons (CRDC) data sharing and use expectations are consistent with applicable international, national, tribal, and state laws and regulations, and with relevant institutional policies for data submission, access, and sharing that enable broad data access across the CRDC ecosystem. Data available in the CRDC ecosystem are subject to both general and dataset-specific data use policies, and access to controlled data is restricted to authorized users. Users and their institutions are responsible for understanding the terms of use and adhering to Data Use Agreement(s) and Institutional Review Board policies and guidelines. When uploading data to or downloading data from the CRDC ecosystem, users must ensure that research uses of the data are consistent with the informed consent of the study participants from whom the data were obtained. Visit the NCI Office of Data Sharing website for more information.