Connecting Data to Accelerate Cancer Research

The NCI Cancer Research Data Commons (CRDC) is a cloud-based data science infrastructure that provides secure access to a large, comprehensive, and expanding collection of cancer research data. Users can explore and use analytical and visualization tools for data analysis in the cloud.

85,937 Participants | 67 Anatomical Sites | 109 Studies | 641,980 Files | 1.45 PB Data
Integrated Canine Data Commons

Comparative oncology is the study of naturally developing cancers in animals as a model for human disease. The Integrated Canine Data Commons (ICDC) provides access to canine clinical trial data to drive insight into human cancer research.

Latest Blog
How Can Big Data Help to Address Health Disparities?

Dr. Tony Kerlavage, director of NCI’s Center for Biomedical Informatics and Information Technology (CBIIT), sat down to discuss health disparities, one key component of racial inequality, as they relate to Big Data. As Dr. Kerlavage noted, representing our diverse U.S. population in research and in the workforce is key, but we also need better data.



Store and share NCI-funded data that are not hosted elsewhere to further advance scientific discovery across a broad range of research areas.

Store and share data from NCI Clinical Trials. The resource is expected to launch in 2020.

Share, analyze, and visualize harmonized genomic data, including TCGA, TARGET, and CPTAC.

Share, analyze, and visualize multi-modal imaging data from both clinical and basic cancer research studies.

Share data from canine clinical trials, including the PRE-medical Cancer Immunotherapy Network Canine Trials (PRECINCT) and the Comparative Oncology Program.

Share, analyze, and visualize proteomic data, such as data from CPTAC and the International Cancer Proteogenome Consortium (ICPC).


Enables users to query and connect data distributed across the CRDC for integrative analysis.

Provides semantic services and tools that facilitate interoperability of data across CRDC.

Provides secure user authentication and authorization and permanent digital object identifiers for data objects.

Analytical Resources

Access the elastic compute capacity of Google Cloud Platform to perform large-scale multi-omics analyses.

Access data sets using fully interactive web-based applications, including BigQuery, which is hosted on Google Cloud Platform.

Explore and analyze large datasets using secure, scalable analytical resources for computational research.


CRDC is built for researchers

  • Enable the cancer research community to share diverse data types
  • Provide secure access to data
  • Facilitate the generation of innovative tools
  • Adhere to FAIR principles of data stewardship: Findable, Accessible, Interoperable, and Reusable