Cancer Research Data Commons

 

The goal of the National Cancer Institute’s Cancer Research Data Commons (CRDC) is to empower researchers to accelerate data-driven scientific discovery by connecting diverse datasets with analytical tools in the cloud. The CRDC is built on an expandable data science infrastructure that provides secure access to many different data types across scientific domains via the Data Commons Framework. Users can search and aggregate data across repositories via the Cancer Data Aggregator, which uses a common data model developed by the Center for Cancer Data Harmonization. The ability to combine diverse data types and perform cross-domain analysis of large cancer datasets can lead to new discoveries in cancer prevention, treatment, and diagnosis, further supporting the goals of precision medicine and the Cancer Moonshot℠. The CRDC will encompass and connect multiple cloud-based data repositories and serve as a central location to support public data sharing for NCI-funded programs.

 

 

[Image: NCI Cancer Research Data Commons Infographic for Repositories, Tools, and Infrastructure]

 

CRDC Features


 

Access Diverse Data Types

Datasets include:

  • The Cancer Genome Atlas (TCGA)
  • Therapeutically Applicable Research to Generate Effective Treatments (TARGET)
  • The Clinical Proteomic Tumor Analysis Consortium (CPTAC)

Both open-access and controlled-access data are available through NCI Cloud Resources and repositories.

 

Analyze Datasets Using Innovative Tools

CRDC offers a wide range of analytical and visualization tools through NCI Cloud Resources. Users can bring their own tools and data to the cloud, and can also access more than 1,000 existing tools and workflows.

 

Share Your Data with the Community

To ensure broad, equitable data sharing for the research and participant (patient and advocate) communities, NCI’s Office of Data Sharing is establishing a comprehensive data sharing vision and strategy.

Learn more at datasharing.cancer.gov.

 

CRDC Goals


  • Enable the cancer research community to share diverse data types across programs and institutions
  • Provide secure access to data
  • Facilitate the generation of innovative tools
  • Help NCI-funded Data Coordinating Centers sustain and share data publicly
  • Build in an open and modular way to make components extendable and reusable
  • Adhere to FAIR principles of data stewardship: Findable, Accessible, Interoperable, and Reusable
