CRDC Insights

Updates from the Cancer Research Data Commons:
Empowering the Scientific Community to Make New Discoveries

Apr 25, 2024

CPTAC's Pan-Cancer Multi-omic Papers: Data Accessible Through CRDC

Image depicting the CPTAC Consortium

Researchers with NCI’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) have produced a resource of global proteome and post-translational modification data, whole genome and whole exome sequencing, miRNA and total RNA sequencing, DNA methylation, imaging, and clinical information for more than 1,000 cancer patients across 10 tumor types. A description of the effort to harmonize and disseminate this resource, along with four subsequent studies that use it, was published online by Cell Press in mid-August 2023. These analyses, which probe post-translational modifications (PTMs), oncogenic drivers, DNA methylation, and histopathology images, are the first of what are expected to be many to leverage this rich, now publicly available resource.

Read more on the NCI Division of Cancer Treatment and Diagnosis website.

Data from CPTAC research is typically found across several CRDC data commons, including the Proteomic Data Commons (PDC), the Genomic Data Commons (GDC), the Imaging Data Commons (IDC), and the Cancer Data Service (CDS). Data from this recent series of papers can currently be found on the PDC. Researchers can explore the data through the PDC and take that data to one of the CRDC cloud resources for further analysis.

CPTAC members working on this pan-cancer multi-omic research created multiple comparative workflows while applying various analytical strategies to these data. Those workflows are described in the lead-off paper, "Proteogenomic Data and Resources for Pan-Cancer Analysis," Cancer Cell, August 14, 2023.

CPTAC has placed the processed and curated data files from this paper in the CDS. The CPTAC data stored in the CDS includes all the harmonized proteogenomic data for the pan-cancer analyses: mutation calls, RNA and protein quantification tables, clinical and demographic data, and derived molecular data such as HLA typing, immune cell decomposition, and ancestry prediction. The CPTAC pan-cancer data hosted in CDS is controlled-access data. Access to controlled-access data on CDS is granted through dbGaP-compiled whitelists approved under the NCI data access policies.

Once approved, users can access the data for analysis through a queryable web portal on the Seven Bridges Cancer Genomics Cloud, using dbGaP Study Accession phs001287.v16.p6.
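For readers who track access requests in scripts, it can help to split the accession cited above into its parts. dbGaP accessions follow the pattern phs<study number>.v<version>.p<participant set>, so phs001287.v16.p6 is version 16 of study phs001287 with participant set 6. The following is a minimal sketch of that parsing only; the names DbGapAccession and parse_dbgap_accession are hypothetical helpers written for illustration and are not part of any CRDC, dbGaP, or Seven Bridges tool.

```python
import re
from typing import NamedTuple


class DbGapAccession(NamedTuple):
    """Parts of a dbGaP study accession (hypothetical helper type)."""
    study_id: str         # e.g. "phs001287"
    version: int          # study version, the number after ".v"
    participant_set: int  # participant set, the number after ".p"


# Assumes the common form: "phs" + six digits, then ".v<N>.p<N>".
_ACCESSION_RE = re.compile(r"^(phs\d{6})\.v(\d+)\.p(\d+)$")


def parse_dbgap_accession(accession: str) -> DbGapAccession:
    """Split an accession string into study ID, version, and participant set."""
    match = _ACCESSION_RE.match(accession.strip())
    if match is None:
        raise ValueError(f"Not a full dbGaP study accession: {accession!r}")
    study_id, version, participant_set = match.groups()
    return DbGapAccession(study_id, int(version), int(participant_set))


if __name__ == "__main__":
    parsed = parse_dbgap_accession("phs001287.v16.p6")
    print(parsed.study_id, parsed.version, parsed.participant_set)
    # prints: phs001287 16 6
```

The version number changes as a study is updated, so comparing only the study ID (here phs001287) is usually the safer way to check whether an approval covers a given dataset; confirm the current version in dbGaP before requesting access.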