CRDC Insights

Updates from the Cancer Research Data Commons:
Empowering the Scientific Community to Make New Discoveries

CRDC Components: Updates

December 15, 2024

Four of CRDC’s Data Commons provide updates. From the GDC: new webinars, data dictionary updates, and an updated portal. From the PDC: new types of data available, a new tab for exploration of treatment data, plus a new publication. From the IDC: 25,000 new pathology images added from three large datasets. From the ICDC: a new software release for easier search and analysis.


Genomic Data Commons (GDC)

  • The GDC Analysis Tool Challenge: Registration is open until November 25th.

    The NCI GDC Analysis Tool Challenge is a collaborative competition to enhance cancer research by integrating innovative analysis tools with the GDC. Participants will use data available in the GDC to develop their tools, and winning tools will be featured in the GDC Data Portal Analysis Center. This initiative aims to broaden the accessibility and utility of advanced analytical tools for cancer researchers globally. Learn more and register.
     
  • Webinar: Integrating Analysis Tools with GDC 2.0: Harnessing the Power of the GDC SDK

    This webinar introduces the GDC Analysis Tool Software Development Kit (SDK), which allows GDC and third-party applications to operate within the GDC Data Portal. The GDC Analysis Tool Challenge requires participants to integrate their tools with the GDC using the SDK. Learn more about the SDK in the GDC Developer’s Guide.
     
  • GDC Data Dictionary updates

    Updates were released in September, adding properties that further refine TCGA clinical data. Release details are available on the GDC website.
     
  • Updated GDC Data Portal (2.3.0)

    This release introduces features that enhance the user experience and improve data analysis:

    o   Cohort Builder Enhancements: Custom case, gene, and mutation filters, and reorganization of disease-specific classification filters

    o   Filter Reset Improvements: Enhanced reset options, including the ability to reset filters to default and to maintain local filters when the cohort changes, plus the ability to collapse or expand facet filters

    o   Case Summary Updates: Other clinical attributes now included in case summary pages

    o   Clinical Data Analysis Tool Enhancements: Numerical data rounding added where applicable for simplified results

    o   OncoMatrix Update: Now supports custom gene sets

    o   Gene Expression Clustering Tool: Improved performance, with default plots displaying the top 1,000 variably expressed genes and the first 1,000 cases, plus the ability to customize the clustering plot color scheme

    This release, named after renowned scientist Linus Pauling, continues GDC's tradition of honoring distinguished researchers. Read more and find the Release Notes.
     
  • Upcoming Webinar: Sequence Reads & BAM Slicing Tools

    Scheduled for November 18 at 2 p.m. ET, this new webinar is for researchers who visualize DNA/RNA reads or target specific genes, SNPs, or variants. Learn more and register.

Proteomic Data Commons (PDC)  

  • New data types: Proteomic Data Commons now supports metabolomics and lipidomics data

    The PDC is the first database of its kind to present metabolomics and lipidomics data along with proteomics data from the same sample cohort. These new data types include raw mass spectrometry files, harmonized metadata, and any processed data provided by data submitters as supplementary data. When integrated with other omics data, metabolomics and lipidomics data contribute to a more complete view of tumor biology and the microenvironment.

    Read more and find a summary of currently available metabolomics and lipidomics data through the PDC.
     
  • New publication: NCI's Proteomic Data Commons: A Cloud-Based Proteomics Repository Empowering Comprehensive Cancer Analysis Through Cross-Referencing with Genomic and Imaging Data

    Published in Cancer Research Communications, a journal of the American Association for Cancer Research, the paper highlights the PDC’s role in cancer research as a centralized repository of high-quality cancer proteomic data enriched with extensive clinical annotations. By integrating and cross-referencing with complementary genomic and imaging data, the PDC facilitates multi-omics analyses, driving comprehensive insights and accelerating discoveries across various cancer types.

    Read more about this publication on the Office of Cancer Clinical Proteomics Research website.
     
  • Treatment Data Display and Distribution

    A new data field and tab have been added to the Case Summary page to display information related to clinical treatment. Clinical data manifests for download or use through an NCI Cloud Resource can also include treatment files. Multiple treatment records for each participant/patient in the studies can be captured and presented on the PDC’s graphical user interface.
     
  • APOLLO Lung Adenocarcinoma (LUAD) Treatment Data

    Treatment data from the APOLLO LUAD cohort is now harmonized with the PDC data model, making it available for exploration on the PDC portal. Find the APOLLO LUAD data here.

Imaging Data Commons (IDC)

Release v19: Focus on Digital Pathology

More than 25,000 new pathology images are now publicly available on the IDC portal to explore, visualize, download, analyze, and reuse. New images are available from the following projects: 

  • The Genotype-Tissue Expression (GTEx) Project established a data resource and tissue bank to study the relationship between genetic variants and gene expression in multiple human tissues and across individuals. GTEx collected 26,468 unique tissue samples from more than 50 different tissue types from 956 donors. Learn more about this project and find links to the data.
     
  • The Cancer Moonshot Biobank (CMB) collects longitudinal clinical data and biospecimens shared by participants throughout their standard-of-care treatments at participating US medical institutions. Data will be collected longitudinally from at least 1,000 patients across at least 10 cancer types, who represent the demographic diversity of the US. Learn more and find currently available CMB imaging data.
     
  • The Molecular Characterization Initiative (MCI) is a component of the National Cancer Institute’s Childhood Cancer Data Initiative (CCDI). The MCI enhances the understanding of genetic factors in pediatric cancers and provides timely, clinically relevant findings to doctors and families to aid in treatment decisions and determine eligibility for clinical trials. Learn more and find MCI imaging data on the IDC portal.

Integrated Canine Data Commons (ICDC)

Release V4.1.0: Focus on easier data exploration and analysis  

The newest software update from the ICDC adds enhanced search and exploration capabilities, and makes it easier for researchers to export data to SB-CGC for analysis. The ICDC has also added a new version of JBrowse Genomic Viewer (2.11.0) to enhance performance. Additional new features and improvements include:   

  • Expandable sample profiles with interactive sample distributions and legends
     
  • New ways to export or download a file manifest for selected files or all files from within the My Files Cart page, including the option to export directly to the Seven Bridges Cancer Genomics Cloud (SB-CGC)
     
  • An updated Explore Dashboard page that makes it possible to copy the URL related to specific queries
     
  • The ability to activate a node traceback by selecting a node from the Data Model Navigator to better understand relationships within the data model
     
  • New videos to highlight scientific programs and contributing researchers within the ICDC Programs and Program Details pages
     
  • A new page to feature the ICDC Working Groups

Release notes are available on the ICDC’s GitHub.