Data Release Updates

This aggregated page of listings is updated regularly. These listings derive from individual data commons or cloud resource websites and portals.

Learn more about accessing all data on our How to Access Data page.  

 


Genomic Data Commons (GDC)

The following data were released on March 29, 2023.   

  • New Projects

    • APOLLO-LUAD - Proteogenomic characterization of lung adenocarcinoma - phs003011
      • 87 cases
      • WGS, RNA-Seq
    • CGCI-HTMCP-LC - HIV+ Tumor Molecular Characterization Project - Lung Cancer - phs000530
      • 39 cases
      • WGS, RNA-Seq, miRNA-Seq, Slide Images
    • MATCH-Q - Genomic Characterization CS-MATCH-0007 Arm Q - phs001926
      • 35 cases
      • WXS, RNA-Seq
    • MATCH-Y - Genomic Characterization CS-MATCH-0007 Arm Y - phs001904
      • 31 cases
      • WXS, RNA-Seq
  • New Data from Existing Projects

    • CPTAC-3 - 139 new cases and two new snRNA-Seq samples
    • HCMI-CMDC - 118 new cases
    • TCGA-THCA - 941 new WGS alignments
    • TARGET-OS and TARGET-ALL-P2 - Masked Somatic Mutation MAFs are now open access and their mutations now appear in the exploration portal.
  • Data Migrated from the Legacy Archive to Active Portal

    • Birdseed files that were generated from Affymetrix SNP6 arrays
    • Additional WGS Alignments are now available for TCGA projects
    • Additional samples from RNA-Seq and WXS are now available for TCGA projects

Find the complete list of datasets available on the GDC portal.
 


Proteomic Data Commons (PDC)

New datasets as of April 2023 include: 

Proteome

  • Description:  CPTAC Pan-Cancer Proteome data processed by the University of Michigan team's pipeline, and then post-processed by the Baylor College of Medicine's pipeline. Details can be found in the STAR Methods of 'Proteogenomic Data and Resources for Pan-Cancer Analysis' (i.e., 'BCM pipeline for pan-cancer multi-omics data harmonization').

    Cohorts:  BRCA, ccRCC, COAD, GBM, HGSC, HNSCC, LSCC, LUAD, PDAC, UCEC
     
  • Description:  CPTAC Pan-Cancer Proteome data processed by the Broad Institute using the Spectrum Mill pipeline. Details can be found in the STAR Methods of 'Pan-Cancer analysis of post-translational modifications reveals shared patterns of protein regulation' (i.e., 'Proteomics data processing').

    Cohorts:  BRCA, ccRCC, COAD, GBM, HGSC, HNSCC, LSCC, LUAD, PDAC, UCEC, MB
     
  • Description:  CPTAC Pan-Cancer Proteome data processed by the University of Michigan team's pipeline. Details can be found in the STAR Methods of 'Proteogenomic Data and Resources for Pan-Cancer Analysis' (i.e., 'Global proteomics and phosphoproteomics (pipeline from the University of Michigan)').

    Cohorts:  BRCA, ccRCC, COAD, GBM, HGSC, HNSCC, LSCC, LUAD, PDAC, UCEC
     
  • Description:  CPTAC Pan-Cancer Proteome data processed by the University of Michigan team's pipeline, and then pre-processed by the Mount Sinai team's pipeline. Details can be found in the STAR Methods of 'Proteogenomic Data and Resources for Pan-Cancer Analysis' (i.e., 'Abundance table preprocessing').

    Cohorts:  BRCA, ccRCC, COAD, GBM, HGSC, HNSCC, LSCC, LUAD, PDAC, UCEC

Phosphoproteome

  • Description:  CPTAC Pan-Cancer Phosphoproteome data processed by the University of Michigan team's pipeline. Details can be found in the STAR Methods of 'Proteogenomic Data and Resources for Pan-Cancer Analysis' (i.e., 'Global proteomics and phosphoproteomics (pipeline from the University of Michigan)').

    Cohorts:  BRCA, ccRCC, COAD, GBM, HGSC, HNSCC, LSCC, LUAD, PDAC, UCEC
     
  • Description:  CPTAC Pan-Cancer Phosphoproteome data processed by the University of Michigan team's pipeline, and then post-processed by the Baylor College of Medicine's pipeline. Details can be found in the STAR Methods of 'Proteogenomic Data and Resources for Pan-Cancer Analysis' (i.e., 'BCM pipeline for pan-cancer multi-omics data harmonization').

    Cohorts:  BRCA, ccRCC, COAD, GBM, HGSC, HNSCC, LSCC, LUAD, PDAC, UCEC
     
  • Description:  CPTAC Pan-Cancer Phosphoproteome data processed by the University of Michigan team's pipeline, and then pre-processed by the Mount Sinai team's pipeline. Details can be found in the STAR Methods of 'Proteogenomic Data and Resources for Pan-Cancer Analysis' (i.e., 'Abundance table preprocessing').

    Cohorts:  BRCA, ccRCC, COAD, GBM, HGSC, HNSCC, LSCC, LUAD, PDAC, UCEC
     
  • Description:  CPTAC Pan-Cancer Phosphoproteome data processed by the Broad Institute using the Spectrum Mill pipeline. Details can be found in the STAR Methods of 'Pan-Cancer analysis of post-translational modifications reveals shared patterns of protein regulation' (i.e., 'Proteomics data processing').

    Cohorts:  BRCA, ccRCC, COAD, GBM, HGSC, HNSCC, LSCC, LUAD, PDAC, UCEC, MB

Glycoproteome

  • Description:  CPTAC Pan-Cancer Glycoproteome data processed by the Johns Hopkins University team's pipeline. Details can be found in the STAR Methods of 'Proteogenomic Data and Resources for Pan-Cancer Analysis' (i.e., 'Glycoproteomics (pipeline from Johns Hopkins University)').

    Cohorts:  BRCA, ccRCC, COAD, GBM, HGSC, HNSCC, LSCC, LUAD, PDAC, UCEC

Acetylome

  • Description:  CPTAC Pan-Cancer Acetylome data processed by Common Data Analysis Pipelines (CDAP). Details can be found at https://proteomic.datacommons.cancer.gov/pdc/

    Cohorts:  BRCA, GBM, LSCC, LUAD, UCEC
     
  • Description:  CPTAC Pan-Cancer Acetylome data processed by the University of Michigan team's pipeline. Details can be found in the STAR Methods of 'Proteogenomic Data and Resources for Pan-Cancer Analysis' (i.e., 'Global proteomics and phosphoproteomics (pipeline from the University of Michigan)').

    Cohorts:  BRCA, GBM, LSCC, LUAD, UCEC
     
  • Description:  CPTAC Pan-Cancer Acetylome data processed by the Broad Institute using the Spectrum Mill pipeline. Details can be found in the STAR Methods of 'Pan-Cancer analysis of post-translational modifications reveals shared patterns of protein regulation' (i.e., 'Proteomics data processing').

    Cohorts:  BRCA, GBM, LSCC, LUAD, UCEC, MB

Ubiquitylome

  • Description:  CPTAC Pan-Cancer Ubiquitylome data processed by Common Data Analysis Pipelines (CDAP). Details can be found at https://proteomic.datacommons.cancer.gov/pdc/

    Cohorts:  LSCC

Find the complete list of datasets available on the PDC portal.


Imaging Data Commons (IDC)

New datasets include: 

  • CT-vs-PET-Ventilation-Imaging
  • CTpred-Sunitinib-panNET
  • AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography imaging collections

Updated collections include:

  • CMB-CRC
  • CMB-LCA
  • CMB-MEL
  • CMB-PCA
  • Pancreatic-CT-CBCT-SEG

Find the complete list of datasets available on the IDC portal.


Integrated Canine Data Commons (ICDC)

New study released May 24, 2023:

OSA03 - “Comparative analysis using whole genome bisulfite sequencing of human and canine osteosarcoma”

  • Adds 44 cases 
  • Adds 88 files (~73 GB)  

Find the complete list of datasets available on the ICDC portal. 


Cancer Data Service (CDS)

New datasets include:

  • CIDR: Discovery, Biology, and Risk of Inherited Variants in Glioma (phs002250.v1.p1)
  • Detection of Colorectal Cancer Susceptibility Loci Using Genome-Wide Sequencing (phs001554.v1.p1)
  • Development of A Tumor Molecular Analyses Program and Its Use to Support Treatment Decisions (phs001713.v1.p1)
  • Pediatric Preclinical Testing Consortium (phs001437.v1.p1)
  • The Genetic Basis of Aggressive Prostate Cancer, The Role of Rare Variation (phs001524.v1.p1)
  • Discovery of colorectal cancer susceptibility genes in high-risk families (phs001787.v1.p1)
  • Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (phs002011.v1.p1)
  • Whole genome sequencing to discover familial myeloma risk genes (phs001819.v2.p1)
  • Feasibility and Clinical Utility of Whole Genome Profiling in Pediatric and Young Adult Cancers (phs002620.v1.p1)
  • NCI CCSG CCDI Supplement Additional Genomic Submission (phs002599.v1.p1)
  • Molecular Pathological Epidemiology of Colorectal Cancer (phs002050.v2.p1)
  • Human Tumor Atlas Network (phs002371.v1.p1)
  • NCI UCSF CCDI Database for the Advancement of JMML - Integration of Metadata with "Omic" Data (phs002504.v1.p1)
  • Childhood Cancer Data Initiative (CCDI): Molecular Characterization Initiative (phs002790.v1.p1)
  • CCDI Data Sharing for Pediatric Cancers USC (phs003111)
  • CCDI Free the Data: Open Sharing of Comprehensive Genomic Childhood Cancer Datasets (Kansas) (phs002529)
  • OncoKids - NGS Panel for Pediatric Malignancies (CCDI) (phs002518)
  • Wistar PDX Development and Trial Center (phs002432)
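The identifiers in parentheses above are dbGaP study accessions. Some entries carry a version and participant-set suffix (for example, .v1.p1) and some list only the bare study ID. As a minimal sketch for anyone tracking these listings programmatically, the Python below splits an entry into those parts. The names Accession and parse_accession are hypothetical helpers introduced here for illustration; they are not part of any CDS or dbGaP tool.

```python
import re
from typing import NamedTuple, Optional


class Accession(NamedTuple):
    """Parts of a dbGaP study accession such as phs002250.v1.p1."""
    study: str                       # e.g. "phs002250"
    version: Optional[int]           # None when the listing omits ".vN"
    participant_set: Optional[int]   # None when the listing omits ".pN"


# "phs" plus six digits, optionally followed by ".v<version>.p<participant set>".
_ACCESSION_RE = re.compile(r"\bphs(\d{6})(?:\.v(\d+)\.p(\d+))?\b")


def parse_accession(text: str) -> Optional[Accession]:
    """Return the first accession found in text, or None if there is none."""
    match = _ACCESSION_RE.search(text)
    if match is None:
        return None
    digits, version, pset = match.groups()
    return Accession(
        study=f"phs{digits}",
        version=int(version) if version else None,
        participant_set=int(pset) if pset else None,
    )


if __name__ == "__main__":
    # Entries copied from the list above.
    entries = [
        "Human Tumor Atlas Network (phs002371.v1.p1)",
        "Whole genome sequencing to discover familial myeloma risk genes (phs001819.v2.p1)",
        "Wistar PDX Development and Trial Center (phs002432)",
    ]
    for entry in entries:
        print(parse_accession(entry))
```

Returning None for a missing version, rather than assuming v1, keeps the parser from inventing a version the listing does not state.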

Access to CDS datasets is through the Seven Bridges (Velsera) CDS page, which requires a login.
The CDS portal will go live in June 2023.
Learn more about the CDS on its web page.