Population Science Data Commons (PSDC)
Enabling secure management and sharing of data

Overview

The Population Science Data Commons (PSDC) manages, houses, and shares data from NCI-funded population science research programs and awards. In addition to making population science data accessible, the PSDC supports integration with other Cancer Research Data Commons (CRDC) data. Before the PSDC was developed, no unified resource existed for storing, sharing, accessing, and analyzing population science data, or for integrating it with other data and analysis resources.

As part of the CRDC, the PSDC’s data can be linked to other data resources, including omics data from the same populations stored in other CRDC data commons, such as the Genomic Data Commons, the Proteomic Data Commons, or the General Commons.

The PSDC portal currently focuses on overall study data, not individual participant data. Descriptive summary data are searchable, allowing users to identify potential studies of interest. Parameters include study name, design, enrollment period, number of participants, cancer types, and the presence or absence of demographics such as age at enrollment, race, ethnicity, and sex. Users can see what study files are available to decide whether to access open data or request controlled-access data.  
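The search described above can be pictured as filtering study-level records by the listed parameters. The following is a minimal sketch only: the `StudySummary` class, the `find_studies` function, and the sample records are hypothetical illustrations, not the PSDC portal's data model or API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StudySummary:
    """Hypothetical study-level record; field names are assumptions."""
    name: str
    design: str
    enrollment_start: int       # first enrollment year
    enrollment_end: int         # last enrollment year
    participants: int
    cancer_types: set[str]
    demographics: set[str]      # demographic variables present in the study


def find_studies(studies, *, design=None, cancer_type=None,
                 min_participants=0, required_demographics=frozenset()):
    """Return the studies that satisfy every filter that was supplied."""
    matches = []
    for s in studies:
        if design is not None and s.design.lower() != design.lower():
            continue
        if cancer_type is not None and cancer_type.lower() not in {
            c.lower() for c in s.cancer_types
        }:
            continue
        if s.participants < min_participants:
            continue
        # Every requested demographic variable must be present.
        if not set(required_demographics) <= s.demographics:
            continue
        matches.append(s)
    return matches


# Placeholder records invented for illustration; not real PSDC studies.
studies = [
    StudySummary("Example Cohort A", "Cohort", 1990, 2000, 50000,
                 {"Breast", "Colorectal"}, {"age at enrollment", "sex", "race"}),
    StudySummary("Example Case-Control B", "Case-control", 2005, 2010, 1200,
                 {"Lung"}, {"age at enrollment", "sex"}),
    StudySummary("Example Cohort C", "Cohort", 2012, 2018, 8000,
                 {"Breast"}, {"sex"}),
]

large_breast_cohorts = find_studies(
    studies, design="cohort", cancer_type="breast", min_participants=10000
)
```

Because the filters operate on summary fields only, a search of this kind never touches individual participant data, which matches the portal's current study-level focus.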

Data Types

The PSDC is designed to manage and house several types of data, including, but not limited to:

  • Demographic data  
  • Survey and questionnaire data  
  • Clinical data, including tumor type, if applicable
  • Biomarker assays
  • Environmental exposure measurements
  • Dietary and anthropometric assessments

Data Submission

The PSDC is integrated into the CRDC Data Submission portal. Information about the data submission process and links to instructions can be found on the CRDC Submit Data page.  

Research teams start the process by completing the Data Submission Request found on the CRDC Submission Portal. The CRDC’s Submission Review Committee (SRC) reviews each request. Once the request is approved, data can be submitted.

Contact the CRDC Help Desk for information and support. 


Data Access

The PSDC portal provides a step-by-step guide for searching for studies of interest using various parameters, including study name, design, enrollment period, number of participants, cancer types, and the presence or absence of demographics such as age at enrollment, race, ethnicity, and sex. Learn more on the PSDC Explore page.

To use controlled-access data, researchers must first obtain authorization from the NCI Data Access Committee (DAC). The DAC manages the authorization process for the Database of Genotypes and Phenotypes (dbGaP), maintained by the National Center for Biotechnology Information (NCBI). Users request access to controlled data in the PSDC through dbGaP.

The PSDC also hosts open-access data, which is publicly accessible and requires no authorization.


Data Analysis

The Seven Bridges Cancer Genomics Cloud, powered by Velsera (SB-CGC), collaborates with the PSDC to facilitate access to its data for analysis. SB-CGC offers secure personal workspaces on the AWS cloud platform as well as publicly available analytical tools shared by the research community. Users create a manifest of files of interest through the PSDC portal and, with one click, can access those files for analysis within the secure SB-CGC environment. Learn more about working in the SB-CGC environment.
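Conceptually, a manifest is a list of the files a user has selected. The sketch below shows one way such a list could be written as CSV. The source does not describe the actual PSDC manifest format, so the column names (`file_id`, `file_name`, `access`), the `build_manifest` function, and the sample files are all assumptions made for illustration.

```python
import csv
import io


def build_manifest(files, selected_ids):
    """Return CSV text listing only the selected files (hypothetical format)."""
    selected = set(selected_ids)
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["file_id", "file_name", "access"], lineterminator="\n"
    )
    writer.writeheader()
    for f in files:
        if f["file_id"] in selected:
            # Write only the manifest columns, ignoring any extra keys.
            writer.writerow({k: f[k] for k in writer.fieldnames})
    return buf.getvalue()


# Placeholder file records invented for illustration.
files = [
    {"file_id": "file-001", "file_name": "example_summary.csv", "access": "open"},
    {"file_id": "file-002", "file_name": "example_survey.csv", "access": "controlled"},
    {"file_id": "file-003", "file_name": "example_assays.csv", "access": "controlled"},
]

manifest_text = build_manifest(files, ["file-001", "file-003"])
```

In practice the portal generates the manifest and hands it to SB-CGC, so users would not need to build one by hand.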

Learn more about PSDC