CRDC Insights

Updates from the Cancer Research Data Commons:
Empowering the Scientific Community to Make New Discoveries

Data Sharing, Simplified: The CRDC Submission Portal at Two Years

June 15, 2026

The NIH Data Management and Sharing (DMS) Policy sets clear expectations for NIH-funded research: scientific data should be made publicly accessible in a timely and responsible manner. The Cancer Research Data Commons (CRDC) Submission Portal makes data sharing easier by providing cancer researchers with a single, unified entry point for submitting data across multiple CRDC Data Commons. Over the past two years, teams across the cancer research community have used the CRDC Submission Portal and are now sharing their experiences.

One Portal, Seven Data Commons 

One Data Submission Request Process

The CRDC includes seven Data Commons, each focused on different types of cancer research data, ranging from genomic and proteomic to imaging and clinical trial data. In the past, researchers had to determine which data commons was the best fit for their data and then navigate each group’s distinct submission process. The CRDC Submission Portal has simplified this process. Research teams now complete a single Submission Request form, and experts from all of the CRDC’s Data Commons collaboratively decide where the data should go.

Using the CRDC Submission Request form, research teams share details about their study and confirm that their data are de-identified, meaning personally identifiable information (PII) has been removed.

Christopher Szot, PhD, Scientific Project Manager with the Frederick National Laboratory for Cancer Research, spoke about his team’s experience submitting data from the NCI Molecular Profiling to Predict Response to Treatment program (MP2PRT). 

“We benefited from the joint review as we hadn’t realized that some of our data could go to the Genomic Data Commons (GDC) while other data was better suited for the Imaging Data Commons (IDC). For researchers considering submitting data, having one request process for all CRDC Data Commons means you don’t have to ‘shop’ your data around or complete multiple submission forms.”

Streamlined Research Data Submission Process

Submitting data through the CRDC Submission Portal takes four steps: researchers upload their data, run the portal's validation checks, resolve any validation errors, and click Submit.

The CRDC provides resources, such as data dictionaries and metadata templates, that explain the CRDC data standards used to structure data for submission. If a vocabulary term that researchers need is missing, they can request that it be added so that their data can meet the required CRDC standards. Passing the validations performed on the portal confirms that the standards are met. Metadata can be uploaded through the portal, and the data files themselves can be uploaded with the command-line interface (Uploader CLI) tool.
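As a rough illustration of what dictionary-based validation involves, the following minimal Python sketch checks a tab-separated metadata table against a small set of rules before upload. It is not the portal's validation logic: the dictionary layout, the property names (participant_id, sex, primary_site), and the permissible values are all hypothetical and would need to be replaced with the terms from the actual CRDC data dictionary for a given Data Commons.

```python
import csv
import io

# Hypothetical data dictionary (an assumption for illustration only; these
# property names and permissible values are NOT taken from the CRDC).
DATA_DICTIONARY = {
    "participant_id": {"required": True, "permissible": None},
    "sex": {"required": True, "permissible": {"Female", "Male", "Unknown"}},
    "primary_site": {"required": False, "permissible": None},
}


def validate_rows(tsv_text, dictionary):
    """Return a list of (row_number, property, message) tuples, one per problem."""
    errors = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row_number, row in enumerate(reader, start=1):
        for prop, rule in dictionary.items():
            # A missing column and an empty cell are treated the same way.
            value = (row.get(prop) or "").strip()
            if not value:
                if rule["required"]:
                    errors.append((row_number, prop, "missing required value"))
                continue
            allowed = rule["permissible"]
            if allowed is not None and value not in allowed:
                errors.append(
                    (row_number, prop, f"'{value}' is not a permissible value")
                )
    return errors


# Made-up sample metadata: row 2 uses a wrong-case term, row 3 lacks an ID.
SAMPLE_TSV = (
    "participant_id\tsex\tprimary_site\n"
    "P-001\tFemale\tLung\n"
    "P-002\tfemale\t\n"
    "\tMale\tBreast\n"
)

if __name__ == "__main__":
    for row_number, prop, message in validate_rows(SAMPLE_TSV, DATA_DICTIONARY):
        print(f"row {row_number}: {prop}: {message}")
```

A local pre-check of this kind may shorten the cycle of uploading, validating, and fixing errors, but passing it says nothing about whether the portal's own validations will pass.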

The portal allows submitters to review their data before the data are shared. They can check both their research data and metadata to make sure everything is correct and complete. After submission, CRDC data curators perform additional validations as a further check. Once the data are released on CRDC Data Commons, submitters can also download records and view the unique IDs that the CRDC assigned to their files, making it easier to update their data over time.

Always a Human in the Loop 

The data submission process can be navigated with little supervision, but there is always the option of support from a Data Concierge service. After a data submission request is approved, each submitting team is paired with a dedicated subject matter expert who can support them through the process.

Justin Kirby, Program Manager and Director of the Cancer Imaging Informatics Laboratory at the Frederick National Laboratory for Cancer Research, recently commented on his experience submitting data to the CRDC. 

“A key feature offered by the CRDC Submission Portal is the service layer. It’s not like ‘here’s a website and go figure it out.’ This is a big difference between the CRDC and other generalist data repositories.”

This support matters because submissions must conform to CRDC standards so that the data are searchable and usable by the research community. As the CRDC evolves, it welcomes input and feedback about its data models, common data elements, and processes.

Dr. Szot also noted that the CRDC's metadata requirements are extremely comprehensive.

“While it may take more time to prepare data for a submission, it means more researchers will benefit from rich datasets. In thinking about other researchers who may want to access the data, you definitely want to have as much context and information as you can. With this thorough process, the CRDC makes it more likely that others will reference and reuse the data in their research.”

Built on Collaboration, Aligned with Open Science 

Data submission through the CRDC Submission Portal is supported by ongoing collaboration across NCI, NIH, and HHS teams, as well as the broader cancer research community. This includes work organized by the NCI Center for Biomedical Informatics and Information Technology (CBIIT) Semantics Infrastructure (SI) team as well as collaborative work integrating exciting new tools from ARPA-H. The CRDC actively participates in developing data-sharing standards with international groups like the Global Alliance for Genomics and Health (GA4GH).

What's Coming Next to the CRDC Submission Portal

The CRDC team is continuing to improve the portal. Planned improvements include an AI-powered chatbot to guide researchers through the submission process while complementing the support from the Data Concierge team. The CRDC team also continues to enhance the CRDC Submission Portal features to provide a more unified data submission process, including a common data model across all seven Data Commons.

From Submission to Discovery: The CRDC Ecosystem

For NCI-funded researchers, the CRDC Submission Portal is more than a way to meet NIH data-sharing requirements. It’s the first step toward making research data broadly accessible through the CRDC Data Commons, searchable through the Cancer Data Aggregator, and ready for analysis through the NCI’s cloud resource, Seven Bridges Cancer Genomics Cloud (SB-CGC), from Velsera. This CRDC ecosystem broadens access to valuable datasets for new discoveries, enables validation of research results, and promotes data reuse for future studies.
