CRDC Insights

Updates from the Cancer Research Data Commons:
Empowering the Scientific Community to Make New Discoveries

May 02, 2024

2023 ITCR Annual Meeting, CRDC Workshop: Advancing Tools, Sustainability, and Equity in Cancer Research Informatics

Image from the ITCR presentation quoted in this article: a network with multiple inputs to the nodes for Data Preparation, MFA Analysis, and Pathways Analysis.

Two CRDC teams presented at the 2023 NCI Informatics Technology for Cancer Research (ITCR) Annual Meeting in mid-September. Both presentations focused on supporting researchers who want to create tools on CRDC platforms, or bring their own, for tailored inquiries.

CRDC’s Erin Beck co-presented with Rowan Beck (no relation) from the Seven Bridges Cancer Genomics Cloud (SB-CGC), powered by Velsera, one of three NCI-funded cloud resources included in the CRDC. Together they demonstrated how researchers can bring their own tools to the SB-CGC and adapt existing workflows available in the SB-CGC cloud environment. As an example, they showcased work done and published by Dr. Daoud Meerzaman’s NCI lab. His lab used the Common Workflow Language (CWL) to build a user-friendly version of MOPAW, an automated multi-omics workflow on the Cancer Genomics Cloud, which was recently published in Cancer Informatics. They also showed how they worked with his lab to convert the Bioconductor package OmicCircos into an interactive R Shiny app on the SB-CGC, designed for the circular visualization of multidimensional omics data.

The presentation is available on the Velsera website. For additional questions or to request a demonstration, contact the SB-CGC team by email or join its online office hours.

Several team members with the Genomic Data Commons (GDC) gave a preview of the new GDC Version 2 Data Portal and the Analysis Tool Framework (ATF), which will be made available to the research community for feedback as a “soft launch” in January 2024. GDC Version 2 expands on the current GDC Data Portal launched by the NCI Center for Cancer Genomics (CCG) in 2016 by providing a cohort-centric design along with access to multiple, independent apps for more modular data analysis. These apps interoperate with the GDC using the ATF, a React-based framework that abstracts cohort and data access and interacts with the GDC Application Programming Interface (API) for data retrieval and analysis.  
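To make the idea of cohort-centric data retrieval more concrete, the sketch below shows roughly what a cohort-style request to the public GDC API can look like. It is not the ATF itself, which is React-based and whose internals are not described here; it is only an illustration in Python. The helper names build_cohort_filter and build_cases_url are hypothetical, the project and primary-site values are examples, and the code only constructs the request URL rather than sending it.

```python
import json
from urllib.parse import urlencode

# Public GDC API endpoint for case records.
GDC_CASES_ENDPOINT = "https://api.gdc.cancer.gov/cases"


def build_cohort_filter(project_ids, primary_sites=None):
    """Hypothetical helper: describe a cohort as a GDC-style nested filter.

    Each field gets an "in" clause, and the clauses are combined with "and".
    """
    clauses = [
        {
            "op": "in",
            "content": {"field": "project.project_id", "value": list(project_ids)},
        }
    ]
    if primary_sites:
        clauses.append(
            {
                "op": "in",
                "content": {"field": "primary_site", "value": list(primary_sites)},
            }
        )
    return {"op": "and", "content": clauses}


def build_cases_url(cohort_filter, fields, size=10):
    """Hypothetical helper: build a GET URL for the cases endpoint (no request is sent)."""
    params = {
        "filters": json.dumps(cohort_filter),  # filters travel as a JSON string
        "fields": ",".join(fields),            # comma-separated list of fields to return
        "format": "JSON",
        "size": str(size),                     # maximum number of records to return
    }
    return f"{GDC_CASES_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    cohort = build_cohort_filter(["TCGA-BRCA"], primary_sites=["Breast"])
    print(build_cases_url(cohort, ["case_id", "submitter_id"], size=5))
```

Keeping the cohort definition separate from the request that uses it mirrors the cohort-centric design described above: the same filter could, in principle, be handed to several independent analysis apps.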

The GDC is developing a video and webinar series to coincide with the GDC Version 2 “soft launch.”

ITCR is a trans-NCI program supporting investigator-initiated, research-driven informatics technology development spanning all aspects of cancer research. CRDC, as part of the NCI Center for Biomedical Informatics and Information Technology (CBIIT), coordinates program activities across the four participating NCI extramural divisions: the Division of Cancer Biology (DCB), the Division of Cancer Control and Population Sciences (DCCPS), the Division of Cancer Prevention (DCP), and the Division of Cancer Treatment and Diagnosis (DCTD).

As Erin Beck notes, “One of the big topics this year was sustainability. Tool developers want to find better ways to keep tools current and useful for the benefit of the greater cancer research community. Sustainability should be considered at the beginning of the project to ensure the longevity and community reach of the tool. Too often a tool is created without a plan for sustainability and over time the code becomes obsolete or it’s no longer easily accessible to the community.”  

CRDC’s cloud resources (CRs) are already playing an important role in addressing these concerns. They make thousands of tools available: apps, pipelines, and workflows that have been developed and released by researchers willing to share them with the research community. Anyone using the CRs can tailor these tools as needed, or create new tools within a secure environment until they are ready to release them publicly.

Another significant topic at this year’s ITCR meeting was how socio-economic factors, race, and ethnicity are described in research in general, with a focus on how nuance in those descriptions can be preserved, or lost, in the development of analytical tools, down to the code used to represent these variables.

As Erin Beck noted, “The best tool or app is only as good as the human who built it, so it was encouraging to be part of a conversation among tool developers who are aware of their role in ensuring that biomedical and population health research is as unbiased as possible.”