A Look Ahead: Tanja Davidsen on 2024 CRDC Goals
Across the cancer research community, we all recognize the complexities of working with data that are vast, diverse, and continuously evolving. Contemporary research requires data at massive scale along with consistency in quality and interoperability standards to support sophisticated inquiries that result in improved diagnostics, disease management, and treatment for patients and populations. Tanja Davidsen is Chief of the Data Ecosystems Branch at the NCI Center for Biomedical Informatics and Information Technology. Her portfolio includes the Cancer Research Data Commons (CRDC).
IDC Team Demonstrates the Value of AI in Generating Imaging Annotations for Clinical Images
A team of researchers affiliated with the CRDC Imaging Data Commons (IDC) has demonstrated the value of AI in generating annotations for imaging data sets, which are now better situated to serve as reference data sets for users across the cancer research community. The paper, “Enrichment of lung cancer computed tomography collections with AI-derived annotations” by Krishnaswamy et al., was published in early January 2024 in Nature’s open access journal Scientific Data.
NCI CRDC Artificial Intelligence Data-Readiness (AIDR) Challenge
The National Cancer Institute (NCI) will be hosting the NCI Cancer Research Data Commons (CRDC) Artificial Intelligence Data-Readiness (AIDR) Challenge from March 4 through March 22, 2024, on the Seven Bridges Cancer Genomics Cloud (SB-CGC) powered by Velsera.
AACR Annual Meeting 2024: NCI-CRDC Session
As the NCI Cancer Research Data Commons (CRDC) enters its 10th year, CRDC leaders will provide an overview of work accomplished so far, CRDC’s impact on open science, and a look ahead to initiatives that ensure its foundational role in the cancer data ecosystem. The session will kick off with the history of CRDC including a review of key datasets housed in the CRDC and the core standards and services that ensure data are FAIR – Findable, Accessible, Interoperable, and Reusable.
CRDC at the 2024 AACR Annual Meeting
Members of the CRDC team presented at the 2024 AACR Annual Meeting in an NCI Sponsored Session focused on the Year of Open Science. They reviewed the evolving structure of the CRDC, showcased resources for researchers, highlighted success stories, and offered a look ahead to short- and longer-term initiatives to make NCI-funded research data FAIR: Findable, Accessible, Interoperable, and Reusable.
AACR's Cancer Research Journal: CRDC's Four-Part Series
A four-part manuscript series published in March 2024 in Cancer Research, one of the flagship journals of the American Association for Cancer Research (AACR), highlights the CRDC’s accomplishments from the past 10 years. The series’ lead authors and editors include Anthony (Tony) Kerlavage, Jill S. Barnholtz-Sloan, Tanja Davidsen, Erika Kim, David Pot, Arthur Brady, Erin Beck, Heather Creasy, and Zhining Wang.
Feature: New Computational Approach Improves on Identification of Biomarkers
Recent work by a consortium of researchers demonstrates that a novel, network-based analytical approach is effective in identifying predictive biomarkers of chemotherapy response in triple-negative breast cancer (TNBC) patient-derived xenograft (PDX) models. An update to this tool was presented at the 2024 AACR Annual Meeting.
The Cancer Data Aggregator: Tailored for All Skill Levels
The Cancer Data Aggregator (CDA) is a one-stop source for all types of users searching for data across the CRDC Data Commons. Its developers recognize that users have different interests and skill levels, so they have designed the CDA to be accessible and easy to use, regardless of computational expertise.
The Integrated Canine Data Commons (ICDC) Makes Data Exploration Easier
New features recently added to the Integrated Canine Data Commons (ICDC) portal make it easier for users to find and analyze data. The ICDC was established to advance research of human cancers by enabling comparative analysis with canine cancers, given similarities in types of cancer and response to therapeutics between canines and humans.
The CRDC's New Data Submission Portal
The CRDC has launched a new Submission Portal open to NCI- and NIH-funded researchers looking to share their data through the CRDC. The portal provides access to the two-step submission process, which consists of the Submission Request and the Data Submission. Users can find the portal through the updated Submit Data page on the CRDC website.
The CRDC 2024 Fall Symposium: October 16 - 17
The CRDC marks its 10th Anniversary with its 2024 Fall Symposium: Ten Years of Empowering Cancer Researchers. This is a one-and-a-half-day event highlighting accomplishments over the last 10 years and providing a look ahead to new initiatives. The symposium is open to the broad cancer research community and the public. It is available for in-person or online participation.
AI Data Readiness Challenge Winners
The winners of the recent NCI CRDC Artificial Intelligence (AI) Data Readiness Challenge presented their findings and recommendations for improving the AI readiness of CRDC data. In conducting their studies, participants were tasked with basing their recommendations on metrics such as availability, accuracy, completeness, and privacy.
Feature: Vaidhyanathan Mahaganapathy, Ph.D., Reflects on Support Received from NCI Cloud Resources
Dr. Mahaganapathy recently spoke with CRDC Insights, reflecting on the support he received during his dissertation research from the team at Seven Bridges Cancer Genomics Cloud (SB-CGC), powered by Velsera. He also commented on the complexity of working with huge datasets and trends in cancer research data science, and offered advice to graduate students looking to start a research career.
MIDI-B Challenge: Registration is Open
Registration is open through August 15th for the NCI Medical Image De-Identification Benchmark (MIDI-B) Challenge. The challenge is for developers (independent, academic, or industrial) who want to assess the performance of their automated image de-identification algorithms and software tools against a dataset of DICOM (Digital Imaging and Communications in Medicine) images that will be made available to challenge participants.
The CRDC Fall Symposium: Report
The CRDC marked its 10th Anniversary with its 2024 Fall Symposium. More than 500 registrants heard updates on the CRDC's accomplishments and previews of new initiatives. Featured speakers included Amanda Borens and NCI's Warren Kibbe. Several panels and presentations provided insights into the role CRDC will play in the evolving cancer research landscape.
Amrita Basu, Ph.D., Comments on Support Received from NCI Cloud Resources
Amrita Basu, Ph.D., with the University of California San Francisco (UCSF), recently spoke with CRDC Insights about using tools developed through a collaboration between UCSF and ISB-CGC to enhance the data infrastructure of the I-SPY 2 breast cancer research consortium. That work is proving useful as she leads a study collecting and analyzing complex quality-of-life data from patients in treatment.
The Clinical and Translational Data Commons Has Publicly Launched
The newest of CRDC’s Data Commons, the Clinical and Translational Data Commons (CTDC), launched publicly in September. The CTDC was developed to house a wide array of clinical and translational data from NCI-funded clinical trials, correlative studies, and interventional studies.
CRDC Components: Updates
Four of CRDC’s Data Commons provide updates. From the GDC: new webinars, data dictionary updates, and an updated portal. From the PDC: new types of data available, a new tab for exploration of treatment data, plus a new publication. From the IDC: 25,000 new pathology images added from three large datasets. From the ICDC: a new software release for easier search and analysis.
CRDC Collaborations: Updates
The CRDC collaborates with many NIH programs, offering its data, resources, and expertise. In this issue: an update on the MIDI-B Challenge, which focused on automated image de-identification. Also in this issue: an update on the Advanced Research Projects Agency for Health (ARPA-H) and the work to create a Biomedical Data Fabric (BDF) Toolbox that will make it easier to share and integrate data, analytical tools, and technologies.
NIH Data Management and Sharing: CRDC's Role
The new NIH Data Management and Sharing policy is in effect, and applies to new grant applications, competitive renewals, and competitive revisions. In brief:
• Data sharing now pertains to all researchers, with no budget minimum.
• Applications, renewals, or revisions must include a data management and sharing plan.
• Data must be shared at time of publication or by the end of the performance period, whichever is sooner.
The Cancer Research Data Commons (CRDC) is home to a collection of data commons and cloud resources that host datasets from NCI-funded research and make those datasets accessible to the research community.
CRDC at the 2023 AACR Annual Meeting
The AACR Annual Meeting is being held April 14-19, virtually and in person at the Orange County Convention Center, Orlando, Florida. Several sessions feature colleagues whose presentations focus on using CRDC resources, including a special session on Saturday focused on using HTAN data.
CRDC Resources in the Classroom
Faculty members and data scientists with Purdue University and Velsera (Seven Bridges) teamed up to produce a four-part online workshop that introduces the Cancer Genomics Cloud. The series also provides hands-on lessons in bulk and single-cell RNA-seq analysis using datasets provided by Purdue researchers.
Nature Communications: Slim, a New Slide Microscopy Viewing Tool
A March 2023 Nature Communications publication details the development of a new slide microscopy viewer and annotation tool. Researchers used datasets housed in the Imaging Data Commons to develop this tool that supports integrated research across radiology and pathology teams.
The Cancer Data Service Portal Goes Live
The Cancer Data Service (CDS) is a data commons within the Cancer Research Data Commons (CRDC) infrastructure. It adds important capacity and flexibility to the NCI data ecosystem. CDS provides secure data storage and sharing capabilities for NCI-funded studies that fall under the following categories:
• Studies with data that do not match an existing CRDC data commons
• Studies with data that do not fit current data type criteria and/or the minimum metadata standards for a CRDC data commons
Imaging Data Commons (IDC): Now Available Through AWS
The Imaging Data Commons (IDC) recently announced a new partnership with the Amazon Web Services (AWS) Open Data Sponsorship Program. Through this partnership, IDC is making versioned IDC data available from AWS S3 buckets.
Data-Driven Research Improves Clinical Care: Interview with Jeffrey Dome, MD, PhD
Jeffrey Dome, MD, PhD, recently spoke with the CRDC team about his work and that of his peers as part of the Children’s Oncology Group (COG) Wilms Tumor program as well as with the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) program. His work focuses on clinical care informed by an understanding of the genetic and ‘omics’ landscape of Wilms tumor (WT), the most common pediatric renal (kidney) cancer. Dr. Dome is the Senior Vice President, Center for Cancer and Blood Disorders, and Division Chief, Oncology, at Children’s National Hospital, Washington, DC.
AACR Annual Meeting 2023 Report (July 6, 2023)
The American Association for Cancer Research (AACR) Annual Meeting was held in April 2023 in Orlando, Florida, attracting more than 21,000 attendees from national and international cancer research communities. NCI staff share their takeaways from the meeting, with a focus on data sharing and making CRDC resources more accessible.
Integrated Canine Data Commons (ICDC) in the News: The Washington Post
The Washington Post recently ran a story quoting Amy LeBlanc, DVM, director of NCI’s Comparative Oncology Program and a member of the Integrated Canine Data Commons (ICDC) Steering Committee. Dr. LeBlanc is also a member of the NCI Senior Advisory Committee that serves as the final point of approval and prioritization for incoming data submission requests for the ICDC.
NCI CRDC Partners with ARPA-H on the Biomedical Data Fabric Toolbox Program
The Advanced Research Projects Agency for Health (ARPA-H) announced a new program, the Biomedical Data Fabric (BDF) Toolbox, on September 13th. The announcement followed one from the White House, which described the program as one that will thread together research data across the country to help the scientific community find and share medical insights more easily, break down data research silos, and deliver health solutions to people more quickly.
GA4GH 2023: Report Out From the CRDC
At its September 2023 annual meeting, GA4GH announced 10 new driver projects. Additionally, conference participants reviewed an updated Data Repository Service (DRS) v1.4 specification. Read more and find links to the final NIH keynote, as well as a fireside chat between the GA4GH outgoing Chair and the incoming Chair.
2023 ITCR Annual Meeting, CRDC Workshop: Advancing Tools, Sustainability, and Equity in Cancer Research Informatics
Two CRDC teams presented at the NCI Informatics Technology for Cancer Research (ITCR) Annual Meeting in mid-September. Both presentations focused on supporting researchers who want to create or bring their own tools to CRDC platforms for tailored inquiries.
Broad FireCloud and the PDC Offer New Way to Transfer Data to the Cloud
The recently released Portable Format for Bioinformatics (PFB) from the Broad Institute's FireCloud is a new way to transfer data from the Proteomic Data Commons (PDC) to the FireCloud platform. With PFB, users can easily identify the data they need within the PDC by applying metadata filters, and then with a single click, the data will be transferred to FireCloud for analysis.