Imaging Data Commons (IDC)
Overview
NCI Imaging Data Commons (IDC) is a cloud-based repository of publicly available cancer imaging data, co-located with tools and resources for analysis and exploration. IDC is a node within the broader NCI Cancer Research Data Commons (CRDC) infrastructure, which provides secure access to a large, comprehensive, and expanding collection of cancer research data.
All data hosted by IDC is available publicly. The current content of IDC is populated using the radiology collections from The Cancer Imaging Archive (TCIA), as well as data collected by other major NCI initiatives, such as TCGA, CPTAC, NLST and HTAN. IDC does not perform de-identification of images but accepts data de-identified by TCIA or other Data Coordinating Centers that are approved by NCI Security.
IDC provides access to data standardized using the Digital Imaging and Communications in Medicine (DICOM) standard. IDC collaborates with the projects that generate the data to harmonize alternative formats into a DICOM representation. Its content includes not only images but also image annotations and analysis results, and it is linked through common identifiers to other types of cancer data, such as the proteomics and genomics datasets in the Cancer Research Data Commons (CRDC). Access to the data is supported through standard interfaces.
Given IDC's role as an imaging data science platform, a major focus is on establishing best practices for imaging research. In this regard, a key role of IDC is preparing and adapting commonly used image analysis tools to run in cloud environments on the IDC-hosted datasets. Summarized derived data from previously run analyses will be associated with the imaging data in IDC for ease of use by the research community.
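As an illustration of what access through a standard interface can look like, the sketch below builds search URLs in the style of DICOMweb QIDO-RS, the query service defined by the DICOM standard. It is a minimal sketch under stated assumptions: the text above does not name the interfaces IDC supports, the base URL is a placeholder rather than a real IDC endpoint, and the helper names `qido_studies_url` and `qido_series_url` are hypothetical. The code only constructs URLs and makes no network requests.

```python
from urllib.parse import urlencode

# Placeholder base URL (an assumption, not an actual IDC endpoint).
BASE_URL = "https://dicomweb.example.org/v1"


def qido_studies_url(base_url, filters=None, limit=None):
    """Build a QIDO-RS style study search URL.

    `filters` maps DICOM attribute keywords (e.g. PatientID) to values.
    `limit` caps the number of results requested, if given.
    """
    params = dict(filters or {})
    if limit is not None:
        params["limit"] = limit
    url = base_url.rstrip("/") + "/studies"
    query = urlencode(params)
    return url + "?" + query if query else url


def qido_series_url(base_url, study_instance_uid, filters=None):
    """Build a QIDO-RS style URL listing the series within one study."""
    url = base_url.rstrip("/") + "/studies/" + study_instance_uid + "/series"
    query = urlencode(dict(filters or {}))
    return url + "?" + query if query else url


if __name__ == "__main__":
    # Hypothetical identifiers, used only to show the URL shape.
    print(qido_studies_url(BASE_URL, {"PatientID": "EXAMPLE-0001"}, limit=5))
    print(qido_series_url(BASE_URL, "1.2.3", {"Modality": "CT"}))
```

Because the query uses DICOM attribute keywords rather than a repository-specific schema, the same request shape would in principle work against any server that implements the standard.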
Data Types
IDC contains various types of images and image-derived data harmonized using the DICOM standard. As of October 2022, IDC contains the following types of data:
- Clinical and preclinical imaging
- Radiological images (e.g., CT, MRI, PET)
- Digital pathology images
- Multispectral microscopy images
- Image annotations (e.g., planar and volumetric, regions of interest)
- Parametric maps derived from images (e.g., perfusion and diffusion maps)
- Measurements derived from the images (e.g., radiomics features for the annotated regions of interest)
- Expert assessments of the image findings (e.g., qualitative characterizations of lesion appearance)
Datasets
IDC includes imaging data from the following projects:
- The Cancer Genome Atlas (TCGA)
- The Cancer Imaging Archive (TCIA)
- Clinical Proteomics Tumor Analysis Consortium (CPTAC)
- Human Tumor Atlas Network (HTAN)
- Lung Imaging Database Consortium (LIDC)
- NCI Quantitative Imaging Network (QIN)
- National Library of Medicine Visible Human Project (VHP)
- National Lung Screening Trial (NLST)
Anatomical Sites
Data from the following organ sites, all covered by TCIA, will also populate IDC:
- Bladder
- Bone Marrow
- Brain
- Breast
- Colon
- Head and Neck
- Kidney
- Liver
- Lung
- Pancreas
- Prostate
- Rectum
- Skin
- Uterus