The Cancer Genome Atlas (TCGA)
Technical Summary
The Cancer Genome Atlas (TCGA) is a landmark cancer genomics program that molecularly characterized over 20,000 primary cancer and matched normal samples spanning 33 cancer types.
Key Capabilities
- Multi-Omics Profiling: Across its cases, it provides whole-exome sequencing, RNA sequencing (transcriptomics), DNA methylation arrays, microRNA sequencing, copy-number data, and clinical metadata, although not every sample was profiled on every platform.
- Matched Normal Controls: For most patients, it profiled both the tumor and a matched normal sample (blood or adjacent healthy tissue) from the same individual, allowing researchers to distinguish somatic mutations acquired by the tumor from inherited germline variants.
- Digital Pathology: Includes tens of thousands of high-resolution whole-slide tissue images (WSI), enabling the correlation of molecular signatures with physical tissue morphology.
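The matched-normal idea can be illustrated with a minimal sketch. It assumes TCGA-style barcodes, in which the fourth hyphen-separated field starts with a two-digit sample-type code (01-09 for tumor, 10-19 for normal). The barcodes, the variant strings, and the function names below are hypothetical, and the plain set difference stands in for real somatic calling, which relies on dedicated variant callers that model sequencing error and tumor purity.

```python
from collections import defaultdict


def parse_barcode(barcode):
    """Split a TCGA-style barcode into (patient_id, sample_type_code)."""
    parts = barcode.split("-")
    patient_id = "-".join(parts[:3])   # e.g. "TCGA-XX-0001"
    sample_code = int(parts[3][:2])    # e.g. "01A" -> 1
    return patient_id, sample_code


def somatic_candidates(variants_by_barcode):
    """Per patient, return variants seen in tumor but absent from the matched normal.

    Patients without both a tumor and a normal sample are skipped.
    """
    tumor = defaultdict(set)
    normal = defaultdict(set)
    for barcode, variants in variants_by_barcode.items():
        patient, code = parse_barcode(barcode)
        if 1 <= code <= 9:        # tumor sample types
            tumor[patient] |= set(variants)
        elif 10 <= code <= 19:    # normal sample types
            normal[patient] |= set(variants)
    return {p: tumor[p] - normal[p] for p in tumor if p in normal}


# Toy input: hypothetical barcodes mapped to made-up variant identifiers.
calls = {
    "TCGA-XX-0001-01A": {"chr17:200:C>T", "chr1:100:A>G"},
    "TCGA-XX-0001-10A": {"chr1:100:A>G"},       # germline variant, also in normal
    "TCGA-XX-0002-01A": {"chr12:300:C>A"},      # no matched normal available
}
result = somatic_candidates(calls)
print(result)
```

In this toy input, the variant shared by tumor and normal is treated as germline and dropped, and the patient without a matched normal is excluded rather than guessed at.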
Usage in Healthcare
TCGA is a foundational dataset for computational oncology. Researchers use it to identify candidate cancer biomarkers (e.g., testing whether a specific mutation profile is associated with survival in breast cancer) and to train machine learning models that classify cancer subtypes or predict biomarkers of immunotherapy response, such as microsatellite instability, directly from routine pathology slides.
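As an illustration of the biomarker use case, the sketch below computes a Kaplan-Meier survival curve for two patient groups split by mutation status. It is a minimal, standard-library version written for clarity. The follow-up times, event flags, and group labels are invented toy values rather than TCGA data, and a real analysis would add a significance test (such as a log-rank test) and adjust for clinical covariates.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed death.

    times:  follow-up durations (any consistent unit)
    events: 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve


# Toy cohorts (hypothetical values): follow-up in months, 1 = death observed.
mutated_curve = kaplan_meier([5, 8, 12, 20], [1, 1, 0, 1])
wildtype_curve = kaplan_meier([10, 25, 30, 40], [0, 1, 0, 0])

for label, curve in [("mutated", mutated_curve), ("wild-type", wildtype_curve)]:
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Censored patients lower the number at risk without lowering the survival estimate, which is why the estimator suits cohorts where follow-up ends at different times for different patients.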