Dataset

MIMIC-CXR

Imaging / Text · Radiology · PhysioNet Credentialed Data Use Agreement · De-identified (HIPAA Safe Harbor)

Technical Summary

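MIMIC-CXR (Medical Information Mart for Intensive Care, Chest X-Ray) is a large dataset of chest radiographs paired with free-text radiology reports. It contains 377,110 images from 227,835 radiographic studies performed at Beth Israel Deaconess Medical Center, and it is freely available to credentialed researchers.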

Key Capabilities

  • Multimodal Learning: Contains both high-resolution DICOM images (frontal and lateral views) and the corresponding unstructured radiology reports dictated by attending radiologists.
  • Disease Classification: Labels extracted automatically from the reports with natural language processing tools (e.g., the CheXpert labeler) support supervised learning of 14 common radiographic observations (e.g., Pneumonia, Cardiomegaly, Pleural Effusion). Because these labels are machine-derived rather than hand-annotated, they carry some noise.
  • Report Generation: Enables the training of vision-language models capable of automatically drafting preliminary radiology reports from raw X-ray images.
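
Report-derived labels of this kind are commonly encoded per observation as 1.0 (positive), 0.0 (negative), -1.0 (uncertain), or left blank (not mentioned), so a training pipeline has to decide how to treat the uncertain and blank cells. The sketch below shows one way to do that in Python. The CSV excerpt, the column names, and the helper names `to_target` and `load_targets` are illustrative assumptions rather than part of the dataset's tooling, and the rows are synthetic.

```python
import csv
import io

# Synthetic excerpt in the style of a report-labeler output table.
# Column names and values are assumptions for illustration, not real data.
SAMPLE_CSV = """subject_id,study_id,Cardiomegaly,Pleural Effusion,Pneumonia
1,101,1.0,0.0,
1,102,-1.0,1.0,0.0
2,201,,,-1.0
"""

OBSERVATIONS = ["Cardiomegaly", "Pleural Effusion", "Pneumonia"]


def to_target(raw, uncertain_as=None):
    """Map one raw label cell to a training target.

    '1.0' -> 1, '0.0' -> 0, blank (not mentioned) -> None.
    '-1.0' (uncertain) -> `uncertain_as`; None means the cell should be
    masked out of the loss, 1 or 0 treats it as positive or negative.
    """
    raw = raw.strip()
    if raw == "":
        return None
    value = float(raw)
    if value == 1.0:
        return 1
    if value == 0.0:
        return 0
    if value == -1.0:
        return uncertain_as
    raise ValueError(f"unexpected label value: {raw!r}")


def load_targets(csv_text, observations, uncertain_as=None):
    """Return {study_id: [target per observation]} from CSV text."""
    targets = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        targets[int(row["study_id"])] = [
            to_target(row[name], uncertain_as) for name in observations
        ]
    return targets


# Policy 1: mask uncertain cells (None) so they do not contribute to the loss.
masked = load_targets(SAMPLE_CSV, OBSERVATIONS)

# Policy 2: treat uncertain mentions as positive.
u_ones = load_targets(SAMPLE_CSV, OBSERVATIONS, uncertain_as=1)
```

Which policy works better depends on the observation and the downstream task, so it is usually worth treating `uncertain_as` as a hyperparameter rather than fixing it up front.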

Usage in Healthcare

MIMIC-CXR is one of the most widely used benchmark datasets for automated chest X-ray interpretation. Researchers use it to develop AI diagnostic aids that triage abnormal scans, with the aim of easing radiologist workload and speeding up patient care in emergency and intensive care settings. Access is granted through PhysioNet and requires completing a human subjects research training course and signing a data use agreement.