Dataset

ClinVar

Tags: Genomics, Medical Genetics, Public Domain, De-identified

Technical Summary

ClinVar is a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. It is hosted by the National Center for Biotechnology Information (NCBI).

Key Capabilities

  • Variant Classifications: ClinVar aggregates submissions from clinical testing laboratories, research labs, and expert panels regarding the pathogenicity (e.g., Benign, Pathogenic, Variant of Uncertain Significance) of genetic variants.
  • Evidence-Based: Submissions can include the clinical context and the supporting evidence or rationale for the classification, which lets downstream analyses weigh each assertion rather than take it at face value.
  • Standardized Formats: The data is available in multiple structured formats (VCF, XML, TSV), making it straightforward to use in standard bioinformatics pipelines.
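
As a rough illustration of the VCF format mentioned above, the sketch below reads ClinVar-style VCF lines using only the Python standard library and tallies classifications. The INFO keys `CLNSIG`, `CLNREVSTAT`, and `GENEINFO` are the ones used in ClinVar's VCF release; the records, gene names, and helper names (`parse_info`, `iter_records`) are made up for this example and do not describe real variants.

```python
from collections import Counter

# Illustrative records in ClinVar VCF style. All values are invented for this sketch.
SAMPLE_VCF = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
1\t1000\t11111\tG\tA\t.\t.\tALLELEID=1;CLNSIG=Pathogenic;CLNREVSTAT=reviewed_by_expert_panel;GENEINFO=GENE1:1
1\t2000\t22222\tC\tT\t.\t.\tALLELEID=2;CLNSIG=Benign;CLNREVSTAT=criteria_provided,_single_submitter;GENEINFO=GENE2:2
2\t3000\t33333\tA\tG\t.\t.\tALLELEID=3;CLNSIG=Uncertain_significance;CLNREVSTAT=criteria_provided,_single_submitter;GENEINFO=GENE3:3
"""


def parse_info(info_field):
    """Turn a VCF INFO string (key=value pairs separated by ';') into a dict."""
    info = {}
    for item in info_field.split(";"):
        key, sep, value = item.partition("=")
        # Flag-style entries have no '=' and are stored as True.
        info[key] = value if sep else True
    return info


def iter_records(vcf_text):
    """Yield one dict per data line, skipping header lines that start with '#'."""
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        info = parse_info(fields[7])
        yield {
            "chrom": fields[0],
            "pos": int(fields[1]),
            "variation_id": fields[2],
            "ref": fields[3],
            "alt": fields[4],
            "clnsig": info.get("CLNSIG"),
            "revstat": info.get("CLNREVSTAT"),
            "gene": info.get("GENEINFO"),
        }


records = list(iter_records(SAMPLE_VCF))
counts = Counter(rec["clnsig"] for rec in records)
for classification, n in counts.items():
    print(classification, n)
```

For full ClinVar releases, a dedicated VCF library is likely a better choice than hand-rolled parsing; the point here is only to show where the classification and review status sit in each record.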

Usage in Healthcare

ClinVar is a widely used reference dataset in medical genetics. Many clinical variant interpretation pipelines and genomics AI models use ClinVar data for training or for benchmarking their variant effect predictions. It is an important resource for identifying clinically actionable variants in both rare disease and oncology.
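
To make the benchmarking use case concrete, here is a minimal sketch of scoring a variant effect predictor against ClinVar-style labels. Everything in it is an assumption for illustration: the scores come from a hypothetical predictor, the label/score pairs are invented, and the 0.5 threshold is arbitrary. Variants of uncertain significance are excluded because they carry no usable truth label.

```python
# ClinVar-style labels paired with scores from a hypothetical predictor (invented values).
labelled = [
    ("Pathogenic", 0.91),
    ("Likely_pathogenic", 0.72),
    ("Benign", 0.10),
    ("Likely_benign", 0.55),
    ("Uncertain_significance", 0.48),
    ("Pathogenic", 0.35),
]

POSITIVE = {"Pathogenic", "Likely_pathogenic"}
NEGATIVE = {"Benign", "Likely_benign"}


def confusion(pairs, threshold=0.5):
    """Return (tp, fp, tn, fn), treating score >= threshold as a pathogenic call."""
    tp = fp = tn = fn = 0
    for label, score in pairs:
        if label in POSITIVE:
            truth = True
        elif label in NEGATIVE:
            truth = False
        else:
            continue  # skip labels with no clear truth value, such as uncertain significance
        predicted = score >= threshold
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


tp, fp, tn, fn = confusion(labelled)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

In practice a benchmark would also need to filter by review status and guard against overlap between training and evaluation variants, neither of which this sketch attempts.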
