Spark NLP for Healthcare
Technical Summary
Spark NLP for Healthcare is an NLP library for clinical text that is widely used in the healthcare industry and designed to scale across Apache Spark clusters. The core Spark NLP library is open source, but the healthcare-specific models are commercially licensed.
Key Capabilities
- Massive Scalability: Built natively on Apache Spark, so it can process millions of clinical documents in parallel across distributed computing clusters.
- Pre-trained Clinical Models: Includes hundreds of pre-trained models for tasks like clinical entity recognition, assertion status detection (e.g., distinguishing between a patient having a symptom vs. a family member having it), and relation extraction.
- De-identification: Offers out-of-the-box text de-identification tools intended to support HIPAA and GDPR requirements, so that clinical text can be shared safely.
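To make the de-identification capability concrete, here is a minimal conceptual sketch in plain Python. It is not the Spark NLP for Healthcare API and does not reflect how the library works internally. The patterns, the `mask_phi` helper, and the sample note are all hypothetical, and a real system relies on trained models with far broader coverage than three regular expressions.

```python
import re

# Hypothetical patterns, for illustration only. They cover three simple
# identifier formats and would miss most real-world protected health information.
PHI_PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b"),
}


def mask_phi(text: str) -> str:
    """Replace each matched span with a placeholder such as <DATE>."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


# Made-up sample note, not real patient data.
note = "Seen on 03/14/2023, MRN: 448812. Call 555-867-5309 with results."
masked = mask_phi(note)
print(masked)
```

The idea is the same at any scale: identifying spans are detected and replaced with placeholders (or surrogate values) before the text leaves a protected environment.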
Model Card Details
Architecture
N/A
Intended Use Cases
Production-grade natural language processing for extracting structured data from unstructured clinical notes at scale.
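As a rough illustration of what "structured data from unstructured clinical notes" can look like, the sketch below turns a short note into records that pair an entity with an assertion status (present, absent, or attributed to a family member). It is a toy, rule-based example in plain Python, not the library's API or its models. The lexicon, cue lists, `Finding` record, and `extract_findings` function are hypothetical names introduced only for this example.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical toy lexicon and cue patterns. Real clinical models are
# trained on annotated data rather than hand-written lists like these.
SYMPTOMS = ["chest pain", "fever", "diabetes"]
FAMILY_CUES = re.compile(r"\b(mother|father|family history)\b")
NEGATION_CUES = re.compile(r"\b(denies|no|without)\b")


@dataclass
class Finding:
    entity: str
    assertion: str  # "present", "absent", or "family"


def extract_findings(note: str) -> List[Finding]:
    """Return one Finding per known symptom mentioned in each sentence."""
    findings = []
    # Naive sentence split on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", note.lower()):
        for symptom in SYMPTOMS:
            if symptom not in sentence:
                continue
            if FAMILY_CUES.search(sentence):
                assertion = "family"
            elif NEGATION_CUES.search(sentence):
                assertion = "absent"
            else:
                assertion = "present"
            findings.append(Finding(symptom, assertion))
    return findings


# Made-up sample note, not real patient data.
note = "Patient reports chest pain. Denies fever. Mother has diabetes."
findings = extract_findings(note)
for finding in findings:
    print(finding.entity, "->", finding.assertion)
```

Each record is a row that can be stored, queried, or aggregated, which is what makes the free-text note usable downstream.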