Clinical Language Annotation, Modeling and Processing Toolkit (CLAMP)

Introduction

CLAMP is an NLP toolkit consisting of an Eclipse-based graphical user interface (GUI) and a high-performance language processing framework.

NLP pipelines: CLAMP builds on a set of high-performance NLP components that have been proven in several clinical NLP challenges, such as i2b2, ShARe/CLEF, and SemEval. A pipeline can be created and customized visually by chaining CLAMP components in processing order. When a component is added, CLAMP checks for errors and directs the user to the logical order required for a working pipeline. The components are supported by knowledge resources consisting of medical abbreviations, dictionaries, section headers, and a corpus of 400 annotated clinical notes derived from mt-samples.
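The ordering check described above can be illustrated with a minimal sketch. This is not CLAMP's API: the `Component` class, the `check_order` function, and the component and annotation names are all hypothetical, chosen only to show how a pipeline might verify that each component's inputs are produced by an earlier one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical pipeline component (not a CLAMP class)."""
    name: str
    requires: frozenset = frozenset()  # annotation types this component reads
    provides: frozenset = frozenset()  # annotation types this component writes


def check_order(pipeline):
    """Return a list of error messages; an empty list means the order is valid."""
    available = set()
    errors = []
    for comp in pipeline:
        missing = comp.requires - available
        if missing:
            errors.append(
                f"{comp.name} needs {sorted(missing)} from an earlier component"
            )
        available |= comp.provides
    return errors


# Illustrative components; the names are assumptions, not CLAMP identifiers.
SENTENCE = Component("sentence_detector", provides=frozenset({"sentence"}))
TOKEN = Component(
    "tokenizer",
    requires=frozenset({"sentence"}),
    provides=frozenset({"token"}),
)
NER = Component(
    "named_entity_recognizer",
    requires=frozenset({"sentence", "token"}),
    provides=frozenset({"entity"}),
)

if __name__ == "__main__":
    print(check_order([SENTENCE, TOKEN, NER]))  # valid order
    print(check_order([NER, SENTENCE, TOKEN]))  # recognizer placed too early
```

Declaring what each component requires and provides keeps the check to a single pass over the chain, which is one plausible way for a GUI to flag a misplaced component as soon as it is added.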

Machine learning and hybrid approaches: For some tasks the framework provides alternative components, using either rule-based methods or machine learning methods such as support vector machines, conditional random fields, and Brown clustering. These components can be customized for a specific NLP task by retraining them on an annotated corpus or by visually editing the rule sets within the GUI. The GUI also provides built-in functionality to test a model, using annotated test corpora or n-fold cross-validation.
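For readers unfamiliar with n-fold cross-validation, the following is a minimal sketch of how an annotated corpus might be partitioned for it. It is a generic illustration, not CLAMP's implementation; the function name `n_fold_splits` and its parameters are hypothetical.

```python
def n_fold_splits(n_items, n_folds):
    """Yield (train_indices, test_indices) pairs for n-fold cross-validation.

    Each document index appears in exactly one test fold; the remaining
    folds form the training set for that round.
    """
    if not 2 <= n_folds <= n_items:
        raise ValueError("n_folds must be between 2 and n_items")
    indices = list(range(n_items))
    # Assign indices to folds round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in range(n_folds):
        test = folds[held_out]
        train = sorted(
            i for k, fold in enumerate(folds) if k != held_out for i in fold
        )
        yield train, test


if __name__ == "__main__":
    # Ten annotated documents, five folds: each round trains on 8 and tests on 2.
    for train, test in n_fold_splits(10, 5):
        print(len(train), len(test))
```

In practice the document order would usually be shuffled before splitting, and the model's scores would be averaged over the rounds.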

Corpus management and annotation tool: The user interface includes the tools needed to maintain clinical text corpora. It hosts an improved version of the BRAT annotation tool for annotating clinical text.

Download

Follow the link "License and download CLAMP" to review the license and download CLAMP.

Although CLAMP is not currently open source, the source code can be obtained under an appropriate license.