Natural Language Processing

Gold Standard Development

To know whether our process for turning clinical text into concepts works correctly, we need some clinical text already converted to concepts by a known-good process to compare against. This set of documents is called the "gold standard."

Historically, concept-level information for the medical field has been derived from patient billing data or annotated by hand by professional abstractors. The inherent problem in using billing data is that billing codes don’t necessarily reflect what the patient actually has. For this reason, BIG’s effort focuses on using free text clinical notes from which UMLS concepts describing procedures, medications, and disorders are extracted.

Beginning with a randomly selected group of 250 patients from the CDW, a dedicated Graduate Research Assistant with a medical background reviews the clinical notes (an estimated 1,000 notes) and annotates the content with concepts from the Unified Medical Language System (UMLS). A second review of a subset of records is performed by a clinically trained investigator to verify consistent coding before the results are included in the Gold Standard.

Concept Extraction

The Concept Extraction initiative applies NLP tools to extract medical concepts from clinical text and then filters the extracted concepts with different algorithms. We test the correctness of our approach by comparing the results with the Gold Standard, measured as the precision and recall of the resulting concept list.
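As a concrete illustration, precision and recall over concept lists can be computed by treating each document's annotations as a set of UMLS concept identifiers (the CUIs below are arbitrary examples, not actual Gold Standard data):

```python
# Sketch: precision/recall of an extracted concept list against the gold
# standard, treating each document's annotations as a set of UMLS CUIs.

def precision_recall(extracted, gold):
    """Compute precision and recall for two collections of concept IDs."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

extracted = ["C0011849", "C0020538", "C0027051", "C0004096"]  # system output
gold = ["C0011849", "C0020538", "C0027051"]                   # gold standard

p, r = precision_recall(extracted, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=1.00
```

Here one extracted concept is absent from the gold standard (hurting precision), while every gold-standard concept was found (perfect recall).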

The main concept extractor we are using at present is MetaMap, a program developed at the National Library of Medicine (NLM). MetaMap uses a knowledge-intensive NLP approach to assign concepts from the UMLS Metathesaurus to its text input. It was designed to operate on biomedical literature, but we are applying it to clinical notes. We employ several of its features, including:

  1. Negation detection via the NegEx algorithm;
  2. Restriction of the metathesaurus to specific component source vocabularies (e.g. SNOMED-CT);
  3. Ignoring word order when analyzing text to assign concepts.
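To illustrate the first of these features: at its core, NegEx flags a concept mention as negated when a negation trigger phrase appears shortly before it in the sentence. The following is only a minimal sketch of that idea with an invented trigger list; the real NegEx algorithm uses curated trigger phrases, post-negation triggers, and scope-termination rules.

```python
import re

# Minimal NegEx-style sketch: a concept mention is flagged as negated if a
# trigger phrase appears within a small window of words before it. The
# trigger list here is a tiny illustrative subset, not NegEx's real list.
TRIGGERS = ["no", "denies", "without", "no evidence of", "negative for"]

def is_negated(sentence, concept, window=6):
    """Return True if a negation trigger occurs within `window` words
    before the concept mention in the sentence."""
    words = sentence.lower().split()
    concept_words = concept.lower().split()
    for i in range(len(words) - len(concept_words) + 1):
        if words[i:i + len(concept_words)] == concept_words:
            preceding = " ".join(words[max(0, i - window):i])
            if any(re.search(r"\b" + re.escape(t) + r"\b", preceding)
                   for t in TRIGGERS):
                return True
    return False

print(is_negated("patient denies chest pain", "chest pain"))   # True
print(is_negated("patient reports chest pain", "chest pain"))  # False
```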

MetaMap is prone to looping endlessly on large chunks of text, especially when ignoring word order, so we feed it sentences rather than whole notes. To parse our clinical notes into sentences, we use the sentence parser developed for clinical text by the Mayo Clinic as part of the clinical Text Analysis and Knowledge Extraction System (cTAKES). Our entire text processing pipeline runs on the Unstructured Information Management Architecture (UIMA) as implemented by the Apache UIMA project. We are also evaluating cTAKES' own concept extractor and MedLEE as MetaMap replacements.
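The sentence-at-a-time strategy can be sketched as follows. The regex splitter is a crude stand-in for cTAKES' trained sentence detector, and `extract_concepts` is a hypothetical placeholder for a per-sentence MetaMap call:

```python
import re

# Sketch of the sentence-level pipeline: rather than passing a whole note
# to the concept extractor (which can hang on large inputs), split the
# note into sentences and extract from each one, aggregating the results.

def split_sentences(note):
    """Naive splitter: break on ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]

def extract_concepts(sentence):
    """Placeholder for a MetaMap call; returns dummy 'concepts' (words)."""
    return set(sentence.lower().strip(".!?").split())

def process_note(note):
    concepts = set()
    for sentence in split_sentences(note):
        concepts |= extract_concepts(sentence)
    return concepts

note = "Patient has diabetes. Denies chest pain."
print(process_note(note))
```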

Concept extractors like MetaMap are not built to distinguish between a concept that is merely mentioned in a piece of text and one that is truly relevant, and cannot do so on their own. We therefore seek to improve the concept extraction process by ranking the extracted concepts and filtering out those that are unlikely to be relevant.

The real informatics begins when we rank and filter the concepts, aiming to boost precision with minimal loss of recall. So far, we have employed two approaches to this problem: TF*IDF and MedRank, a graph-based ranking algorithm. After ranking the concepts, we eliminate some fraction of the lowest-ranked concepts from the list before computing precision and recall.
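A rough sketch of the TF*IDF approach, with invented concept lists and a simple smoothed IDF variant (the exact weighting we use may differ):

```python
import math
from collections import Counter

# Sketch of TF*IDF concept ranking followed by bottom-fraction filtering:
# score each concept in a target note by term frequency times inverse
# document frequency over the note collection, then drop the lowest-ranked
# fraction. Concept IDs below are invented for illustration.

def tfidf_rank(target, collection):
    """Rank concepts in `target` (a list of concept IDs, with repeats)
    by TF*IDF against `collection` (a list of concept lists)."""
    tf = Counter(target)
    n_docs = len(collection)
    scores = {}
    for concept, count in tf.items():
        doc_freq = sum(1 for doc in collection if concept in doc)
        # Add-one smoothing in the denominator avoids division by zero.
        idf = math.log(n_docs / (1 + doc_freq))
        scores[concept] = (count / len(target)) * idf
    return sorted(scores, key=scores.get, reverse=True)

def filter_bottom(ranked, fraction):
    """Keep all but the lowest-ranked `fraction` of concepts."""
    keep = max(1, int(len(ranked) * (1 - fraction)))
    return ranked[:keep]

collection = [["C1", "C2"], ["C2", "C3"], ["C2", "C4"]]
ranked = tfidf_rank(["C1", "C1", "C2"], collection)
print(ranked)                     # ['C1', 'C2'] -- C2 is ubiquitous, so it ranks low
print(filter_bottom(ranked, 0.5))  # ['C1']
```

A concept appearing in every note (like C2 above) gets a low IDF and sinks to the bottom of the ranking, where the filtering step removes it.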