Author: Anupama E Gururaj, PhD
Primary Advisor: Hua Xu, PhD
Committee Members: Jorge R. Herzkovic, PhD; John C. Frenzel, MD, MS
Master's thesis, The University of Texas Health Science Center School of Health Information Sciences at Houston.
Objective: This study was a feasibility analysis of using medication information extracted from structured data and unstructured narrative text to generate a complete medication list for patients. A secondary objective was to evaluate, longitudinally, the concordance of medications between structured data and narrative text in Electronic Health Records.
Materials and Methods: We used medication data from structured fields and clinical notes in the UTHealth Clinical Data Warehouse. We extracted medication information from the structured data using SQL queries, and from the clinical notes using Medex, a clinical NLP system specialized for extracting medication information from narrative text. We evaluated the variability between the datasets using statistical measures. We also manually validated the results of our analysis for 100 randomly chosen patients. We report precision, recall, and F1-measure for Medex performance (on 100 documents) and the extent of concordance between the algorithmic results and manual review for all other analyses.
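For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of how precision, recall, and F1-measure can be computed for one document. It is not the evaluation code used in this study: the function name, the sample drug lists, and the assumption that extracted and manually reviewed medications are compared by exact match on normalized drug names are all hypothetical.

```python
def precision_recall_f1(extracted, gold):
    """Compare extracted medication names against a manually reviewed list.

    Assumption (not from the thesis): a medication counts as correct only if
    its normalized name matches a name in the manual review exactly.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)

    # Precision: share of extracted medications that the reviewer confirmed.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of reviewer-confirmed medications that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example data for a single clinical note.
manual_review = ["metformin", "lisinopril", "atorvastatin", "aspirin"]
nlp_output = ["metformin", "lisinopril", "warfarin"]

p, r, f = precision_recall_f1(nlp_output, manual_review)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this made-up example, two of the three extracted medications are correct (precision 2/3) and two of the four reviewed medications are found (recall 1/2), giving an F1-measure of 4/7.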
Results & Conclusion: We analyzed longitudinal medication information for each patient and found that the medication information extracted from structured and unstructured data was reasonably well conserved from one encounter to the next. The correlation between the structured and unstructured data was low for the “Chronic” medication datasets, indicating that these two datasets could potentially serve as complementary sources to build comprehensive medication information for patients. The Medex performance in extracting medication information from the UTHealth clinical notes was sub-optimal, and we believe that fine-tuning may increase the performance levels.