Author: Safa Fathiamini, MD MS
Primary Advisor: Dean F. Sittig, PhD
Committee Member: Trevor Cohen, MD PhD
Master's thesis, The University of Texas School of Biomedical Informatics at Houston.
Medication-problem pairing is often required in clinical record systems, and distributional semantics provides a way to extract meaningful information and relationships from a corpus of data. Various medical knowledge sources, including the UpToDate clinical knowledge database, medical textbooks, Medline abstracts, corpus, and clinical notes from an EMR system, were chosen as the corpora, and a semantic space was built from each. An expert-reviewed list of medication-problem pairs was used as the gold standard. For each medication, nearest neighbors in the semantic space were obtained, and the results were filtered to the problems that appear in the gold standard. Various metrics were then calculated to compare the corpora and the techniques. Overall, Medline appeared to be more suitable than the other corpora for medication-problem relationship extraction.
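The retrieval and evaluation steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-dimensional toy vectors, the term names, the use of cosine similarity, and precision at k as the metric are not taken from the thesis, which does not specify these details here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_problems(medication, vectors, problem_terms, k):
    """Rank candidate terms by similarity to the medication,
    keeping only terms that are problems in the gold standard."""
    query = vectors[medication]
    scored = [
        (cosine(query, vectors[p]), p)
        for p in problem_terms
        if p in vectors and p != medication
    ]
    # Highest similarity first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [p for _, p in scored[:k]]

def precision_at_k(retrieved, relevant):
    """Fraction of retrieved problems that are gold-standard matches."""
    if not retrieved:
        return 0.0
    return sum(1 for p in retrieved if p in relevant) / len(retrieved)

# Hypothetical semantic space: toy vectors, not derived from any corpus.
vectors = {
    "metformin": [0.9, 0.1, 0.0],
    "lisinopril": [0.1, 0.9, 0.1],
    "albuterol": [0.1, 0.0, 0.9],
    "diabetes": [0.8, 0.2, 0.1],
    "hypertension": [0.2, 0.8, 0.0],
    "asthma": [0.0, 0.1, 0.9],
}

# Hypothetical gold standard of medication-problem pairs.
gold = {
    "metformin": {"diabetes"},
    "lisinopril": {"hypertension"},
    "albuterol": {"asthma"},
}

# The filter vocabulary: every problem that occurs in the gold standard.
problem_terms = sorted(set().union(*gold.values()))

for medication, relevant in gold.items():
    top = nearest_problems(medication, vectors, problem_terms, k=2)
    print(medication, top, precision_at_k(top, relevant))
```

Running the same loop over semantic spaces built from different corpora, and averaging the metric across medications, would give one way to compare the corpora on a common footing.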