Research Support


  • Biomedical Terminology Quality Assurance for Enhancing Clinical Queries over Electronic Health Records
  • NIH/NLM, R01LM013335 (PI: Cui), 08/2020 - 07/2022, Role: PI
  • An informatics framework for SUDEP Risk Marker Identification and Risk Assessment
  • NIH/NINDS, R01NS116287 (PI: Cui), 05/2020 - 04/2025, Role: PI
  • Methods for Auditing and Enhancing Completeness of Ontologies
  • NSF/IIS, 1931134 (PI: Cui), 09/2018 - 08/2021, Role: Sole PI
  • An Ontology-driven Faceted Query Engine for the Kentucky Cancer Registry
  • NIH/NCI, R21CA231904 (MPI: Cui & Zhang), 06/2018 - 07/2021, Role: PI
  • Center for SUDEP Research (CSR) - Informatics & Data Analytics Core (IDAC)
  • NIH/NINDS, U01NS090408 (PI: Zhang), 10/2014 - 07/2021, Role: Co-I


  • A Scalable Framework for Debugging Large Biological Ontologies
  • NSF/IIS, CRII, 1657306 (PI: Cui), 03/2017 - 02/2019, Role: Sole PI
  • National Sleep Research Resource (NSRR)
  • NIH/NHLBI, R24HL114473 (MPI: Redline & Zhang), 09/2013 - 06/2017, Role: Co-I
  • Amazon Web Services (AWS) Cloud Credits for Research
  • Amazon Inc., 06/2016 - 06/2017, Role: PI
  • Kentucky Center for Clinical and Translational Science
  • NIH/NCATS, UL1TR001998 (PI: Kern), 08/2016 - 05/2017, Role: Co-I
Selected Research Projects
  • SUDEP Risk Marker Identification and Risk Assessment using Artificial Intelligence

The main goal of this project is to develop an informatics approach for automated extraction of SUDEP (Sudden Unexpected Death in Epilepsy) risk markers from multimodal clinical data to enable individualized SUDEP risk assessment. Success of this study will enable systematic SUDEP risk assessment based on known and putative factors and communication of such risk factors to patients with epilepsy. Ultimately, this study can lead to evidence-based SUDEP risk assessment tools that help clinicians and patients manage potentially modifiable risks, leading to overall reduced SUDEP mortality and improved epilepsy patient care.

  • Quality Assurance of Biomedical Ontologies Using Big Data Approaches

Biomedical ontologies have been used in a range of biomedical informatics applications, from bench experiments to patient care at the bedside, as well as in data integration, knowledge discovery, and the management of biomedical big data. Ontologies are often incomplete, under-specified, and non-static for reasons such as the evolving state of knowledge in a domain, the involvement of manual curation work, and the progressive nature of ontological engineering. Thus, Ontology Quality Assurance (OQA) has become an indispensable part of the ontological engineering lifecycle. However, OQA has been challenged by the lack of systematic and scalable methods necessary to keep pace with the evolution and emergence of ontological systems. We have developed scalable big data approaches using MapReduce to systematically audit biomedical ontologies (e.g., detecting relation reversals and non-lattice fragments). With OQA methods implemented as massively parallel algorithms in the MapReduce framework, we achieved speed-ups of several orders of magnitude. This big data approach makes it feasible not only to perform exhaustive structural analysis of large ontological hierarchies, but also to systematically track structural changes between different versions of ontologies.
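To make the idea of non-lattice detection concrete, the following is a minimal single-machine sketch in Python, not the project's actual MapReduce implementation. It assumes a common definition of a non-lattice pair: two concepts whose shared descendants have more than one maximal element. The toy hierarchy `PARENTS` and all function names are hypothetical, and the map and reduce phases are emulated with plain Python functions rather than a distributed framework.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy is-a hierarchy: concept -> set of direct parents.
# Every concept must appear as a key.
PARENTS = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
    "E": {"B", "C"},
}


def ancestors_or_self(node, parents):
    """Return the node together with all of its transitive ancestors."""
    seen = {node}
    stack = [node]
    while stack:
        cur = stack.pop()
        for parent in parents.get(cur, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def map_phase(parents):
    """Map step: for each concept, emit ((a, b), concept) for every pair
    (a, b) drawn from its ancestors-or-self, i.e. the concept is a common
    lower bound of a and b."""
    for node in parents:
        up_set = sorted(ancestors_or_self(node, parents))
        for a, b in combinations(up_set, 2):
            yield (a, b), node


def reduce_phase(emitted, parents):
    """Reduce step: group lower bounds by pair and keep the pairs that
    have more than one maximal lower bound."""
    grouped = defaultdict(set)
    for pair, node in emitted:
        grouped[pair].add(node)

    flagged = {}
    for pair, lower_bounds in grouped.items():
        # A lower bound is maximal if none of its strict ancestors
        # is also a lower bound of the same pair.
        maximal = {
            d for d in lower_bounds
            if not ((ancestors_or_self(d, parents) - {d}) & lower_bounds)
        }
        if len(maximal) > 1:
            flagged[pair] = maximal
    return flagged


def non_lattice_pairs(parents):
    return reduce_phase(map_phase(parents), parents)


if __name__ == "__main__":
    print(non_lattice_pairs(PARENTS))
```

In this toy hierarchy, B and C share two incomparable children (D and E), so the pair (B, C) is flagged. Because each concept's emissions depend only on its own ancestors, the map step could in principle be distributed across workers, which is the property a MapReduce formulation would exploit.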

  • Large-scale Data Integration

Cross-institutional data sharing is crucial for developing and implementing large-scale clinical studies. Identifying patients across multiple institutions is required both for rare disease studies and for other studies that need very large and diverse populations. We have developed an adaptable and flexible cross-cohort query framework for integrating and querying patient data from multiple sources. This framework has been successfully deployed for two ongoing national research resource sharing projects: (1) National Sleep Research Resource (NSRR); and (2) Center for SUDEP Research (CSR).

  • Ontology-guided Health Information Extraction

Electronic information in unstructured or semi-structured form in health and healthcare has been steadily generated for decades. Explosive growth has occurred since the recent adoption of electronic health records (EHRs). Textual health information includes clinical notes recorded in hospitals and health-related information on the web. Such health-related textual data contains an extraordinary amount of underutilized biomedical knowledge. To take advantage of this knowledge and facilitate secondary use of EHRs for patient cohort discovery and consumer health information retrieval, we have developed effective ontology-guided methods for automatic extraction of structured information from patient discharge summaries and online consumer health information.

  • Online Consumer Health Information Retrieval

The Internet provides an important source of consumer health information to patients, caregivers, families, and laypersons. The proliferation of online health information from government agencies, non-profit organizations, for-profit companies, and chat and social networking sites presents myriad challenges for information access and retrieval. We have addressed such challenges by (1) providing a multi-topic assignment approach to organizing consumer health information using Formal Concept Analysis; (2) introducing a novel Conjunctive Exploratory Navigation Interface (CENI) for supporting effective consumer health information retrieval and navigation; and (3) evaluating the effectiveness of CENI through a search-interface comparative evaluation using crowdsourcing with Amazon Mechanical Turk (AMT).
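As a rough illustration of how Formal Concept Analysis can organize documents that carry multiple topics, the sketch below enumerates the formal concepts of a tiny document-topic table. It is a brute-force example for intuition only, not the method used in the project: the `CONTEXT` data and function names are hypothetical, and the enumeration is exponential in the number of documents, so it is only suitable for toy inputs.

```python
from itertools import combinations

# Hypothetical formal context: document -> set of assigned topics.
CONTEXT = {
    "doc1": {"diabetes", "diet"},
    "doc2": {"diabetes", "exercise"},
    "doc3": {"diet", "exercise"},
}


def extent(topics, context):
    """All documents that carry every topic in `topics`."""
    return frozenset(doc for doc, attrs in context.items() if topics <= attrs)


def intent(docs, context):
    """All topics shared by every document in `docs`
    (all topics when `docs` is empty)."""
    shared = set().union(*context.values())
    for doc in docs:
        shared &= context[doc]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate (extent, intent) pairs by closing every subset of documents."""
    concepts = set()
    docs = list(context)
    for size in range(len(docs) + 1):
        for subset in combinations(docs, size):
            topics = intent(subset, context)
            concepts.add((extent(topics, context), topics))
    return concepts


if __name__ == "__main__":
    for ext, itt in sorted(formal_concepts(CONTEXT), key=lambda c: len(c[0])):
        print(sorted(ext), sorted(itt))
```

Each resulting concept pairs a set of topics with exactly the documents that share them, so a conjunctive query such as "diabetes AND diet" corresponds to one node in the concept lattice. This is the kind of structure a conjunctive navigation interface could let users browse.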