Decoding Disease Mechanisms: Unraveling Large-Scale Biological Data for Novel Therapeutic Target Discovery

Author: Yan Chu (2024)

Advisory Committee: W. Jim Zheng, PhD

PhD thesis: McWilliams School of Biomedical Informatics at UTHealth Houston.

ABSTRACT

This dissertation uses large biological datasets and data-driven methods to improve our understanding of complex human diseases through three interconnected bioinformatics studies. Together, the studies span four biological levels (genes, proteins, cells, and tissues) to uncover disease mechanisms and accelerate the discovery of potential therapeutic targets.

The first study addresses gene-level analysis in the context of bladder cancer. It integrates gene expression data from lab models with clinical survival data to identify new therapeutic targets. This research pinpoints the IL6/JAK/STAT3 signaling pathway as a promising target for treating SMARCB1-deficient bladder cancer, demonstrating how combining different data types can reveal new therapeutic targets in cancer.

Moving to the protein level, the second study focuses on Alzheimer's disease and explores how alternative splicing, a process by which a single gene can give rise to multiple protein forms, affects protein structure and disease development. Employing the cutting-edge AlphaFold 2 technology alongside robust statistical analyses, this study identifies specific changes in the Tau protein that are potentially linked to Alzheimer's disease. The findings suggest new paths for understanding and treating the disease, emphasizing the importance of structural protein studies in biomedical research.

The third study broadens the scope to the cellular and tissue levels, using advanced statistical and explainable artificial intelligence (AI) models to analyze spatial transcriptomics and single-cell RNA sequencing data. At the cellular level, explainable AI helps decode complex gene-gene interactions related to Alzheimer's disease from intricate neural network analyses, paving the way for novel therapeutic approaches. Extending these techniques to the tissue level, the research examines how aging affects gene expression in acute pancreatitis. By integrating spatial and single-cell data, it identifies key gene expression changes linked to aging, enhancing our understanding of age-related changes in disease processes.

In conclusion, this dissertation applies sophisticated bioinformatics methodologies across multiple studies to provide new insights into the mechanisms of human diseases at the gene, protein, cell, and tissue levels. It shows how diverse data types can be combined for cancer therapy development, employs high-performance computing to reveal the impact of alternative splicing on protein structure and disease, and uses explainable AI to unravel complex gene interactions. Collectively, these efforts advance the field of bioinformatics and open new avenues for discovering therapeutic targets and strategies, illustrating the potential of integrating advanced computational tools into biomedical research.