Deep Learning Frameworks for Disease Diagnosis and Overall Survival Prediction using Imaging, Genomic, and Clinical Data

Author: Tanjida Kabir, MS (2025)

Primary advisor: Xiaoqian Jiang, PhD

Committee members: Shayan Shams, PhD and Yejin Kim, PhD

PhD thesis, McWilliams School of Biomedical Informatics at UTHealth Houston.


ABSTRACT

Deep learning has transformed multiple domains, yet clinical data remain challenging due to severe class imbalance, missing values, heterogeneous multimodal inputs, and distribution shifts across scanners and institutions. These domain-specific issues limit the direct use of off-the-shelf models and motivate methods that are explicitly tailored to clinical imaging and outcome prediction. In this work, we develop and evaluate three complementary deep learning frameworks: (1) a Bayesian segmentation model for glioblastoma (GBM) on follow-up magnetic resonance imaging (MRI); (2) a domain-specific pipeline for detecting intra-bony defects and furcation involvement on cone-beam computed tomography (CBCT); and (3) a multimodal attention-based model that integrates MRI, genomic, and clinical data to predict overall survival in GBM.

First, we develop a Bayesian deep segmentation framework designed specifically for follow-up MRIs of GBM patients. Unlike preoperative scans, follow-up images exhibit substantial structural changes due to surgery and therapy, making models trained only on preoperative data suboptimal. Using 311 follow-up MRIs, our model leverages transfer learning and Bayesian inference to segment FLAIR-hyperintense regions, enhancing tumor, and non-enhancing necrosis while simultaneously estimating voxel-wise predictive uncertainty. Segmentation accuracy is quantified with Dice similarity coefficient, Jaccard index, and Hausdorff distance. The uncertainty maps allow identification and correction of unreliable predictions, thereby improving both performance and confidence in downstream clinical use.
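As a rough illustration of the quantities named above, the sketch below computes the Dice similarity coefficient and Jaccard index for binary masks, along with one common voxel-wise uncertainty measure: the entropy of the mean class probabilities over several stochastic forward passes. The abstract does not state how the Bayesian posterior is approximated, so the Monte Carlo sampling setup, the function names, and the array layout are all assumptions made for illustration.

```python
import numpy as np

def dice_jaccard(pred, truth):
    """Dice and Jaccard overlap for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    inter = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    union = total - inter
    # Convention (an assumption): two empty masks count as a perfect match.
    dice = 2.0 * inter / total if total > 0 else 1.0
    jaccard = inter / union if union > 0 else 1.0
    return float(dice), float(jaccard)

def predictive_entropy(mc_probs, eps=1e-12):
    """Voxel-wise entropy of the mean prediction.

    mc_probs: array of shape (T, C, ...) holding class probabilities from
    T stochastic forward passes (hypothetical layout, e.g. MC dropout).
    Returns an array of shape (...), higher where the passes disagree
    or the mean prediction is spread across classes.
    """
    mean_probs = np.mean(mc_probs, axis=0)
    return -np.sum(mean_probs * np.log(mean_probs + eps), axis=0)
```

Voxels whose entropy exceeds a chosen threshold could then be flagged for manual review; the threshold itself would need to be tuned on held-out data.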

Second, we develop a CBCT framework for automatic identification of intra-bony defects and furcation involvement, where lesions are small, sparsely annotated, and vastly outnumbered by normal anatomy. We first pretrain a 3D encoder on unlabeled CBCT volumes using patch-level contrastive learning to distinguish disease-like regions from background, learning anatomy-aware features without dense labels. We then fine-tune a segmentation decoder on 299 annotated cases using foreground-biased cropping, imbalance-aware sampling, and boundary-sensitive loss functions that emphasize voxels near defect margins. We evaluate performance with nested k-fold cross-validation using Dice, Jaccard, and Hausdorff metrics. The goal is sharp, reproducible boundaries that support quantitative assessment of defect extent, morphology, and residual bone support, ultimately informing the choice between tooth-preserving therapy and extraction with implant planning.
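Foreground-biased cropping can be sketched as follows: with some probability the training patch is centered on a randomly chosen lesion voxel, and otherwise it is drawn uniformly from the volume. This is a minimal sketch of the general idea, not the thesis implementation; the function name, the `p_foreground` default of 0.8, and the clipping behavior at volume borders are assumptions.

```python
import numpy as np

def foreground_biased_crop(volume, mask, patch_size, p_foreground=0.8, rng=None):
    """Crop a patch that contains a lesion voxel with probability p_foreground.

    volume, mask: arrays of identical shape; mask > 0 marks lesion voxels.
    patch_size: patch extent per axis, each no larger than the volume.
    """
    rng = np.random.default_rng() if rng is None else rng
    shape = np.array(volume.shape)
    size = np.array(patch_size)
    if np.any(size > shape):
        raise ValueError("patch_size must fit inside the volume")

    fg = np.argwhere(mask > 0)
    if len(fg) > 0 and rng.random() < p_foreground:
        # Center the patch on a random foreground voxel.
        center = fg[rng.integers(len(fg))]
        start = center - size // 2
    else:
        # Fall back to a uniformly random patch location.
        start = rng.integers(0, shape - size + 1)

    # Shift the patch back inside the volume if it crosses a border.
    start = np.clip(start, 0, shape - size)
    window = tuple(slice(int(s), int(s + d)) for s, d in zip(start, size))
    return volume[window], mask[window]
```

Raising `p_foreground` increases the share of patches that contain lesion voxels, which counteracts the dominance of normal anatomy, while the remaining random patches keep background examples in the training mix.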

Third, we develop a multimodal deep learning framework for GBM overall survival prediction that jointly models MRI, genomic, and clinical features. Within each modality, multi-head self-attention captures long-range, modality-specific structure and distills robust embeddings. Cross-modal attention then aligns these embeddings in a shared latent space, allowing each modality to query complementary information from the others while down-weighting noisy signals. In a cohort of 296 GBM patients, we train and evaluate the model using stratified k-fold cross-validation, reporting accuracy, precision, recall, specificity, F1-score, and ROC–AUC. The model outputs calibrated risk scores that stratify patients into short, intermediate, and long survival groups, enabling risk-adapted treatment intensification or de-escalation. Attention weights across modalities provide interpretable insight into whether imaging, genomic, or clinical factors primarily drive individual risk estimates.
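The cross-modal step can be illustrated with single-head scaled dot-product attention, in which embeddings from one modality act as queries over embeddings from the others. The sketch below is a simplified stand-in under stated assumptions: the thesis model uses multi-head attention whose exact configuration is not given here, and the function names, projection matrices, and token shapes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_tokens, context_tokens, w_q, w_k, w_v):
    """Let one modality query another with scaled dot-product attention.

    query_tokens:   (n_q, d) embeddings of the querying modality, e.g. MRI.
    context_tokens: (n_c, d) embeddings of the other modalities.
    w_q, w_k, w_v:  (d, d_k) projection matrices (learned in a real model).
    Returns the fused embeddings (n_q, d_k) and attention weights (n_q, n_c).
    """
    q = query_tokens @ w_q
    k = context_tokens @ w_k
    v = context_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights
```

If the context tokens are the genomic and clinical embeddings concatenated along the first axis, summing each row of `weights` over the columns belonging to one modality gives a per-modality share of attention, which is one simple way to read off the kind of interpretability described above.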

Collectively, these three frameworks illustrate how uncertainty-aware Bayesian modeling, contrastive pretraining for sparse lesions, and attention-based multimodal fusion can bridge the gap between generic deep learning algorithms and the complex realities of clinical neuro-oncology and dental imaging.