Unveiling the Hidden Landscape of Tumors: Developing Cutting-Edge Machine Learning Approaches to Determine the Compositions of Cells and Exosomes
Author: Bingrui Li, BS (2023)
Primary advisor: Xiaobo Zhou, PhD
Committee members: Raghu Kalluri, MD, PhD; Arif Harmanci, PhD
PhD thesis, McWilliams School of Biomedical Informatics at UTHealth Houston.
ABSTRACT
Cancer emerges from the accumulation of mutations in cells. While research has traditionally focused on cancer cells, there is increasing recognition that the tumor microenvironment also plays a critical role in modulating tumor progression. However, accurately characterizing the compositional changes and functional roles of different cell types and subtypes in the tumor microenvironment remains a significant challenge. Exosomes, extracellular vesicles secreted by all cells, have gained significance as essential agents of communication among cells within the tumor microenvironment. They carry a diverse array of biomolecules and are abundant in various biological fluids, making them promising candidates as biomarkers for cancer diagnosis and monitoring. Nevertheless, a robust and adaptable exosome-based diagnostic approach that can effectively differentiate human cancers across diverse biological fluids and define the origins of the exosomes has yet to be developed.
In this dissertation, we developed cutting-edge machine learning approaches to address the challenges in characterizing the tumor microenvironment and exosomes. Specifically, we developed a novel deconvolution approach that combines an adversarial autoencoder with extreme gradient boosting to robustly estimate the relative compositions of cells and their subtypes across different cancer types and to identify phenotype-associated subclusters. We also developed a novel machine learning-based computational method to differentiate cancer and control samples using a panel of exosome-associated proteins. Furthermore, we extended our deconvolution approach to define the cellular origins of exosomes and to estimate their proportions in plasma and urine samples. In doing so, we defined a novel function to impute the profiles of exosomes derived from cell subtypes, for which data are not yet available in this field. We comprehensively evaluated our approaches against state-of-the-art tools, and they demonstrated superior performance in multidimensional scenarios. Moreover, we performed comprehensive analyses to explore the biological significance of changes in cell and exosome proportions and their associations with clinical phenotypes. Overall, our research provides novel insights into the tumor microenvironment and exosomes and lays the groundwork for future studies in cancer diagnosis and monitoring.