Author: Sahiti Myneni, MSE (2013)
Primary Advisor: Trevor Cohen, MBChB, PhD
Committee Members: Sriram Iyengar, PhD, Jiajie Zhang, PhD, Kayo Fujimoto, PhD
PhD Thesis, The University of Texas School of Biomedical Informatics at Houston.
Ubiquitous online social networks provide a unique opportunity to deliver scalable interventions that support lifestyle modifications, with the aim of changing behaviors that predispose toward cancer and other diseases. At the same time, these networks act as rich data sources that inform our understanding of end-user needs. Traditionally, social network analysis is based on communication frequency among members. In this work, I introduce communication content as a complementary frame for studying these networks.
QuitNet, an online social network developed to provide smoking cessation support, is the network analyzed in this work. Qualitative coding, automated content analysis, and network analysis were used to construct QuitNet sub-networks based on both frequency and content attributes. This merging of qualitative, quantitative, and automated methods expands the depth and breadth of existing network analysis techniques, thereby allowing us to characterize the nature of communication among network members. First, grounded theory-based qualitative analysis provided a granular view of the QuitNet messages. Then, using automated text analysis, the communication links between network members were divided according to the similarity between the content of the exchanged messages and the identified themes. This automated analysis allowed us to extend the otherwise prohibitively labor-intensive qualitative methods to a large data sample using minimal time and resources. The follow-up one-mode and two-mode network analyses allowed us to investigate the content-specific communication patterns of QuitNet members.
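To make the idea of content-based sub-networks concrete, the following is a minimal sketch, not the method used in the thesis. It assumes each message already carries a single theme label, and the member names, the toy messages, and the helper functions `theme_subnetworks` and `degree` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (sender, recipient, theme) for each message.
messages = [
    ("ann", "bob", "Social support"),
    ("bob", "ann", "Social support"),
    ("ann", "cat", "Craves"),
    ("dan", "ann", "Social support"),
    ("cat", "dan", "Craves"),
]

def theme_subnetworks(messages):
    """Group undirected communication links by theme, counting messages per link."""
    nets = defaultdict(lambda: defaultdict(int))
    for sender, recipient, theme in messages:
        link = tuple(sorted((sender, recipient)))  # ignore message direction
        nets[theme][link] += 1
    return {theme: dict(links) for theme, links in nets.items()}

def degree(links):
    """Number of distinct communication partners per member in one sub-network."""
    partners = defaultdict(set)
    for a, b in links:
        partners[a].add(b)
        partners[b].add(a)
    return {member: len(p) for member, p in partners.items()}

subnets = theme_subnetworks(messages)
support_degree = degree(subnets["Social support"])
```

In this sketch the link weight captures communication frequency, while the split by theme captures content, so a member with few partners overall can still be central within one theme.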
Qualitative analysis of the QuitNet messages identified themes ranging from “Social support”, “Progress”, and “Traditions” to “Nicotine Replacement Therapy (NRT) entries” and “Craves”. Automated annotation of messages was achieved using a distributional approach that incorporates information from an outside corpus into a model of the QuitNet corpus to generate vector representations of messages. A k-nearest neighbor approach was used to infer the themes relating to each message. For high-level themes, the automated classification system achieved a recall of 0.77 and a precision of 0.71. The average agreement of the system with two human raters for high-level themes approached the agreement between these human coders on a subset of 100 messages, suggesting that the system is a reasonable substitute for a human rater. Subsequent one-mode network analysis provided insights into the different theme-based networks at the population level, revealing content-specific opinion leaders.
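The nearest-neighbor step and its evaluation can be illustrated with a small sketch. It rests on several assumptions that the abstract does not state: cosine similarity as the distance measure, a single theme per message chosen by majority vote, and made-up three-dimensional vectors standing in for the message representations. The function names are hypothetical.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def knn_theme(query, labeled, k=3):
    """Return the majority theme among the k labeled vectors most similar to query."""
    ranked = sorted(labeled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(theme for _, theme in ranked[:k])
    return votes.most_common(1)[0][0]

def precision_recall(predicted, actual, theme):
    """Precision and recall for one theme over paired predictions and gold labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == theme and a == theme for p, a in pairs)
    fp = sum(p == theme and a != theme for p, a in pairs)
    fn = sum(p != theme and a == theme for p, a in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical manually coded messages: (vector, theme).
labeled = [
    ([1.0, 0.1, 0.0], "Craves"),
    ([0.9, 0.2, 0.1], "Craves"),
    ([0.1, 1.0, 0.0], "Progress"),
    ([0.0, 0.9, 0.2], "Progress"),
    ([0.2, 0.8, 0.1], "Progress"),
]
predicted_theme = knn_theme([0.8, 0.3, 0.0], labeled, k=3)
```

Here the query vector lies closest to the two "Craves" examples, so they outvote the single "Progress" neighbor. A system that assigns several themes per message would need a different decision rule, such as a similarity threshold per theme.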
Two-mode network analysis allowed us to investigate the content affiliation patterns of QuitNet users and to understand the content-specific attributes of social influence on smoking abstinence. These studies provide insights into the nature of communication among members of a smoking cessation-related online social network. The ability to identify critical nodes and content-specific network patterns of communication has implications for the development and maintenance of support networks for health behavior change. Analysis of the frequency and content of health-related social network data can inform the development of tailored behavioral interventions that provide persuasive and targeted support for initiating or adhering to a positive behavior change.
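As an illustration of a two-mode (member-by-theme) affiliation network, the sketch below projects such a network onto members, so that two members are linked whenever they are affiliated with the same theme. The affiliation pairs and the helper names `members_by_theme` and `project_to_members` are hypothetical, and the projection shown is one common choice rather than the analysis performed in the thesis.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical two-mode data: (member, theme the member posted about).
affiliations = [
    ("ann", "Social support"),
    ("ann", "Craves"),
    ("bob", "Social support"),
    ("cat", "Craves"),
    ("cat", "Progress"),
    ("dan", "Progress"),
]

def members_by_theme(affiliations):
    """Map each theme to the set of members affiliated with it."""
    groups = defaultdict(set)
    for member, theme in affiliations:
        groups[theme].add(member)
    return groups

def project_to_members(affiliations):
    """One-mode projection: link weight = number of themes two members share."""
    weights = defaultdict(int)
    for members in members_by_theme(affiliations).values():
        for a, b in combinations(sorted(members), 2):
            weights[(a, b)] += 1
    return dict(weights)

shared = project_to_members(affiliations)
```

Keeping the two-mode form preserves which themes connect members, whereas the projection only records how many themes they share.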