
Department of Health Data Science and Artificial Intelligence

We are in the midst of the Artificial Intelligence (AI) revolution that is transforming every major industry, including healthcare. Our students and researchers with a focus on this domain develop and use advanced computational methods and tools to significantly improve patient care, disease prevention, and biomedical discovery. Topics include Machine Learning, Natural Language Processing (including Large Language Models), Data Integration and Harmonization, Data Mining and Analytics, Computational Phenotyping, Predictive Modeling, Ontology, and Data Security and Cryptology. Students and researchers in Health Data Science and AI work closely with those in Clinical and Health Informatics and Bioinformatics and Systems Medicine.

Students can pursue an education in Health Data Science and Artificial Intelligence under the following academic programs: Graduate Certificate, Master of Science (MS), and Doctor of Philosophy (PhD). Students are admitted to the school, not a specific department, so that they can obtain a broad and comprehensive educational experience while gaining in-depth, specialized training in the three areas offered by the Department of Health Data Science and Artificial Intelligence, the Department of Clinical and Health Informatics, and the Department of Bioinformatics and Systems Medicine.




SELECTED COURSES IN HEALTH DATA SCIENCE & ARTIFICIAL INTELLIGENCE

In addition to a core set of foundational courses for all concentrations, the following are selected courses focusing on Health Data Science and Artificial Intelligence:

  • BMI 5007 - Methods in Health Data Science

    Course Description:
    The course introduces methods in health data science: defining the problem, accessing and loading the data, and formatting it into the data structures required for analysis. This course covers the basics of computational thinking to define a computational solution and methods to access healthcare data from a variety of sources (EHR data, UMLS, Medline, etc.) and in different data formats. Students will apply methods for data wrangling and data quality assessment to structure the data for analysis. Students will be introduced to the basics of the design and evaluation of algorithms and the application of data structures to healthcare data. The course will use the Python programming language and basic Python libraries for health data science such as NumPy, SciPy, Matplotlib, and pandas.

  • BMI 5304 - Advanced Database Concepts in Biomedical Informatics

    Course Description:
    Database processing is a key area of competency in biomedical informatics. This course introduces the concepts and methods of database processing in the context of healthcare and biomedicine.

  • BMI 5351 - Research Design and Evaluation in Biomedical Informatics

    Course Description:
    This course provides the student the opportunity to develop more advanced competencies in the design, analysis, interpretation and critical evaluation of experimental, quasi-experimental, pre-experimental and qualitative biomedical informatics research and evaluation studies. The student will identify flaws or weaknesses in research and evaluation designs, choose which of several designs most appropriately tests a stated hypothesis or controls variables potentially jeopardizing validity, and analyze and interpret research and evaluation results. Through exposure to the basic “building block” designs, students will have the opportunity to develop the competence to appropriately choose and use the most important and frequently used design procedures for single or multifactor research or evaluation studies.

  • BMI 5353 - Biomedical Data Analysis

    Course Description:
    This course provides an overview of the data analysis process, with particular attention paid to the data quality issues encountered with biomedical data. The course will cover the entire data analysis pipeline from needs analysis to presentation of final results. The course is primarily project-based. The projects will cover a wide variety of biomedical data, including bioinformatics, clinical, public health, and literature datasets. Students will implement their analysis in Python and present their work in a variety of presentation formats.

  • BMI 6306 - Information and Knowledge Representation in Biomedical Informatics

    Course Description:
    The purpose of this course is to examine the role of information representation, controlled vocabularies and knowledge engineering constructs such as ontologies in the conceptualization, design and implementation of modern health information systems. The course will introduce approaches for representing information and knowledge in a distributed network of health information systems. Moving beyond a general understanding of taxonomies, students will gain an understanding of the conceptual foundations of ontologies, including the limitations of modern systems. Knowledge modeling and engineering principles will be introduced through lectures, hands-on practice and the class project. This will include the design, construction and use of ontologies in health care applications. Through hands-on experience, students will gain insight into the strengths and limitations of the existing resources, approaches and systems, and will identify directions where future work needs to be done.

  • BMI 6318 - Big Data in Biomedical Informatics

    Course Description:
    This course will expose students to the technologies used to solve 'Big Data' problems in biomedicine and healthcare. Through hands-on exercises, we will learn how to distill actionable information from small and large data sets by leveraging multiple machines. We will cover the health data science toolboxes for processing data sets with distributed algorithms, how to apply machine learning models in this context and, finally, how to evaluate and report on the analysis. Students will be required to complete hands-on exercises. A working knowledge of Python and SQL is required.

  • BMI 6323 - Machine Learning in Biomedical Informatics

    Course Description:
    The increased digitization of biomedical data has dramatically increased interest in methods to analyze large quantities of data. Data mining is the process of transforming this raw data into actionable knowledge, which has led to many spectacular advances in biomedicine. This course provides an introduction to data mining methods from a biomedical perspective. The primary focus will be on practical and commonly used machine learning techniques for data mining (e.g., decision trees, support vector machines, clustering) and how these techniques transform data into knowledge. Students will engage in hands-on projects that expose them to data mining methods. Further, students will be able to critically evaluate the appropriateness of data mining methods on different tasks.

  • BMI 6331 - Medical Imaging and Signal Pattern Recognition

    Course Description:
    Biomedical data in the form of images, videos or other unstructured signals are continuously collected by clinicians, such as radiologists, dermatologists or ophthalmologists, by life science researchers and, increasingly, by ourselves with our personal devices. Tools able to distill quantitative, actionable information from these data are essential to generate phenotypes, aid diagnosis, screening and treatment, and automate repetitive tasks. In the era of personalized medicine and big data, they have become an urgent medical need. In this course, you will be introduced to the essential pattern recognition techniques to build and evaluate such tools. We will cover the basics of image/signal processing, computer vision and applied machine learning with hands-on examples relevant to biomedical applications.

  • BMI 6334 - Deep Learning in Biomedical Informatics

    Course Description:
    Deep learning and artificial intelligence have disrupted multiple industries including healthcare. This class offers students exposure to basic concepts of and practical skills for deep learning and its applications in selected problems in biomedical informatics. Students will study the foundations of deep learning, understand how to build neural networks, and conduct successful machine learning analyses. Deep learning architectures such as convolutional neural networks, recurrent neural networks, and autoencoders will be explored, along with concepts such as embeddings, dropout, and batch normalization. Case studies from biomedical informatics, including biomedical and clinical natural language processing, medical imaging, electronic health records, and genomics data, will be utilized. Students will use the Python language and state-of-the-art deep learning frameworks to implement deep learning models to solve real-world problems. Experience with Python programming and basic knowledge of linear algebra are required.

  • BMI 6340 - Health Information Visualization and Visual Analytics

    Course Description:
    This course introduces the basics of information visualization, which is the use of interactive visual representations of data to amplify human cognition. Properly constructed visualizations allow us to analyze data by exploring it from different perspectives and using the power of our visual system to quickly reveal patterns and relationships. This course uses practical, hands-on examples and exercises to teach the theory and application of information visualization for health data. The class emphasizes visual analysis of time-series data, ranking and part-to-whole relations, deviations, distributions, correlations, and multivariate and geographic data. You will also learn how to combine multiple visualizations into interactive dashboards and how to use Tableau, a state-of-the-art information visualization tool, to produce and deliver visualizations and dashboards quickly and easily.





FACULTY


Xiaoqian Jiang, PhD

Professor

Chair, Department of Health Data Science and Artificial Intelligence

Omer Anjum, PhD

Assistant Professor

Research Areas: Natural Language Processing, Machine Learning, AI System Design

Licong Cui, PhD

Associate Professor

Research Areas: Ontologies and Terminologies, Big Data analytics, Neuroinformatics

Luca Giancardo, PhD

Associate Professor

Research Areas: Image Signal Processing, Machine Learning, Translational Medicine

Arif Harmanci, PhD, MS

Assistant Professor

Research Areas: Information Extraction, Functional Genomics, Genomic Privacy

Ming Huang, PhD

Associate Professor

Research Areas: Machine Learning and Deep Learning, Natural Language Processing and Large Language Models, Data Mining ...

Yejin Kim, PhD

Assistant Professor

Research Areas: Data Mining, Machine Learning, Computational Phenotyping

Hongfang Liu, PhD

Professor

Research Areas: Artificial Intelligence and Informatics in Healthcare, Computational Biology and Bioinformatics ...

Laila Rasmy Bekhet, PhD

Assistant Professor

Research Areas: Deep Learning, Predictive Modeling, Biomedical Data Mining

Kirk Roberts, PhD, MS

Associate Professor

Research Areas: Natural Language Processing, Question Answering, Clinical Information Extraction

Xiaoyang Ruan, PhD

Assistant Professor

Research Areas: Predictive modeling, Deep phenotyping, Explainable AI, and Precision Medicine

Toufeeq Ahmed Syed, PhD

Associate Professor

Research Areas: Precision Education and Innovations, Education Informatics and Technology, Online platforms and Cloud ...

Lishan Yu, PhD

Assistant Professor

Research Areas: Data mining and analysis, Machine learning, Biomedical informatics

GQ Zhang, PhD

Professor

Research Areas: Clinical Informatics, Health Data Science and Big Data, Ontologies and Metadata Management

Kai Zhang, PhD

Assistant Professor

Research Areas: Predictive Modeling, Machine Learning for Healthcare, Fairness in Machine Learning

Career Outlook

We crunched the numbers and they don't lie.

Career Outcomes for Health Data Science and Artificial Intelligence
Location      Average Salary   Average Salary Range
Houston       $123,948         $63,000 - $140,000
Texas         $114,453         $64,000 - $123,000
Nationwide    $94,884          $92,866 - $154,000
Positions
  • Healthcare Data Analyst
  • Data Analytics Consultant
  • Health Data Engineer
  • Health Data Scientist
  • Quantitative Analyst
  • Operations Analyst
  • Algorithm Developer
  • Machine Learning Scientist