
Data Science and Artificial Intelligence

We are part of this profound transformation in medicine as our students and researchers use these clinical applications to significantly improve patient care, disease prevention, and biomedical discovery. Topics include Machine Learning, Natural Language Processing (NLP), Data Integration and Harmonization, Data Mining and Analytics, Computational Phenotyping, Predictive Modeling, Ontology, Data Security and Cryptology, and Biostatistics.

Students can pursue an education in Data Science and Artificial Intelligence under the following academic programs: Graduate Certificate, Master of Science (MS), and Doctor of Philosophy (PhD).




SELECTED COURSES IN DATA SCIENCE & ARTIFICIAL INTELLIGENCE

In addition to a core set of foundational courses for all concentrations, the following selected courses focus on Data Science and Artificial Intelligence:

Course Description:
This course introduces methods in health data science: defining the problem, accessing and loading the data, and formatting it into the data structures required for analysis. It covers the basics of computational thinking for defining a computational solution, along with methods to access healthcare data from a variety of sources (EHR data, UMLS, Medline, etc.) and in different data formats. Students will apply methods for data wrangling and data quality assessment to structure the data for analysis. Students will also be introduced to the basics of algorithm design and evaluation and to the application of data structures to healthcare data. The course uses the Python programming language and basic Python libraries for data science such as NumPy, SciPy, Matplotlib, and pandas.
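As a rough illustration of the data quality assessment step mentioned above, the sketch below computes per-column missing-value rates and flags implausible values. It is not course material: the toy records, column names, and the 0 to 120 age range are all hypothetical, and it uses only the Python standard library so that it runs without extra packages (the course itself names pandas for this kind of work).

```python
import csv
import io

# Hypothetical toy extract; real data would come from an EHR export or similar source.
RAW = """patient_id,age,sex,systolic_bp
p001,54,F,132
p002,,M,118
p003,230,F,
p004,41,,125
"""

def load_records(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def missing_rate(records, column):
    """Fraction of rows where the given column is empty."""
    missing = sum(1 for r in records if r[column].strip() == "")
    return missing / len(records)

def out_of_range(records, column, low, high):
    """IDs of rows whose numeric value falls outside [low, high]."""
    flagged = []
    for r in records:
        value = r[column].strip()
        if value and not (low <= float(value) <= high):
            flagged.append(r["patient_id"])
    return flagged

records = load_records(RAW)
report = {col: missing_rate(records, col) for col in ("age", "sex", "systolic_bp")}
implausible_ages = out_of_range(records, "age", 0, 120)  # assumed plausible range
```

Here `report` maps each column to its share of missing values, and `implausible_ages` lists the records that need review before any analysis.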

Course Description:
Database processing is a key area of competency in biomedical informatics. This course introduces the concepts and methods of database processing in the context of healthcare and biomedicine.

Course Description:
This course provides an overview of the data analysis process, with particular attention paid to the data quality issues encountered with biomedical data. The course will cover the entire data analysis pipeline from needs analysis to presentation of final results. The course is primarily project-based. The projects will cover a wide variety of biomedical data, including bioinformatics, clinical, public health, and literature datasets. Students will implement their analysis in Python and present their work in a variety of presentation formats.

Course Description:
The purpose of this course is to examine the role of information representation, controlled vocabularies, and knowledge engineering constructs such as ontologies in the conceptualization, design, and implementation of modern health information systems. The course will introduce approaches for representing information and knowledge in a distributed network of health information systems. Moving beyond a general understanding of taxonomies, students will gain an understanding of the conceptual foundations of ontologies, including the limitations of modern systems. Knowledge modeling and engineering principles will be introduced through lectures, hands-on practice, and the class project. This will include the design, construction, and use of ontologies in health care applications. Through hands-on experience, students will gain insight into the strengths and limitations of existing resources, approaches, and systems, and will identify directions where future work is needed.

Course Description:
This course will expose students to the technologies used to solve 'Big Data' problems in biomedicine and healthcare. Through hands-on exercises, we will learn how to distill actionable information from small and large data sets by leveraging multiple machines. We will cover the data science toolboxes for processing data sets with distributed algorithms, how to apply machine learning models in this context, and finally how to evaluate and report on the analysis. Students will be required to complete hands-on exercises. A working knowledge of Python and SQL is required.

Course Description:
The increased digitization of biomedical data has dramatically increased interest in methods to analyze large quantities of data. Data mining is the process of transforming this raw data into actionable knowledge, which has led to many spectacular advances in biomedicine. This course provides an introduction to data mining methods from a biomedical perspective. The primary focus will be on practical and commonly used machine learning techniques for data mining (e.g., decision trees, support vector machines, clustering) and how these techniques transform data into knowledge. Students will engage in hands-on projects that expose them to data mining methods. Further, students will be able to critically evaluate the appropriateness of data mining methods on different tasks.
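To make the idea of a decision tree concrete, the sketch below fits a one-level tree (a "decision stump") on a single numeric feature by trying candidate thresholds and keeping the one with the fewest training errors. It is an illustrative simplification, not the course's method: the function names and the toy lab values are hypothetical, and practical work would typically rely on an established machine learning library.

```python
def fit_stump(values, labels):
    """Find the threshold on one numeric feature that minimizes training errors.

    The stump predicts 1 when value > threshold, else 0.
    """
    candidates = sorted(set(values))
    best_threshold, best_errors = None, len(labels) + 1
    # Try the midpoint between each pair of adjacent distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        errors = sum((v > threshold) != bool(y) for v, y in zip(values, labels))
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors

def predict(threshold, value):
    """Apply the fitted stump to a new value."""
    return 1 if value > threshold else 0

# Hypothetical toy data: a lab value and a binary outcome label.
lab_values = [4.8, 5.1, 5.4, 6.6, 7.2, 8.0]
outcomes = [0, 0, 0, 1, 1, 1]
threshold, errors = fit_stump(lab_values, outcomes)
```

On this toy data the two classes are cleanly separable, so the chosen threshold falls between 5.4 and 6.6 with no training errors. A full decision tree repeats this split search recursively across many features.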

Course Description:
This course will examine current natural language processing (NLP) methods and their applications in the biomedical domain. It will provide a systematic introduction to basic knowledge and methods used in NLP research and hands-on experience with existing biomedical NLP systems. Students will gain knowledge and skills in various NLP tasks such as named entity recognition, information extraction, and information retrieval.

Course Description:
Deep learning and artificial intelligence have disrupted multiple industries, including healthcare. This class offers students exposure to the basic concepts of deep learning, practical skills for applying it, and its applications to selected problems in biomedical informatics. Students will study the foundations of deep learning, understand how to build neural networks, and conduct successful machine learning analyses. Deep learning architectures such as convolutional neural networks, recurrent neural networks, and autoencoders will be explored, along with concepts such as embeddings, dropout, and batch normalization. Case studies from biomedical informatics, including biomedical and clinical natural language processing, medical imaging, electronic health records, and genomics data, will be utilized. Students will use the Python language and state-of-the-art deep learning frameworks to implement deep learning models that solve real-world problems. Experience with Python programming and basic knowledge of linear algebra are required.
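Two of the concepts named above, batch normalization and dropout, can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions: it works on a flat list of activations, omits the learnable scale and shift parameters and the running statistics that deep learning frameworks maintain, and uses hypothetical function names rather than any framework's API.

```python
import random

def batch_norm(batch, eps=1e-5):
    """Normalize a list of activations to roughly zero mean and unit variance."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in activations]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
dropped = dropout(normalized, rate=0.5, rng=rng)
```

Rescaling the surviving units by `1 / keep` keeps the expected activation unchanged, which is why dropout can simply be switched off at inference time.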



FACULTY


Omer Anjum, PhD

Assistant Professor

Research Areas: Natural Language Processing, Machine Learning, AI System Design

Licong Cui, PhD

Associate Professor

Research Areas: Ontologies and Terminologies, Big Data Analytics, Neuroinformatics

Na Hong, PhD

Assistant Professor

Research Areas: Healthcare Data Standards, Medical Analysis, Clinical Decision Support Applications

Luca Giancardo, PhD

Associate Professor

Research Areas: Image/Signal Processing, Machine Learning, Translational Medicine

Xiaoqian Jiang, PhD

Professor

Research Areas: Healthcare Privacy, Biomedical Data Mining, Computational Phenotyping

Yejin Kim, PhD

Assistant Professor

Research Areas: Data Mining, Machine Learning, Computational Phenotyping

Fang Li, PhD

Assistant Professor

Research Areas: Biomedical Ontologies and Knowledge Graphs, Big Data Analytics, Machine Learning and Deep Learning

Ardalan Naseri, PhD

Assistant Professor

Research Areas: Algorithms, Computational Biology, Bioinformatics

Kalpana Raja, PhD, MRSB, CSci

Assistant Professor

Research Areas: Natural Language Processing, Biomedical Text Mining, Machine Learning

Laila Rasmy Bekhet, PhD

Assistant Professor

Research Areas: Deep Learning, Predictive Modeling, Biomedical Data Mining

Kirk Roberts, PhD, MS

Associate Professor

Research Areas: Natural Language Processing, Question Answering, Clinical Information Extraction

Cui Tao, PhD

Professor

Research Areas: Ontology Generation, Conceptual Modeling, Ontology Services

Hulin Wu, PhD, MS

Professor

Research Areas: Biostatistics, Computational Biology, Computational Modeling

Stephen Wu, PhD

Associate Professor

Research Areas: Natural Language Processing, Information Retrieval, Temporal Data Processing

Hua Xu, PhD

Professor

Research Areas: Clinical Natural Language Processing, Healthcare Data Analytics, EHR-Based Clinical and Translational Research

GQ Zhang, PhD

Professor

Research Areas: Clinical Informatics, Data Science and Big Data, Ontologies and Metadata Management

Kai Zhang, PhD

Assistant Professor

Research Areas: Predictive Modeling, Machine Learning for Healthcare, Fairness in Machine Learning

Degui Zhi, PhD, MS

Professor

Research Areas: Bioinformatics, Statistical Genetics, Deep Learning


Career Outlook

We crunched the numbers and they don't lie.

Career Outcomes for Data Science and Artificial Intelligence
Location     Average Salary   Average Salary Range
Houston      $123,948         $63,000 - $140,000
Texas        $114,453         $64,000 - $123,000
Nationwide   $94,884          $92,866 - $154,000
Positions
  • Healthcare Data Analyst
  • Data Analytics Consultant
  • Health Data Engineer
  • Health Data Scientist
  • Quantitative Analyst
  • Operations Analyst
  • Algorithm Developer
  • Machine Learning Scientist