BMI 6318 - Big Data in Biomedical Informatics

3 semester credit hours
Lecture contact hours: 2; Lab contact hours: 3
Web-based and classroom instruction
Prerequisites: BMI 5007 or consent of instructor
Lab Fee: $30

This course will expose students to the technologies used to solve 'Big Data' problems in biomedicine and healthcare. Through hands-on exercises, we will learn how to distill actionable information from small and large data sets by leveraging multiple machines. We will cover the Health Data Science toolboxes for processing data sets with distributed algorithms, how to apply machine learning models in this context, and finally how to evaluate and report on the analysis. Students will be required to complete hands-on exercises, and a working knowledge of Python and SQL is required.

Upon successfully completing this course, students will:

  • Structure extremely large datasets for input and output.
  • Design a data analysis pipeline using 'big data'.
  • Map from business needs to a proposed analytical design using very large datasets.
  • Evaluate the results and utility of data analysis and make an effective argument.

These objectives will be pursued through hands-on examples using Python-based data analysis libraries such as Pandas and PySpark. We will use modern container technologies (Docker) and databases built to store “Big Data.”