Course Director: Dr. Eugene Demidenko
- Course description: Today, health data specialists face the analysis of high-volume, complex, multidimensional data on a daily basis. This course offers the cutting-edge data science techniques needed to meet the demands of academia and industry and to stay at the top of the field. I follow the saying: “Examples are the expressway to knowledge.” Much emphasis is placed on graphical presentation, including statistical animation, an indispensable tool for analyzing multidimensional data and presenting it in a form accessible to a lay viewer. This is a high-effort, high-gain, project-driven course that involves three components: theory, real-life data analysis, and R programming for data analysis and visualization.
We will cover multivariate statistical techniques such as principal component analysis, canonical correlation, discriminant analysis, hierarchical, hard, and soft cluster analysis using Gaussian mixture distributions, and multidimensional density estimation. The quality of classification will be assessed via the misclassification error and its connection to the ROC curve. Besides classic multivariate statistical techniques, students will learn advanced methods such as the basics of image statistics, pharmacokinetics, and tumor growth analysis. We will discuss identification of objects in images through bivariate kernel density estimation, statistical detection of synergy, analysis of dose-response relationships, and statistical estimation of the cancer treatment effect. An important feature of the course is uncertainty assessment for building parsimonious and reliable statistical models using machine-learning techniques such as cross-validation. Homework will be assigned each week, and a team project, presented at the end of the course, serves as the culminating experience.
- Coursework: QBS 120, QBS 121, QBS 177
- Programming: Coursework in calculus, algebra, and programming. Intermediate programming experience in R.