STAT 32950 focuses on applications and techniques for the analysis of multivariate and high-dimensional data. Early topics cover common multivariate techniques and dimension reduction, including principal component analysis, factor models, canonical correlation, multidimensional scaling, discriminant analysis, clustering, and correspondence analysis (as time permits). Further topics on statistical learning for high-dimensional data and complex structures include penalized regression models (LASSO, ridge, elastic net), sparse PCA, independent component analysis, Gaussian mixture models, Expectation-Maximization methods, and random forests (as time permits). Theoretical derivations will be presented with emphasis on motivations, applications, and hands-on data analysis.
Prerequisite(s): STAT 24410 and STAT 34300 or equivalent recommended; or instructor consent.