Syllabus:

Multivariate Data Analysis, 7.5 Credits

Swedish name: Multivariat dataanalys

This syllabus is valid: 2022-07-25 and until further notice

Course code: 5MS081

Credit points: 7.5

Education level: Second cycle

Main Field of Study and progress level: Mathematical Statistics: Second cycle, has only first-cycle course(s) as entry requirements

Grading scale: TH (technical grading scale)

Established by: Faculty Board of Science and Technology, 2022-03-02

Contents

The course provides the basic theory and methods of multivariate data analysis and lays a solid foundation for learning more advanced methods and algorithms. It starts from the multivariate Gaussian distribution (MGD) and its generalization, the Gaussian mixture model (GMM). Maximum likelihood estimation (MLE) and the EM algorithm are discussed. Based on the MGD, the course covers statistical inference approaches (Hotelling's T-squared test and multivariate analysis of variance, MANOVA), classification methods (linear discriminant analysis and logistic regression), and cluster analysis methods. Furthermore, building on the idea of projection, different eigendecomposition-based methods for dimensionality reduction are introduced, such as principal component analysis (PCA), factor analysis (FA), canonical correlation analysis (CCA), and partial least squares (PLS). Models for regression analysis with collinear explanatory variables, such as principal component regression (PCR) and PLS regression, are also included.
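As an illustration of the eigendecomposition idea behind PCA mentioned above, the following is a minimal sketch, not course material. The function name pca_eig and the simulated data are assumptions made for the example; only NumPy is used.

```python
import numpy as np

def pca_eig(X, n_components):
    """Project the rows of X onto the leading eigenvectors of the sample covariance."""
    Xc = X - X.mean(axis=0)                  # center each variable
    S = np.cov(Xc, rowvar=False)             # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)     # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]           # p x k matrix of loadings
    scores = Xc @ components                 # n x k principal component scores
    return scores, components, eigvals[order]

# Hypothetical data: 500 draws from a 3-dimensional normal distribution.
rng = np.random.default_rng(0)
cov = [[4, 2, 0], [2, 3, 0], [0, 0, 1]]
X = rng.multivariate_normal([0, 0, 0], cov, size=500)
scores, components, variances = pca_eig(X, n_components=2)
```

The sample variance of each score column equals the corresponding eigenvalue, which is why the eigenvalues are read as the variance explained by each component.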

Module 1 (5 credits): Theory and applications
The module covers multivariate distributions, with special emphasis on the multivariate normal distribution and its properties. The EM algorithm for finding maximum likelihood estimates in the Gaussian mixture model (GMM) is introduced. Further, methods for inference concerning mean vectors and variance and correlation matrices are treated, along with methods for projection, classification, and cluster analysis.
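To make the EM algorithm for a Gaussian mixture concrete, here is a minimal sketch under stated assumptions: the function names, the random initialization from data points, the fixed number of iterations, and the simulated two-cluster data are all choices made for this example, not part of the syllabus.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Density of N(mean, cov) evaluated at each row of X."""
    p = X.shape[1]
    diff = X - mean
    solved = np.linalg.solve(cov, diff.T).T           # cov^{-1} (x - mean) for each row
    maha = np.sum(diff * solved, axis=1)              # squared Mahalanobis distances
    norm = np.sqrt((2 * np.pi) ** p * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def em_gmm(X, k, n_iter=50, seed=0):
    """Run a fixed number of EM iterations for a k-component Gaussian mixture."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    means = X[rng.choice(n, size=k, replace=False)]   # assumption: start at random data points
    covs = np.array([np.cov(X, rowvar=False) for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    loglik_history = []
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        dens = np.column_stack(
            [weights[j] * gaussian_pdf(X, means[j], covs[j]) for j in range(k)]
        )
        total = dens.sum(axis=1, keepdims=True)
        loglik_history.append(float(np.log(total).sum()))
        resp = dens / total
        # M-step: weighted maximum likelihood updates.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, [j]] * diff).T @ diff / nk[j]
    return weights, means, covs, loglik_history

# Hypothetical data: two bivariate normal clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 1, (200, 2))])
weights, means, covs, history = em_gmm(X, k=2)
```

The recorded log-likelihood values should never decrease from one iteration to the next, which is a convenient check that an EM implementation is behaving correctly.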
 
Module 2 (2.5 credits): Computer labs
Multivariate data analysis with suitable statistical software. The module includes written and oral presentation of results.

Expected learning outcomes

For a passing grade, the student must be able to

Knowledge and understanding

  • derive the most important properties of the multivariate normal distribution
  • account for the connections between Principal Component Analysis (PCA), Canonical Correlation Analysis (CCA), and Partial Least Squares (PLS)
  • account for the connections between Gaussian discriminant analysis and Fisher's projection method

Skills

  • build different models from a probability theory point of view
  • apply the EM algorithm to find the maximum likelihood estimators in a mixture model
  • describe the basic ideas of the likelihood ratio test (LRT), and apply it to develop different inference methods, such as Multivariate Analysis of Variance (MANOVA)
  • determine the distribution of linear combinations of multivariate normal random vectors
  • analyze multivariate data sets with the methods included in the course
  • evaluate the results from multivariate analyses and present them in a scientific manner, orally and in writing
  • summarize the most important results from a scientific report on some area in multivariate analysis

Judgment and approach

  • evaluate the applicability of different models from a scientific perspective, and judge what multivariate analysis methods are suitable for use in different situations

Required Knowledge

The course requires 90 ECTS, including either at least 12 ECTS in Mathematical Statistics or at least 75 ECTS in Statistics, and in both cases a course in Basic Calculus (7.5 ECTS) and a course in Linear Algebra (7.5 ECTS), or equivalent. Proficiency in English and Swedish equivalent to the level required for basic eligibility for higher studies is also required.

Form of instruction

The teaching is mainly in the form of lectures, lessons and supervision of computer labs.

Examination modes

The examination consists of written reports, oral presentations and a written exam. The oral presentations are assessed as Fail (U) or Pass (G). The written reports are assessed as Fail (U) or Pass (G), and are also given points. Module 1 is awarded one of the following grades: Fail (U), Pass (3), Pass with merit (4), Pass with distinction (5). The grade is based on the total score, where the written reports count for 1/3 and the written exam for 2/3 of the total score. Module 2 is awarded one of the following grades: Fail (U) or Pass (G). To get the grade G, all oral and written presentations must be assessed as G. For the course as a whole, one of the following grades is awarded: Fail (U), Pass (3), Pass with merit (4), Pass with distinction (5). The grade for the whole course is determined by the grade given for Module 1. To pass the whole course, all modules must have been passed. The grade is only set once all compulsory modules have been assessed. Scores on written reports can be used on later examination occasions.
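The 1/3 and 2/3 weighting for Module 1 can be sketched as follows. This is an illustration only: the syllabus does not state how points are scaled or which totals correspond to the grades 3, 4 and 5, so the sketch assumes each part is first expressed as a fraction of its maximum, and the function name and example numbers are hypothetical.

```python
def module1_total(report_score, exam_score, report_max, exam_max):
    """Weighted total in [0, 1]: written reports count 1/3, written exam 2/3.

    Assumption: each part is normalized by its own maximum before weighting.
    """
    if report_max <= 0 or exam_max <= 0:
        raise ValueError("maximum scores must be positive")
    report_fraction = report_score / report_max
    exam_fraction = exam_score / exam_max
    return report_fraction / 3 + 2 * exam_fraction / 3

# Hypothetical scores: 8 of 10 report points and 30 of 40 exam points.
total = module1_total(report_score=8, exam_score=30, report_max=10, exam_max=40)
```

No grade thresholds are applied here, since the syllabus does not specify them.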

Deviations from the syllabus examination form can be made for a student who has a decision on pedagogical support due to disability. Individual adaptation of the examination form shall be considered based on the student's needs. The examination form is adapted within the framework of the expected learning outcomes of the course syllabus. At the request of the student, the course coordinator, in consultation with the examiner, must promptly decide on the adapted examination form. The decision shall then be communicated to the student.

A student who has been awarded a passing grade for the course cannot be re-assessed for a higher grade. Students who do not pass a test or examination on the original date are given another date to retake the examination. A student who has sat two examinations for a course or a part of a course, without passing either examination, has the right to have another examiner appointed, provided there are no specific reasons for not doing so (Chapter 6, Section 22, HEO). The request for a new examiner is made to the Head of the Department of Mathematics and Mathematical Statistics. Examinations based on this course syllabus are guaranteed to be offered for two years after the date of the student's first registration for the course.

Credit transfer
All students have the right to have their previous education or equivalent, and their working life experience, evaluated for possible consideration in the corresponding education at Umeå University. Application forms should be addressed to Student services/Degree evaluation office. More information regarding credit transfer can be found on the student web pages of Umeå University, http://www.student.umu.se, and in the Higher Education Ordinance (chapter 6). If denied, the application can be appealed (as per the Higher Education Ordinance, chapter 12) to Överklagandenämnden för högskolan. This includes partially denied applications.

Other regulations

In a degree, this course may not be included together with another course with similar content. If unsure, students should ask the Director of Studies in Mathematics and Mathematical Statistics. The course can also be included in the subject area of computational science and engineering.



In the event that the syllabus ceases to apply or undergoes major changes, students are guaranteed at least three examinations (including the regular examination opportunity) according to the regulations in the syllabus that the student was originally registered on for a period of a maximum of two years from the time that the previous syllabus ceased to apply or that the course ended.

Literature

Valid from: 2022 week 30

Applied multivariate statistical analysis
Johnson Richard Arnold, Wichern Dean W.
Sixth edition, Pearson New International Edition. ii, 770 pages.
ISBN: 1292024941
Mandatory