The course gives an introduction to data science, with emphasis on predictive modelling, an essential part of the field. Predictive modelling aims to generate predictions based on historical data. In addition to parametric predictive models, such as the linear regression and logistic regression models already known from the course Statistik A, some non-parametric predictive models, such as K-nearest neighbors models, are introduced during the course.
Regardless of which kind of predictive model is used, it is essential to evaluate the accuracy of its predictions. Methods for evaluating predictions are therefore also introduced during the course.
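As a rough illustration of what evaluating a predictive model can look like, the following minimal Python sketch fits a K-nearest neighbors classifier on part of a data set and measures its accuracy on the remaining held-out part. The synthetic data, the helper names (`knn_predict`, `accuracy`, `make_data`), the choice of k = 5 and the 150/50 split are all assumptions made for this example and are not taken from the course material.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Squared Euclidean distance from x to every training point, paired with its label
    distances = [
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def make_data(n, seed):
    """Hypothetical data: two classes drawn around different centres in the plane."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 3.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


# "Historical" data: the first 150 observations are used for fitting,
# the last 50 are held out to evaluate the predictions.
X, y = make_data(200, seed=0)
train_X, train_y = X[:150], y[:150]
test_X, test_y = X[150:], y[150:]

predictions = [knn_predict(train_X, train_y, x, k=5) for x in test_X]
test_accuracy = accuracy(test_y, predictions)
print(f"Held-out accuracy: {test_accuracy:.2f}")
```

Measuring accuracy on observations that were not used for fitting gives a more honest picture of how the model may perform on new data than measuring it on the training observations themselves.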
As predictive modelling is used more and more regularly in all parts of society and as a basis for decisions, it is also necessary to be aware that algorithms, like human decisions, can be subject to bias and errors. Thus, there are crucial ethical considerations that must be reflected on when doing data science and predictive modelling. These considerations are examined critically during the course.