Data preprocessing and visualisation 7.5 credits

About the course

The objective of Data Science is to enable society, companies and citizens to understand and use the ever-increasing amount of collected data in ways that make it possible to detect potential problems with, or improvements to, the current state of affairs. Data Science should also empower humans to estimate and understand the potential results of different actions. There's a saying about "lies, damned lies, and statistics", which expresses the fact that data-based statistics can be presented in very convincing ways even when the conclusions are false. This course aims to teach how to detect such false information and how to ensure a more ethical use of Data Science.

One example of the practical use of Data Science is analyzing and presenting epidemic-related data and statistics in correct and human-understandable ways so that decisions and actions can be based on rational information. Data Science methods are also used for estimating the effects of actions for reducing global warming, dimensioning road networks, choosing where to build new shopping centers or restaurants, optimizing the energy usage of buildings, and so on. In short, Data Science is one of the most crucial domains for deciding how our current and future society is to be built. More and more companies are also coming to realize the importance of Data Science. Regardless of industry or size, organizations that wish to remain competitive in the age of big data need to efficiently develop and implement Data Science capabilities or risk being left behind.

Module 1, theory, 4.0 credits.
This course on data preprocessing and visualization provides an introduction to the domain of Data Science. Students will learn how to import, manipulate and preprocess data coming from various real-world data sources with the aim of presenting it in ways that give insight into the underlying systems or phenomena. Preprocessing of data may improve insight into the meaning of the data through statistical measurements, presented as numerical tables that summarize the data in various ways. However, in most cases, humans tend to understand visual presentations of data better than purely numerical presentations. The course will teach how to use basic data visualizations such as point and line plots, bar charts, histograms, boxplots and violin plots. 3D visualization techniques will be taught, as well as how to use maps and images for data visualization.
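
As an illustration of the numerical summaries mentioned above, the following minimal Python sketch computes a five-number summary (minimum, quartiles, maximum), which is the information a boxplot displays graphically. It is not taken from the course material: the function name `five_number_summary` and the sample values are invented for illustration, and the "inclusive" quartile method is only one of several common conventions.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    # The "inclusive" method treats the data as the whole population.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical measurements, invented for illustration only.
temperatures = [12.0, 15.0, 11.0, 18.0, 14.0, 13.0, 30.0]
summary = five_number_summary(temperatures)

# Print the summary as a small numerical table.
for label, value in zip(["min", "Q1", "median", "Q3", "max"], summary):
    print(f"{label:>6}: {value:.1f}")
```

In the table, the single large value (30.0) shows up only as a maximum far above the third quartile. In a boxplot the same value would appear as a point well outside the box, which is one reason visual presentations are often easier to read.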

Various data analysis and machine learning methods will be used, but the underlying theory is beyond the scope of this course. The intention is to make students proficient in applying those methods in the real-world settings encountered in industry and in society in general. This is why lectures are accompanied by exercises where students practice applying some of the methods covered during lectures.

The course mainly uses the R programming language, so students will learn the basics of R. Also included is an introduction to how data preprocessing and visualization methods can be used in the Python programming language.

Topics covered are:

  • Introduction to the R programming language and tools
  • Introduction to data processing and visualization in the Python programming language
  • Import and export of data from text files, databases and other sources
  • Data visualization in R, in 2D and 3D
  • Map visualizations
  • Displaying and working with images in R
  • Introduction to other useful data preprocessing and visualization packages
  • Linear regression, BLUE, RMSE, shrinkage methods (Lasso, ridge regression)
  • Linear classification (logistic regression, LDA)
  • Principal Components Analysis (PCA) for identifying linear correlations between variables
  • K-means clustering
  • Nonlinear or nonparametric methods (e.g., k-NN)
  • Preparation of data for machine learning
  • Basic notions of Explainable Artificial Intelligence (XAI) 
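
To give a flavour of two of the topics listed above, linear regression and RMSE, here is a minimal Python sketch that fits a straight line by ordinary least squares and measures the fit with the root mean squared error. It is an illustration only, not course material: the helper names `fit_line` and `rmse` and the data points are hypothetical, and in practice one would normally use an existing statistics or machine learning package rather than hand-written formulas.

```python
import math


def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observations, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
predictions = [intercept + slope * x for x in xs]
error = rmse(ys, predictions)

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}, RMSE = {error:.3f}")
```

The RMSE is expressed in the same unit as the predicted variable, which makes it a convenient single number for comparing how well different models fit the same data.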

Module 2, proficiency training, 3.5 credits.
Module 2 consists of a practical project that requires the combined use of methods learned in Module 1. Project topics and data sets will be provided by the course personnel, but student-proposed topics are encouraged. The project is performed in groups of 1-4 students. Each group presents its progress, plans and open questions to course personnel and fellow students in two "mentoring sessions" and in one final presentation session. The purpose of the mentoring sessions is to provide constructive feedback and guidance to the students in their learning project. Mentoring sessions do NOT directly influence the grading of this Module.

Apply

  • Spring 2026

    • Data preprocessing and visualisation

      Spring 2026 / Umeå / English / On site

      Application opens 15 September 2025


      Starts

      19 January 2026

      Ends

      23 March 2026

      Number of credits

      7.5 credits

      Type of studies

      On site

      Study pace

      50%

      Teaching hours

      Daytime

      Study location

      Umeå

      Language

      English

      Application code

      UMU-57303


      Eligibility

      At least 7.5 ECTS mathematical statistics.

      Selection

      Academic credits

      Application

      The online application opens 15 September 2025 at 09:00 CET. Application deadline is 15 October 2025.


      Application and tuition fees

      As a citizen of a country outside the European Union (EU), the European Economic Area (EEA) or Switzerland, you are required to pay application and tuition fees for studies at Umeå University.

      Application fee: SEK 900

      Tuition fee, first instalment: SEK 19,038

      Total fee: SEK 19,038


How to apply

Apply online via universityadmissions.se  
You apply to our programmes and courses via universityadmissions.se, the official website for higher education applications in Sweden. There, you can track your application, check that your documents have been registered, and log in to find out your admission results.
  
Late applications 
Admissions to most programmes and courses typically close after the final application deadline. However, some programmes and courses may still accept late applications if seats are available. These are marked “Open for late application” on universityadmissions.se. Please note that late applications are not guaranteed to be reviewed. 
 

Explore your future at Umeå University

Join a vibrant academic community where high-quality education meets groundbreaking research in science, technology, humanities, and the arts. At Umeå University, you will learn from passionate, expert teachers and benefit from a close connection between research, education, collaboration, and innovation.

Questions about the course?

Please be aware that the University is a public authority and that what you write here can be included in an official document. Therefore, be careful if you are writing about sensitive or personal matters in this contact form. If you have such an enquiry, please call us instead. All data will be treated in accordance with the General Data Protection Regulation.


Course is given by
Computing Science

Good to know

How to apply

A step-by-step guide to apply for studies at Umeå University.

International Student Guide

Essential information for your journey to Umeå and your studies here.

Study guidance

A study counsellor can help you with many of your study-related questions.