Swedish name: Bearbetning och visualisering av data
This syllabus is valid: 2024-01-01 and until further notice
Syllabus for courses starting after 2024-01-01
Course code: 5DV217
Credit points: 7.5
Education level: First cycle
Main Field of Study and progress level:
Computing Science: First cycle, has less than 60 credits in first-cycle course/s as entry requirements
Mathematical Statistics: First cycle, has less than 60 credits in first-cycle course/s as entry requirements
Grading scale: Three-grade scale
Responsible department: Department of Computing Science
Established by: Faculty Board of Science and Technology, 2021-01-13
Revised by: Faculty Board of Science and Technology, 2023-06-19
The objective of Data Science is to enable society, companies and citizens to understand and use the ever-increasing amount of collected data in ways that make it possible to detect potential problems or improvements to the current state of affairs. Data Science should also empower humans to estimate and understand the potential result of different actions. The saying about "lies, damned lies, and statistics" expresses the fact that data-based statistics can be presented in very convincing ways even when the conclusions are false. This course teaches how to detect such false information and how to ensure a more ethical use of Data Science.
One example of practical use of Data Science is analyzing and presenting epidemic-related data and statistics in correct and human-understandable ways so that decisions and actions can be based on rational information. Data Science methods are also used for estimating the effects of actions to reduce global warming, dimensioning road networks, choosing where to locate new shopping centers or restaurants, optimizing the energy usage of buildings, and so on. In short, Data Science is one of the most crucial domains for deciding how our current and future society is to be built. More and more companies are also coming to realize the importance of Data Science. Regardless of industry or size, organizations that wish to remain competitive in the age of big data need to efficiently develop and implement Data Science capabilities or risk being left behind.
Module 1, theory, 4.0 credits.
This course on data preprocessing and visualization provides an introduction to the domain of Data Science. The students will learn how to import, manipulate and preprocess data coming from various real-world data sources with the aim of presenting it in ways that give insight into the underlying systems or phenomena. Preprocessing of data may improve insight into the meaning of data through statistical measurements, presented as numerical tables that summarize the data in various ways. However, in most cases, humans tend to understand visual presentations of data better than purely numerical presentations. The course will teach how to use basic data visualizations such as point and line plots, bar charts, histograms, boxplots and violin plots. 3D visualization techniques will be taught, as well as how to use maps and images for data visualization.
Various data analysis and machine learning methods will be used, but the underlying theory is beyond the scope of this course. The intention is to make the students proficient in applying those methods in real-world settings encountered in industry and society in general. This is why lectures are accompanied by exercises where students practice applying some of the methods treated during lectures.
The course mainly uses the R programming language, so students will learn the basics of R. Also included is an introduction to how data preprocessing and visualization methods can be used in the Python programming language.
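As a purely illustrative sketch of the kind of numerical summary described above (not taken from the course material), the following Python snippet removes missing values from a small data set and computes basic summary statistics. The function name `summarize`, the variable names and the temperature values are all made up for this example, and only the Python standard library is used.

```python
import statistics


def summarize(values):
    """Summarize a list of numbers, treating None as a missing value.

    Hypothetical helper for illustration only; not part of the course material.
    """
    # Preprocessing step: drop missing observations before computing statistics.
    clean = [v for v in values if v is not None]
    # Quartiles are the values shown by the box of a boxplot.
    q1, _, q3 = statistics.quantiles(clean, n=4)
    return {
        "count": len(clean),
        "missing": len(values) - len(clean),
        "mean": statistics.mean(clean),
        "median": statistics.median(clean),
        "q1": q1,
        "q3": q3,
    }


# Made-up example data: daily temperatures with two missing readings.
temperatures = [12.5, 14.0, None, 13.5, 15.0, 11.0, None, 16.0]
summary = summarize(temperatures)

for name, value in summary.items():
    print(f"{name}: {value}")
```

A table like this is the numerical counterpart of a boxplot: the median and the two quartiles are the values that a boxplot draws as its box.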
Topics covered are:
Module 2, proficiency training, 3.5 credits.
Module 2 consists of a practical project that requires the combined use of methods learned in Module 1. Project topics and data sets will be provided by the course personnel, but student-proposed topics are encouraged. The project is performed in groups of 1-4 students. Each group presents its progress, plans and open questions to course personnel and fellow students in two "mentoring sessions" and in one final presentation session. The purpose of the mentoring sessions is to provide constructive feedback and guidance to the students in their learning project. Mentoring sessions do NOT directly influence the grading of this module.
Knowledge and understanding
After completing the course, the student should be able to:
Competence and skills
After completing the course, the student should be able to:
Judgement and approach
After completing the course, the student should be able to:
At least 7.5 ECTS credits in mathematical statistics.
The course consists of lectures, practical exercises performed individually, and a project performed in groups of up to four students. In addition to scheduled activities, the course also requires individual work with the material.
Module 1 (expected learning outcomes, ELO, 1-7) is assessed through a written Learning Diary, which includes written lab reports. The grades given in this module are Fail (U), Pass (G) or Pass with distinction (VG).
The assessment of Module 2 (ELO 3-8) is done through a written project report. The grades given in this module are Fail (U), Pass (G) or Pass with distinction (VG).
For the course as a whole, one of the grades Fail (U), Pass (G) or Pass with distinction (VG) is given. The course grade is Pass with distinction (VG) only if both modules have the grade Pass with distinction (VG).
Adapted examination
The examiner can decide to deviate from the specified forms of examination. Individual adaptation of the examination shall be considered based on the needs of the student. The examination is adapted within the constraints of the expected learning outcomes. A student who needs an adapted examination must request it from the Department of Computing Science no later than 10 days before the examination. The examiner decides on the adapted examination, and the student is notified.
If the syllabus has expired or the course has been discontinued, a student who at some point registered for the course is guaranteed at least three examinations (including the regular examination) according to this syllabus for a maximum period of two years from the syllabus expiring or the course being discontinued.
All needed course literature is freely available on the web. The list will be presented on the course site.