Xuan-Son Vu, a postdoctoral fellow in computing science at Umeå University, is involved in a new research collaboration that will help researchers comply with the EU's General Data Protection Regulation (GDPR). "We will develop methods that automatically mask personal and sensitive information and separate it from the data researchers need," says Xuan-Son Vu.
Text: Victoria Skeidsvoll
Xuan-Son Vu, postdoctoral fellow at the Department of Computing Science, is part of a project to anonymise personal data.
To develop new knowledge and stay at the forefront, researchers around the world need to be able to share research data with each other. Today, however, there is a risk that people mentioned in large collections of text can be identified.
"For example, it could be information about your name, or where you live, or your political views," explains Xuan-Son Vu.
He is a postdoctoral fellow in Computing Science at Umeå University and is involved in the new research project "Grandma Karl is 27 years old: Automatic pseudonymisation of research data", which tackles difficult and important challenges in pseudonymisation, an approach recommended by the EU.
"Pseudonymisation is about systems and technologies to automatically mask personal and sensitive information and separate it from the data researchers need. We must develop and provide ways to strengthen pseudonymisation techniques before applying them in practice," says Xuan-Son Vu.
Access to research data
The goal is to create linguistic algorithms that can detect personal data and sensitive information in large bodies of text and automatically replace the identifying words with appropriate pseudonyms. In this way, personal data can be protected while the texts can still be used in different kinds of research.
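The general idea of detecting identifiers and swapping them for pseudonyms can be illustrated with a very small sketch. This is not the project's method: the regular expressions, the `known_names` parameter and the `pseudonymise` function are hypothetical stand-ins, and a real system would likely rely on trained linguistic models rather than hand-written patterns. The sketch only shows the principle of masking identifiers consistently while keeping the key that links pseudonyms to originals separate from the text.

```python
import re

# Hypothetical detection patterns, for illustration only.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{6,}\d"),
}


def pseudonymise(text, known_names=()):
    """Return (masked_text, key); key maps each pseudonym to its original."""
    key = {}       # pseudonym -> original, to be stored apart from the text
    reverse = {}   # original -> pseudonym, so repeats get the same label
    counters = {}  # running number per category

    def label(kind, original):
        if original not in reverse:
            counters[kind] = counters.get(kind, 0) + 1
            pseudonym = f"[{kind}_{counters[kind]}]"
            reverse[original] = pseudonym
            key[pseudonym] = original
        return reverse[original]

    # Pattern-based identifiers first, then names from an assumed name list.
    for kind, pattern in PATTERNS.items():
        text = pattern.sub(lambda m, k=kind: label(k, m.group(0)), text)
    for name in known_names:
        text = re.sub(rf"\b{re.escape(name)}\b",
                      lambda m: label("NAME", m.group(0)), text)
    return text, key


masked, key = pseudonymise(
    "Karl is 27. Contact Karl at karl@example.com or +46 70 123 45 67.",
    known_names=["Karl"],
)
print(masked)
```

In this sketch the age stays in the text because it is treated as data a researcher might need, while the name, e-mail address and phone number are replaced. Both mentions of the name receive the same label, so the text remains coherent for analysis.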
The research environment initiative is intended to support Sweden's work on open access to research data.
"Grandma Karl is 27 years old" is coordinated by the University of Gothenburg and brings together expertise in philosophy, linguistics and comparative linguistics from the University of Gothenburg, Umeå University and the University of Helsinki. The project runs from 2023 to 2028 and is funded with SEK 18 million by the Swedish Research Council.
"We will be working on very ambitious and challenging questions in a brand new research environment, and I am looking forward to collaborating with the other project members," says Xuan-Son Vu.
Strengthened research collaboration
The Department of Computing Science at Umeå University has in recent years succeeded in recruiting world-leading researchers in areas such as data integrity, data security, responsible AI, and human interaction with robots and systems.
"We have a strong collaboration with national and international actors in academia, society and industry. We look forward to Xuan-Son Vu's work and to further strengthening our collaboration with the University of Gothenburg and the University of Helsinki," says Lena Kallin Westin, Head of the Department of Computing Science.
Research participants in "Grandma Karl is 27 years old: Automatic pseudonymisation of research data":
Project leader Elena Volodina, researcher and associate professor at the Department of Swedish, Multilingualism and Language Technology, University of Gothenburg (lead applicant)
Simon Dobnik, Professor at the Linguistics and Theory of Science unit, University of Gothenburg (PI)
Xuan-Son Vu, Postdoctoral Fellow at the Department of Computing Science, Umeå University
Therese Lindström Tiedemann, University Lecturer, Department of Finnish, Finno-Ugrian and Scandinavian Studies, University of Helsinki