Knowing users' privacy concerns, decision makers can introduce suitable privacy policies to improve user experience and to comply with privacy-related laws such as GDPR. Xuan-Son Vu has developed frameworks and algorithms to support these two vital requirements. He will defend his dissertation on Tuesday 20 October at Umeå University.
Text: Ingrid Söderbergh
Xuan-Son Vu, PhD at the Department of Computing Science at Umeå University.
Privacy is an important concept of our time because of the digital footprints surrounding us. Even people who choose not to use social media, which is the source of active digital footprints, are still being “tracked” through passive digital footprints (for example via data from their friends and relatives, or from Internet of Things devices). This situation requires us to learn and understand more about privacy, because privacy is about control. Without privacy, we cannot choose how to live our lives. However, privacy is not only about privacy guarantees.
In his dissertation, Xuan-Son Vu contributes new knowledge about privacy, showing the need for not only privacy guarantees but also privacy analysis in machine learning with big data. Privacy guarantees concern protecting the privacy of data subjects from privacy leakage, while privacy analysis helps to understand users' concerns about privacy.
“Starting my PhD study, I struggled with two main questions regarding research in personal data: firstly, if I have access to previously collected data, how can I protect privacy of data subjects when having no user consents? And secondly, how can I better protect user privacy without sacrificing data utility?”
For research in computer science, especially in machine learning, natural language processing, and computer vision, it is very important to have user-generated data from social media (for example texts and images shared on Twitter) in order to test new methods on realistic, large-scale data.
Researchers often crawl data themselves to pursue their research topics. Once the data is collected, it can be shared among scholars to facilitate related research. However, privacy concerns arise from the data subjects’ perspective. When a large amount of data is shared and distributed widely, there is a high risk that adversaries can use it to reveal the true identity of data subjects in other sensitive data, such as medical data.
Despite extensive research in privacy methodologies, there is a lack of efficient algorithms and frameworks to enable research on sensitive data. For example, when user consent was not obtained in the first place, there has to be a suitable method to detect privacy concerns as naturally as if users had given their consent. Similarly, many privacy-guaranteed algorithms have been introduced, but they are too complicated to use without investing a considerable amount of time in learning them.
“I focused mainly on developing methods that allow researchers and machine learning practitioners to work on sensitive data without worrying about knowing details of privacy-guaranteed algorithms. Different privacy analysis techniques were also introduced to understand user needs for privacy guarantee, to better support users and protect their data”, says Xuan-Son Vu.
Xuan-Son Vu comes from Vietnam and has a Master's degree in Computer Science, majoring in Natural Language Processing and Machine Learning, from Kyungpook National University, Korea. His doctoral thesis work was carried out at the Department of Computing Science within the Deep Data Mining research group. His supervisors are Assoc. Prof. Lili Jiang and Prof. Erik Elmroth.