Research interests: trustworthy ML/DL and graph learning in NLP and multimodal data. Co-author of several neural models, e.g., ppRNN (a faster version of RNN), MGTN (modular), and SGTN (privacy-preserving).
I am a senior researcher at the WASP Media & Language arena and the founder of DeepTensor AB, a spin-off from the CS department that works on securing ML/AI solutions. I received my Ph.D. from Umeå University with a focus on privacy-guaranteed machine learning with big data. Before that, I received an M.Sc. in Computer Science from Kyungpook National University in Korea, with a focus on NLP and machine learning. My work has primarily focused on knowledge: both acquiring knowledge from text and multimodal data, and using structured knowledge to power downstream applications. I am a reviewer for journals and conferences including TheWebConf, ECAI, ICDM, PAKDD, SSR, SC2, COSE (Computers & Security), and TPAMI.
My research revolves around ML/DL solutions for ubiquitous data processing and analysis. Ubiquitous data refers to the various types of data available via IoT applications and user-generated content from daily activities (e.g., multimodal reviews, user activities on mobile devices). These data are not only scarce and ubiquitous but also spread across different devices and locations, and they belong to different users. For user-generated content, I have been working on privacy-guaranteed models that protect user privacy while learning from their data. For data analytics, I have proposed KaPPA and INFRA, two analytical frameworks that protect personal data during data analytics, with new privacy-aware learning methods built into the frameworks. Last but not least, my co-authors and I have proposed several neural models across different projects, including (but not limited to) ppRNN [6], dpUGC [5], SGTN [4], MG-PRIFAIR [3], MGTN [1], and Cformer [7]. These models play important roles in processing ubiquitous data, which in most cases are not only scarce but also complex.
Collaboration is an important part of my work. I am a board member and publication chair of the Vietnam Language and Speech Processing (VLSP) association, which organizes an annual international workshop. VLSP encourages research in related areas by providing high-quality datasets and letting the research community work on them via data challenges.
Locally, I work on a robust machine learning project that studies new algorithms and neural methods to enhance the robustness of ML-based applications on multimodal data tasks (e.g., combinations of textual, visual, and graph data). Most recently, our work involves applications in healthcare, food of interest, and security on the edge cloud.
Publications
[1] Modular Graph Transformer Networks for Multi-Label Image Classification. Hoang D. Nguyen, Xuan-Son Vu, Duc-Trong Le. In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2021.
[2] WINFRA: A Web-Based Platform for Semantic Data Retrieval and Data Analytics. Addi Ait-Mlouk, Xuan-Son Vu, Lili Jiang. In: Mathematics (Special Issue "Applied Data Analytics"), 2020, 8(11), 2090; doi:10.3390/math8112090.
[3] Multimodal Review Generation with Privacy and Fairness Awareness. Xuan-Son Vu, Thanh-Son Nguyen, Duc-Trong Le, Lili Jiang. In: Proceedings of the 28th International Conference on Computational Linguistics (COLING), 2020.
[4] Privacy-Preserving Visual Content Tagging using Graph Transformer Networks. Xuan-Son Vu, Duc-Trong Le, Christoffer Edlund, Lili Jiang, Hoang D. Nguyen. In: Proceedings of the 28th ACM International Conference on Multimedia (ACM MM), 2020.
[5] dpUGC: Learn Differentially Private Representation for User Generated Contents. Xuan-Son Vu, Son N. Tran, Lili Jiang. In: Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing, April 2019.
[6] Improving Recurrent Neural Networks with Predictive Propagation for Sequence Labelling. Son N. Tran, Qing Zhang, Anthony Nguyen, Xuan-Son Vu, Son Ngo. In: Proceedings of the 25th International Conference on Neural Information Processing (ICONIP), 2018.
[7] Cformer: Semi-Supervised Text Clustering Based on Pseudo Labeling. A. Hatefi, Xuan-Son Vu, M. Bhuyan, F. Drewes. In: CIKM, 2021.