Haoran Li 李浩然

Ph.D.

Computer Science
Hong Kong University of Science and Technology

Contact:
hlibt [at] connect [dot] ust [dot] hk
University Center 101, Tower C, HKUST

[Google Scholar] [CV]


About Me

I am currently a fourth-year Ph.D. student in Computer Science at the Hong Kong University of Science and Technology, advised by Prof. Yangqiu Song. I obtained my B.S. degree, majoring in Computer Science and the Math-CS track, from the Hong Kong University of Science and Technology in 2020.

I was a research intern in NLP at the Toutiao AI Lab, ByteDance, during the summer of 2022.

My research interests mainly lie in privacy studies in NLP.

Preprints

Haoran Li*, Dadi Guo*, Donghao Li*, Wei Fan, Qi Hu, Xin Liu, Chunkit Chan, Duanyi Yao, Yangqiu Song. P-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models. 2023. [link]
Haoran Li*, Yulin Chen*, Jinglong Luo*, Yan Kang, Xiaojin Zhang, Qi Hu, Chunkit Chan, Yangqiu Song. Privacy in Large Language Models: Attacks, Defenses and Future Directions. 2023. [link]
Qi Hu, Haoran Li, Jiaxin Bai, Yangqiu Song. Privacy-Preserving Neural Graph Databases. 2023. [link]

Publications

Multi-step Jailbreaking Privacy Attacks on ChatGPT
Haoran Li*, Dadi Guo*, Wei Fan, Mingshi Xu, Jie Huang, Fanpu Meng, Yangqiu Song
Findings of EMNLP 2023
[ Code ] [ Paper ]
In this paper, we study the privacy threats from OpenAI's model APIs and New Bing enhanced by ChatGPT and show that application-integrated LLMs may cause more severe privacy threats than ever before.
Sentence Embedding Leaks More Information than You Expect: Generative Embedding Inversion Attack to Recover the Whole Sentence
Haoran Li, Mingshi Xu, Yangqiu Song
Findings of ACL 2023
[ Code ] [ Paper ]
In this work, we further investigate the information leakage issue and propose a generative embedding inversion attack (GEIA) that aims to reconstruct input sequences based only on their sentence embeddings.
You Don't Know My Favorite Color: Preventing Dialogue Representations from Revealing Speakers' Private Personas
Haoran Li, Yangqiu Song, Lixin Fan
NAACL 2022 (Oral)
[ Code ] [ Paper ]
We investigate privacy leakage from the hidden states of chatbots trained by language modeling, which has not yet been well studied. We show that speakers' personas can be inferred with high accuracy through a simple neural network. To address this, we propose effective defense objectives that prevent persona leakage from hidden states.
FedAssistant: Dialog Agents with Two-side Modeling
Haoran Li*, Ying Su*, Qi Hu, Jiaxin Bai, Yilun Jin, Yangqiu Song
FL-IJCAI'22
[ Code ] [ Paper ]
We propose a framework named FedAssistant to train neural dialog systems in a federated learning setting. Our framework can be trained across multiple data owners with no raw data leakage during training and inference. (Code and paper will appear later.)
Differentially Private Federated Knowledge Graphs Embedding
Hao Peng*, Haoran Li*, Yangqiu Song, Vincent Zheng, Jianxin Li
CIKM 2021 (Oral)
[ Code ] [ Paper ]
We propose a novel decentralized scalable learning framework, Federated Knowledge Graphs Embedding (FKGE), where embeddings from different knowledge graphs can be learnt in an asynchronous and peer-to-peer manner while being privacy-preserving.
Self-supervised Dance Video Synthesis Conditioned on Music
Xuanchi Ren, Haoran Li, Zijian Huang, Qifeng Chen
ACM International Conference on Multimedia (ACM MM), 2020 (Oral)
[ Code ] [ Paper ]

Undergraduate Final Year Project.

Academic Services

Reviewer for ACL Rolling Review (ARR)
Reviewer for KDD 2023

Teaching

Teaching Assistant Coordinator (TAC) of Computer Science and Engineering at HKUST, 2023-2024.
Teaching Assistant of COMP4332/RMBI4310 Big Data Mining at HKUST (Spring 2021, 2022).
Teaching Assistant of COMP2011 Programming with C++ at HKUST (Fall 2022).
Last updated: Oct/18/2023
This web template comes from my buddy Zhenmei Shi, a talented guy in Math and machine learning. Thank little Mei for the beautiful template!