I am a Research Fellow in the College of Computing and Data Science at NTU, working with Prof. XiaoFeng Wang and Prof. Wei Dong. I completed my Ph.D. with honors at Zhejiang University, co-supervised by Prof. Wenyuan Xu, Prof. Xiaoyu Ji, and Prof. Chen Yan. Previously, I obtained my B.Eng. with honors, also from Zhejiang University.
My research focuses on the security, privacy, and safety of AI, especially multimodal LLMs and agentic AI systems. I study how to secure interactions between agentic AI systems and the real world. My goal is to help AI agents become robust and responsible partners. Robust agents should be resilient to external attacks, and responsible agents should behave in a helpful, harmless, and honest manner. My work has appeared in security and AI/ML venues including IEEE S&P, ACM CCS, USENIX Security, NDSS, NeurIPS, ICML, ICLR, KDD, CVPR, ACL, TDSC, and TIFS.
My group is broadly interested in the following research directions:
- Security and Privacy of Agentic AI Systems: building robust and responsible agentic AI systems and protecting their interactions with the physical and digital world.
- Responsible AI in Social Contexts: improving the safety, security, and privacy of multi-agent and human-agent interactions, as well as addressing risks in multimodal AIGC (e.g., deepfake generation and detection).
- Trustworthy AI for X (e.g., Science, Systems): enabling reliable AI deployment in healthcare, power grids, software engineering, IoT, and telecommunications systems.
If you are seeking academic collaboration or are interested in joining my lab, please feel free to email me at lxfmakeit(at)gmail.com or xinfeng.li(at)ntu.edu.sg.
News
- 2026.04: A-MemGuard and CentaurEval have been accepted to ICML 2026. Congrats to all collaborators.
- 2026.03: GIFT has been accepted to IEEE S&P 2026. Congrats to Lixu and all collaborators.
- 2026.01: Refusal-Index and PISTOLE have been accepted to ICLR 2026. Congrats to all collaborators.
- 2025.11: EmoRAG has been accepted to SIGKDD 2026. It was great working with Xinyun to investigate RAG robustness.
- 2025.10: WebCloak and EnchTable have been accepted to IEEE S&P 2026. Congratulations to Jialin and all collaborators.
- 2025.09: AgentAuditor has been accepted to NeurIPS 2025. Congratulations to Hanjun and Shenyu.
- 2025.06: AudioTrust has been accepted to ICLR’26! We hope it serves as a solid foundation for academia and industry in developing safe audio-based LLM systems. [Github] (Media Coverage: [量子位])
- 2025.06: Neural Invisibility Cloak has been accepted to USENIX Security’25. Congratulations to Wenjun.
- 2025.04: Three (trustworthy) LLM agent survey papers that I led or contributed to have been released: (1) TrustAgent: A survey on trustworthy LLM agents: Threats and countermeasures [Paper (accepted to KDD’25)]; (2) Advances and challenges in foundation agents: From brain-inspired intelligence to evolutionary, collaborative, and safe systems [Paper Github] [HuggingFace] (Media Coverage, e.g., [SANER, 机器之心]); (3) A Comprehensive Survey in LLM (-Agent) Full Stack Safety: Data, Training, and Deployment.
- 2024.11: LightAntenna has been accepted to NDSS 2025.
- 2024.08: Raconteur has been accepted to NDSS 2025 [website].
- 2024.08: Legilimens has been accepted to CCS 2024.
- 2024.05: SafeGen has been accepted to CCS 2024! More information is available at [code][pretrained model].
- 2024.05: SafeEar has been accepted to CCS 2024! More information is available at [website][code].
- 2023.08: VRifle has been accepted to NDSS 2024.
- 2023.08: I attended the USENIX Security 2023 Symposium and presented our work NormDetect in person.
- 2023.07: SMA has been accepted to ACM MM 2023.
- 2022.09: Tuner and UltraBD were accepted to IoT-J 2023 and ICPADS 2022.
- 2022.07: NormDetect has been accepted to USENIX Security 2023.
- 2021.07: PROLE Score has been accepted to USENIX Security 2022.
- 2020.12: EarArray has been accepted to NDSS 2021.
📝 Selected Research
(*: Equal Contribution, ^: Corresponding Author)
- Cybersecurity vulnerabilities in IoT devices in Nature: Nature Reviews Electrical Engineering, 2026
Chen Yan, Xiaoyu Ji, Qinhong Jiang, Kai Wang, Xintong Wang, Wenjun Zhu, Shilin Xiao, Xinfeng Li, Wenyuan Xu.
- A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory ICML 2026
Qianshan Wei, Tengchao Yang, Yaochen Wang, Xinfeng Li^, Lijun Li, Zhenfei Yin, Yi Zhan, Thorsten Holz, Zhiqiang Lin, XiaoFeng Wang.
- CentaurEval: Benchmarking Human-in-the-Loop Value in Agentic Coding ICML 2026
Hanjun Luo, Chiming Ni, Jiaheng Wen, Zhimu Huang, Yiran Wang, Bingduo Liao, Sylvia Chung, Yingbin Jin, Xinfeng Li^, Wenyuan Xu, XiaoFeng Wang, Hanan Salam.
- WebCloak: Characterizing and Mitigating the Threats of LLM-Driven Web Agents as Intelligent Scrapers IEEE S&P 2026 [Website]
Xinfeng Li, Tianze Qiu, Yingbin Jin, Lixu Wang, Hanqing Guo, Xiaojun Jia, XiaoFeng Wang, Wei Dong.
- The Person Behind the Sound: Demystifying Audio Private Attribute Profiling via Multimodal Large Language Models IEEE S&P 2026
Lixu Wang, Kaixiang Yao, Xinfeng Li^, Dong Yang, Haoyang Li, XiaoFeng Wang, Wei Dong^.
- EnchTable: Unified Safety Alignment Transfer in Fine-tuned Large Language Models IEEE S&P 2026
Jialin Wu, Kecen Li, Zhicong Huang, Xinfeng Li, XiaoFeng Wang, Cheng Hong.
- EmoRAG: Evaluating RAG Robustness to Symbolic Perturbations KDD 2026
Xinyun Zhou*, Xinfeng Li*, Yinan Peng, Ming Xu, Xuanwang Zhang, Miao Yu, Yidong Wang, Xiaojun Jia, Kun Wang, Qingsong Wen, XiaoFeng Wang, Wei Dong.
- AudioTrust: Benchmarking the Multifaceted Trustworthiness of Audio Large Language Models ICLR 2026
K. Li, C. Shen, Y. Liu, J. Han, K. Zheng, X. Zou, Z. Wang, S. Zhang, X. Du, H. Luo, Y. Jin, X. Xing, Z. Ma, Y. Liu, Y. Zhang, J. Fang, K. Wang, Y. Yan, G. Deng, H. Li, Y. Li, X. Zhuang, T. Chen, Q. Wen, T. Zhang, Y. Liu, H. Hu, Z. Wu, X. Hu, E. Chng, W. Xu, X. Wang, W. Dong, Xinfeng Li^.
- Can LLMs Refuse Questions They Do Not Know? Measuring Knowledge-Aware Refusal in Factual Tasks ICLR 2026
Wenbo Pan, Jie Xu, Qiguang Chen, Junhao Dong, Libo Qin, Xinfeng Li^, Haining Yu, Xiaohua Jia.
- Tug-of-War No More: Harmonizing Accuracy and Robustness in Vision-Language Models via Stability-Aware Task Vector Merging ICLR 2026
Junhao Dong, Xinghua Qu, Cong Zhang, Sua Qi Rong, Nguyen Duc Thai, Wenbo Pan, Xinfeng Li^, Tongliang Liu, Piotr Koniusz, Yew-Soon Ong^.
- Critical Information Only: A Content Privacy-Preserving Framework for Detecting Audio Deepfakes TDSC 2026
Xinfeng Li, Yifan Zheng, Chen Yan, Kai Li, Chang Zeng, Xiaoyu Ji, Wenyuan Xu.
- PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models TIFS 2026
Lingzhi Yuan, Xinfeng Li^, Chejian Xu, Guanhong Tao, Xiaojun Jia, Yihao Huang, Wei Dong, Yang Liu, Bo Li.
- RACONTEUR: A Knowledgeable, Insightful, and Portable LLM-Powered Shell Command Explainer NDSS 2025 [Website]
Jiangyi Deng*, Xinfeng Li*, Yanjiao Chen, Yijie Bai, Haiqin Weng, Yan Liu, Tao Wei, Wenyuan Xu.
- Neural Invisibility Cloak: Concealing Adversary in Images via Compromised AI-driven Image Signal Processing USENIX Security 2025
Wenjun Zhu, Xiaoyu Ji, Xinfeng Li, Qihang Chen, Kun Wang, Xinyu Li, Ruoyan Xu, Wenyuan Xu.
- LightAntenna: Characterizing the Limits of Fluorescent Lamp-Induced Electromagnetic Interference NDSS 2025
Fengchen Yang, Wenze Cui, Xinfeng Li, Chen Yan, Xiaoyu Ji, Wenyuan Xu.
- AgentAuditor: Human-level Safety and Security Evaluation for LLM Agents NeurIPS 2025
Hanjun Luo, Shenyu Dai, Chiming Ni, Xinfeng Li^, Guibin Zhang, Kun Wang, Tongliang Liu, Hanan Salam.
- A Survey on Trustworthy LLM Agents: Threats and Countermeasures KDD 2025
Miao Yu, Fanci Meng, Xinyun Zhou, Shilong Wang, Junyuan Mao, Linsey Pang, Tianlong Chen, Kun Wang, Xinfeng Li^, Yongfeng Zhang, Bo An, Qingsong Wen.
- Patronus: Safeguarding Text-to-Image Models against White-Box Adversaries
Xinfeng Li*, Shengyuan Pang*, Jialin Wu, Jiangyi Deng, Huanlong Zhong, Yanjiao Chen, Jie Zhang, Wenyuan Xu.
- A Vision for Access Control in LLM-based Agent Systems
Xinfeng Li, Dong Huang, Jie Li, Hongyi Cai, Zhenhong Zhou, Wei Dong, XiaoFeng Wang, Yang Liu.
- MedSentry: Understanding and Mitigating Safety Risks in Medical LLM Multi-Agent Systems
Kai Chen, Taihang Zhen, Hewei Wang, Kailai Liu, Xinfeng Li^, Jing Huo, Tianpei Yang, Jinfeng Xu, Wei Dong, Yang Gao.
- The Eye of Sherlock Holmes: Uncovering User Private Attribute Profiling via Vision-Language Model Agentic Framework MM 2025
Feiran Liu, Yuzhe Zhang, Xinyi Huang, Yinan Peng, Xinfeng Li^, Lixu Wang, Yutong Shen, Ranjie Duan, Simeng Qin, Xiaojun Jia, Qingsong Wen, Wei Dong.
- SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models ACM CCS 2024 [Code][Weights]
Xinfeng Li*, Yuchen Yang*, Jiangyi Deng*, Chen Yan, Yanjiao Chen, Xiaoyu Ji, Wenyuan Xu.
- SafeEar: Content Privacy-Preserving Audio Deepfake Detection ACM CCS 2024 [Website][Dataset][Code]
Xinfeng Li*, Kai Li*, Yifan Zheng, Chen Yan, Xiaoyu Ji, Wenyuan Xu.
- Legilimens: Practical and Unified Content Moderation for Large Language Model Services ACM CCS 2024 [Code]
Jialin Wu, Jiangyi Deng, Shengyuan Pang, Yanjiao Chen, Jiayang Xu, Xinfeng Li, Wenyuan Xu.
- Inaudible Adversarial Perturbation: Manipulating the Recognition of User Speech in Real Time NDSS 2024 [Website1][Website2][Code]
Xinfeng Li, Chen Yan, Xuancun Lu, Zihan Zeng, Xiaoyu Ji, Wenyuan Xu.
- Learning Normality is Enough: A Software-Based Mitigation Against Inaudible Voice Attacks USENIX Security 2023 [Website]
Xinfeng Li, Xiaoyu Ji, Chen Yan, Chaohao Li, Yichen Li, Zhenning Zhang, Wenyuan Xu.
- Scoring Metrics of Assessing Voiceprint Distinctiveness based on Speech Content and Rate IEEE TDSC 2024
Ruiwen He, Yushi Cheng, Junning Ze, Xinfeng Li, Xiaoyu Ji, Wenyuan Xu.
- Detecting Inaudible Voice Commands via Acoustic Attenuation by Multi-channel Microphones IEEE TDSC 2024
Xiaoyu Ji, Guoming Zhang, Xinfeng Li, Gang Qu, Xiuzhen Cheng, Wenyuan Xu.
- Enrollment-Stage Backdoor Attacks on Speaker Recognition Systems via Adversarial Ultrasound IEEE IoT-J 2023
Xinfeng Li, Junning Ze, Chen Yan, Yushi Cheng, Xiaoyu Ji, Wenyuan Xu.
- Toward Pitch-Insensitive Speaker Verification via Soundfield IEEE IoT-J 2023
Xinfeng Li, Zhicong Zheng, Chen Yan, Chaohao Li, Xiaoyu Ji, Wenyuan Xu.
- The Silent Manipulator: A Practical and Inaudible Backdoor Attack against Speech Recognition Systems ACM MM 2023
Zhicong Zheng, Xinfeng Li, Chen Yan, Xiaoyu Ji, Wenyuan Xu.
- "OK, Siri" or "Hey, Google": Evaluating Voiceprint Distinctiveness Via Content-based PROLE Score USENIX Security 2022 [Website]
Ruiwen He, Xiaoyu Ji, Xinfeng Li, Yushi Cheng, Wenyuan Xu.
- EarArray: Defending against DolphinAttack via Acoustic Attenuation NDSS 2021
Guoming Zhang, Xiaoyu Ji, Xinfeng Li, Gang Qu, Wenyuan Xu.
📚 Professional Services
I actively contribute to the academic community through program organization and peer review for leading conferences and journals in security, AI, and systems.
Program Organization
- KDD 2025: Tutorial Organizer
Conference
- Area Chair: NeurIPS, ICLR’26
- PC Member: AsiaCCS’27, CCS’26, SaTML’26, AAAI’26
- Reviewer: ICML’26, CVPR’26
- External Reviewer: IEEE S&P’19, ’20; CCS’21, ’22, ’23, ’24; USENIX Security’19, ’20, ’21, ’24; NDSS’20, ’22, ’23, ’24
Journal
- Reviewer: IEEE TIFS, TDSC, TMC, TNNLS, TOSEM, IoT-J, TOIT, TCCN; ACM TOPS; IJCV.
🎖 Honors and Awards
- ACM SIGSAC China Doctoral Dissertation Award (1st), 2025
- CCS 2024 Student Grant, 2024
- NDSS 2024 Student Grant, 2024
- WANG G.S. PhD Research Excellence Award, 2023
- Best Security Partner Award (OPPO Inc.), 2022
- Edison Honors Class@ZJU, Outstanding Graduate Award, 2019
- EE@ZJU Top-10 Scholars Award, 2018
- National Scholarship, 2018