Xuan Gong's Homepage

Selected Publications on Google Scholar (view all)

RLCracker: Evaluating the Worst-Case Vulnerability of LLM Watermarks with Adaptive RL Attacks

Hanbo Huang, Yiran Zhang, Hao Zheng, Xuan Gong, Yihan Li, Lin Liu, Zhuotao Liu, Shiyu Liang^# (^# corresponding author)

ICML 2026

RLCracker studies adaptive reinforcement-learning attacks against LLM watermarks, exposing watermark vulnerabilities under learned black-box attack policies.

[Paper]

VCORE: Variance-Controlled Optimization-based Reweighting for Chain-of-Thought Supervision

Xuan Gong, Senmiao Wang, Hanbo Huang, Ruoyu Sun, Shiyu Liang^# (^# corresponding author)

ACL 2026 Main

VCORE introduces variance-controlled optimization-based reweighting for chain-of-thought supervision, improving how reasoning traces contribute to model training.

[Paper] [Code]

Reflection Anchors for Propagation-Aware Visual Retention in Long-Chain Multimodal Reasoning

Xuan Gong, Hanbo Huang, Hao Zheng, Yiran Zhang, Wenbin Dai, Weishu Zhao, Shiyu Liang^# (^# corresponding author)

CompLearn Workshop @ ICML 2026

This work introduces reflection anchors for propagation-aware visual retention, targeting long-chain multimodal reasoning where visual evidence must remain reliable across extended inference.

[Paper] [OpenReview]

From Parameters to Prompts: Understanding and Mitigating the Factuality Gap between Fine-Tuned LLMs

Xuan Gong*, Hanbo Huang*, Yiran Zhang*, Shiyu Liang^# (* equal contribution, ^# corresponding author)

ICASSP 2026

We revisit how supervised fine-tuning affects factual knowledge in LLMs, revealing a factuality gap between models fine-tuned on known knowledge and those fine-tuned on unknown knowledge. This gap can be mitigated at inference time via in-context learning (ICL) or out-of-distribution prompts. Our theoretical and empirical results show that test-time prompts can overshadow fine-tuning data, suggesting that ICL can compensate for poor fine-tuning and should be considered when evaluating fine-tuning strategies.

[Paper]

RLSpoofer: A Sample-Efficient Black-Box Spoofing Attack for Stress-Testing LLM Watermarks

Hanbo Huang, Xuan Gong, Yiran Zhang, Hao Zheng, Wenbin Dai, Jie Kuang, Shiyu Liang^# (^# corresponding author)

Trustworthy AI for Good (AI4GOOD) Workshop @ ICML 2026

This workshop paper studies sample-efficient black-box spoofing attacks for stress-testing the robustness of LLM watermarks.

[Paper]

DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination

Xuan Gong, Tianshi Ming, Xinpeng Wang, Zhihua Wei^# (^# corresponding author)

EMNLP 2024 Main

We propose DAMRO, a training-free method that reduces object hallucination in large vision-language models (LVLMs) by using the ViT CLS token to filter out misleading high-attention background tokens. DAMRO significantly reduces hallucination in models such as LLaVA and InstructBLIP across multiple benchmarks.

[Paper] [Code]
