About me
- I am currently a postdoctoral researcher at the Gaoling School of Artificial Intelligence, Renmin University of China. I earned my PhD from the University of Montreal, where I was mentored by Prof. Jian-Yun Nie.
- I completed my master’s (2019) and bachelor’s (2016) degrees at Renmin University of China, under the guidance of Prof. Zhicheng Dou and Prof. Ji-Rong Wen, delving into various NLP challenges.
- Research interests: Retrieval-augmented generation, large language models for information retrieval, session-based document ranking
News
- 2025.9: Congrats! WebThinker has been accepted by NeurIPS 2025! See more details.
- 2025.8: Congrats! Our four papers have been accepted by EMNLP 2025!
- 2024.12: Congrats! Our three papers have been accepted by AAAI 2025!
- 2024.11: Congrats! Our paper “A Text-guided Protein Design Framework” has been accepted by Nature Machine Intelligence!
- 2024.10: We wrote a new survey on conversational search. See more details.
- 2024.5: We released a new toolkit, ⚡FlashRAG, which helps implement RAG methods quickly! See more details.
Publications
* for corresponding author, # for equal contribution.
2025
- [NeurIPS 2025] WebThinker: Empowering Large Reasoning Models with Deep Research Capability, Xiaoxi Li, Jiajie Jin, Guanting Dong, Hongjin Qian, Yongkang Wu, Ji-Rong Wen, Yutao Zhu*, and Zhicheng Dou*.
- [EMNLP 2025] Single LLM, Multiple Roles: A Unified Retrieval-Augmented Generation Framework Using Role-Specific Token Optimization, Yutao Zhu, Jiajie Jin, Hongjin Qian, Zheng Liu, Zhicheng Dou, and Ji-Rong Wen.
- [EMNLP 2025] Enhancing LLM Text Detection with Retrieved Contexts and Logits Distribution Consistency, Zhaoheng Huang, Yutao Zhu*, Ji-Rong Wen, and Zhicheng Dou.
- [EMNLP 2025] Search-o1: Agentic Search-Enhanced Large Reasoning Models, Xiaoxi Li, Guanting Dong, Jiajie Jin, Yuyao Zhang, Yujia Zhou, Yutao Zhu, Peitian Zhang, and Zhicheng Dou.
- [EMNLP 2025 Findings] CoRanking: Collaborative Ranking with Small and Large Ranking Agents, Wenhan Liu, Xinyu Ma, Yutao Zhu, Lixin Su, Shuaiqiang Wang, Dawei Yin, and Zhicheng Dou.
- [TOIS] Large Language Models for Information Retrieval: A Survey, Yutao Zhu, Huaying Yuan, Shuting Wang, Jiongnan Liu, Wenhan Liu, Chenlong Deng, Haonan Chen, Zheng Liu, Zhicheng Dou, and Ji-Rong Wen.
- [TOIS] A Survey of Conversational Search, Fengran Mo, Kelong Mao, Ziliang Zhao, Hongjin Qian, Haonan Chen, Yiruo Cheng, Xiaoxi Li, Yutao Zhu, Zhicheng Dou, and Jian-Yun Nie.
- [TOIS] Social Cognitive Theory Enhanced Diversified Recommendation, Zhirui Deng, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [TOIS] A Model-agnostic Pre-training Framework for Search Result Diversification, Zhirui Deng, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [MM 2025] From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models, Yuying Shang#, Xinyi Zeng#, Yutao Zhu#, Xiao Yang, Zhengwei Fang, Jingyuan Zhang, Jiawei Chen, Zinan Liu, and Yu Tian.
- [arXiv] Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search, Jiajie Jin, Xiaoxi Li, Guanting Dong, Yuyao Zhang, Yutao Zhu, Zhao Yang, Hongjin Qian, and Zhicheng Dou.
- [arXiv] Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning, Guanting Dong, Yifei Chen, Xiaoxi Li, Jiajie Jin, Hongjin Qian, Yutao Zhu, Hangyu Mao, Guorui Zhou, Zhicheng Dou, and Ji-Rong Wen.
- [ACL 2025] LLMs + Persona-Plug = Personalized LLMs, Jiongnan Liu, Yutao Zhu*, Shuting Wang, Xiaochi Wei, Erxue Min, Yu Lu, Shuaiqiang Wang, Dawei Yin, and Zhicheng Dou.
- [ACL 2025] Hierarchical Document Refinement for Long-context Retrieval-augmented Generation, Jiajie Jin, Xiaoxi Li, Guanting Dong, Yuyao Zhang, Yutao Zhu*, Yongkang Wu, Zhonghua Li, Ye Qi, and Zhicheng Dou.
- [ACL 2025] Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models, Wenhan Liu, Xinyu Ma, Yutao Zhu, Ziliang Zhao, Shuaiqiang Wang, Dawei Yin, and Zhicheng Dou.
- [ACL 2025] RAG-Critic: Leveraging Automated Critic-Guided Agentic Workflow for Retrieval Augmented Generation, Guanting Dong, Jiajie Jin, Xiaoxi Li, Yutao Zhu, Zhicheng Dou, and Ji-Rong Wen.
- [ACL 2025] Progressive Multimodal Reasoning via Active Retrieval, Guanting Dong, Chenghao Zhang, Mengjie Deng, Yutao Zhu, Zhicheng Dou, and Ji-Rong Wen.
- [ACL 2025] YuLan-Mini: An Open Data-efficient Language Model, Yiwen Hu, Huatong Song, Jia Deng, Jiapeng Wang, Jie Chen, Kun Zhou, Yutao Zhu, Jinhao Jiang, Zican Dong, Wayne Xin Zhao, and Ji-Rong Wen.
- [ACL 2025] Towards Effective and Efficient Continual Pre-training of Large Language Models, Jie Chen, Zhipeng Chen, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Yingqian Min, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, and Ji-Rong Wen.
- [ACL 2025 Findings] mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data, Haonan Chen, Liang Wang, Nan Yang, Yutao Zhu, Ziliang Zhao, Furu Wei, and Zhicheng Dou.
- [TOIS] From Matching to Generation: A Survey on Generative Information Retrieval, Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yuyao Zhang, Peitian Zhang, Yutao Zhu, and Zhicheng Dou.
- [NAACL 2025] Little Giants: Synthesizing High-Quality Embedding Data at Scale, Haonan Chen, Liang Wang, Nan Yang, Yutao Zhu, Ziliang Zhao, Furu Wei, and Zhicheng Dou.
- [WWW 2025] Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation, Guanting Dong, Yutao Zhu, Chenghao Zhang, Zechen Wang, Zhicheng Dou, and Ji-Rong Wen.
- [WWW 2025 Resource] FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research, Jiajie Jin, Yutao Zhu*, Xinyu Yang, Chenghao Zhang, and Zhicheng Dou.
- [AAAI 2025] One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models, Yutao Zhu, Zhaoheng Huang, Zhicheng Dou, and Ji-Rong Wen.
- [AAAI 2025] Toward Verifiable Instruction-Following Alignment for Retrieval Augmented Generation, Guanting Dong, Xiaoshuai Song, Yutao Zhu, Runqi Qiao, Zhicheng Dou, and Ji-Rong Wen.
- [AAAI 2025] Descriptive and Discriminative Document Identifiers for Generative Retrieval, Jiehan Cheng, Zhicheng Dou, Yutao Zhu, and Xiaoxi Li.
- [COLING 2025] RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation, Shuting Wang, Xin Xu, Mang Wang, Weipeng Chen, Yutao Zhu, and Zhicheng Dou.
- [KDD 2025] Embedding Prior Task-specific Knowledge into Language Models for Context-aware Document Ranking, Shuting Wang, Yutao Zhu, and Zhicheng Dou.
2024
- [NMI] A Text-guided Protein Design Framework, Shengchao Liu, Yanjing Li, Zhuoxinran Li, Anthony Gitter, Yutao Zhu, Jiarui Lu, Zhao Xu, Weili Nie, Arvind Ramanathan, Chaowei Xiao, Jian Tang, Hongyu Guo, and Anima Anandkumar.
- [TKDE] CAGS: Context-Aware Document Ranking with Contrastive Graph Sampling, Zhaoheng Huang, Yutao Zhu*, Zhicheng Dou, and Ji-Rong Wen.
- [TKDE] Query-oriented Data Augmentation for Session Search, Haonan Chen, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [ACL 2024] INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning, Yutao Zhu, Peitian Zhang, Chenghao Zhang, Yifei Chen, Binyu Xie, Zheng Liu, Ji-Rong Wen, and Zhicheng Dou.
- [ACL 2024] Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs, Jiejun Tan, Zhicheng Dou, Yutao Zhu, Peidong Guo, Kun Fang, and Ji-Rong Wen.
- [ACL 2024 Findings] BIDER: Bridging Knowledge Inconsistency for Efficient Retrieval-Augmented LLMs via Key Supporting Evidence, Jiajie Jin, Yutao Zhu, Yujia Zhou, and Zhicheng Dou.
- [KAIS] How to Personalize and Whether to Personalize? Candidate Documents Decide, Wenhan Liu, Yujia Zhou, Yutao Zhu, and Zhicheng Dou.
- [SIGIR 2024 Resource] JDivPS: A Diversified Product Search Dataset, Zhirui Deng, Zhicheng Dou, Yutao Zhu, Xubo Qin, Pengchao Cheng, Jiangxu Wu, and Hao Wang.
- [SIGIR 2024 Demo] An Integrated Data Processing Framework for Pretraining Foundation Models, Yiding Sun, Feng Wang, Yutao Zhu, Wayne Xin Zhao, and Jiaxin Mao.
- [TOIS] Passage-aware Search Result Diversification, Zhan Su, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [WWW 2024] Mining Exploratory Queries for Conversational Search, Wenhan Liu, Ziliang Zhao, Yutao Zhu, and Zhicheng Dou.
- [WSDM 2024] CL4DIV: A Contrastive Learning Framework for Search Result Diversification, Zhirui Deng, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [arXiv] From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning, Zhirui Deng, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen, Ruibin Xiong, Mang Wang, and Weipeng Chen.
- [arXiv] YuLan: An Open-source Large Language Model, Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ze-Feng Gao, Yueguo Chen, Weizheng Lu, and Ji-Rong Wen.
- [arXiv] DemoRank: Selecting Effective Demonstrations for Large Language Models in Ranking Task, Wenhan Liu, Yutao Zhu, and Zhicheng Dou.
- [arXiv] DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation, Shuting Wang, Jiongnan Liu, Shiren Song, Jiehan Cheng, Yuqi Fu, Peidong Guo, Kun Fang, Yutao Zhu, and Zhicheng Dou.
- [arXiv] UFO: a Unified and Flexible Framework for Evaluating Factuality of Large Language Models, Zhaoheng Huang, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
2023
- [EMNLP 2023 Findings] Joint Semantic and Strategy Matching for Persuasive Dialogue, Chuhao Jin, Yutao Zhu, Lingzhen Kong, Shijie Li, Xiao Zhang, Ruihua Song, Xu Chen, Huan Chen, Yuchong Sun, Yu Chen, and Jun Xu.
- [KDD 2023] Learning to Relate to Previous Turns in Conversational Search, Fengran Mo, Jian-Yun Nie, Kaiyu Huang, Kelong Mao, Yutao Zhu, Peng Li, and Yang Liu.
- [ACL 2023] ConvGQR: Generative Query Reformulation for Conversational Search, Fengran Mo, Kelong Mao, Yutao Zhu, Yihong Wu, Kaiyu Huang, and Jian-Yun Nie.
- [ACL 2023 Findings] Hence, Socrates is mortal: A Benchmark for Natural Language Syllogistic Reasoning, Yongkang Wu, Meng Han, Yutao Zhu, Lei Li, Xinyu Zhang, Ruofei Lai, Xiaoguang Li, Yuanhang Ren, Zhicheng Dou, and Zhao Cao.
- [TOIS] Contrastive Learning for Legal Judgment Prediction, Han Zhang, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [AAAI 2023] Learning from the Wisdom of Crowds: Exploiting Similar Sessions for Session Search, Yuhang Ye, Zhonghua Li, Zhicheng Dou, Yutao Zhu, Changwang Zhang, Shangquan Wu, and Zhao Cao.
- [WSDM 2023] Heterogeneous Graph-based Context-aware Document Ranking, Shuting Wang, Zhicheng Dou, and Yutao Zhu.
- [arXiv] Don’t Make Your LLM an Evaluation Benchmark Cheater, Kun Zhou, Yutao Zhu, Zhipeng Chen, Wentong Chen, Wayne Xin Zhao, Xu Chen, Yankai Lin, Ji-Rong Wen, and Jiawei Han.
- [arXiv] WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus, Hongjing Qian#, Yutao Zhu#, Zhicheng Dou, Haoqi Gu, Xinyu Zhang, Zheng Liu, Ruofei Lai, Zhao Cao, Jian-Yun Nie, and Ji-Rong Wen.
- [arXiv] An Empirical Study of Uniform-Architecture Knowledge Distillation in Document Ranking, Xubo Qin, Xiyuan Liu, Xiongfeng Zheng, Jie Liu, and Yutao Zhu.
2022
- [EMNLP 2022 Findings] MCP: Self-supervised Pre-training for Personalized Chatbots with Multi-level Contrastive Sampling, Zhaoheng Huang, Zhicheng Dou, Yutao Zhu, and Zhengyi Ma.
- [COLING 2022] Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding, Zhaoye Fei, Yu Tian, Yongkang Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Jiawen Wu, Dejiang Kong, Ruofei Lai, Zhao Cao, Zhicheng Dou, and Xipeng Qiu.
- [CIKM 2022] From Easy to Hard: A Dual Curriculum Learning Framework for Context-Aware Document Ranking, Yutao Zhu, Jian-Yun Nie, Yixuan Su, Haonan Chen, Xinyu Zhang, and Zhicheng Dou.
- [CIKM 2022] Enhancing User Behavior Sequence Modeling by Generative Tasks for Session Search, Haonan Chen, Zhicheng Dou, Yutao Zhu, Zhao Cao, Xiaohua Cheng, and Ji-Rong Wen.
- [TOIS] GDESA: Greedy Diversity Encoder with Self-Attention for Search Results Diversification, Xubo Qin, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [KDD 2022] Knowledge Enhanced Search Result Diversification, Zhan Su, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [NAACL-HLT 2022] Less is More: Learning to Refine Dialogue History for Personalized Dialogue Generation, Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, and Ji-Rong Wen.
- [TOIS] Leveraging Narrative to Generate Movie Script, Yutao Zhu, Ruihua Song, Jian-Yun Nie, Pan Du, Zhicheng Dou, and Jin Zhou.
- [arXiv] PReGAN: Answer Oriented Passage Ranking with Weakly Supervised GAN, Du Pan, Jian-Yun Nie, Yutao Zhu, Hao Jiang, Lixin Zou, and Xiaohui Yan.
2021
- [CIKM 2021] Contrastive Learning of User Behavior Sequence for Context-Aware Document Ranking, Yutao Zhu, Jian-Yun Nie, Zhicheng Dou, Zhengyi Ma, Xinyu Zhang, Pan Du, Xiaochen Zuo, and Hao Jiang.
- [CIKM 2021] PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling, Yujia Zhou, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [CIKM 2021] Learning Implicit User Profile for Personalized Retrieval-Based Chatbot, Hongjin Qian, Zhicheng Dou, Yutao Zhu, Yueyuan Ma, and Ji-Rong Wen.
- [CCIR 2021] (Best Paper Award) Interaction-Based Document Matching for Implicit Search Result Diversification, Xubo Qin, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [TOIS] Graph Neural Collaborative Topic Model for Citation Recommendation, Qianqian Xie, Yutao Zhu, Jimin Huang, Pan Du, and Jian-Yun Nie.
- [CCL 2021] Few-Shot Charge Prediction with Multi-grained Features and Mutual Information, Han Zhang, Zhicheng Dou, Yutao Zhu, and Ji-Rong Wen.
- [SIGIR 2021 Short] Proactive Retrieval-based Chatbots based on Relevant Knowledge and Goals, Yutao Zhu, Jian-Yun Nie, Kun Zhou, Pan Du, Hao Jiang, and Zhicheng Dou.
- [SIGIR 2021] Modeling Intent Graph for Search Result Diversification, Zhan Su, Zhicheng Dou, Yutao Zhu, Xubo Qin, and Ji-Rong Wen.
- [SIGIR 2021] One Chatbot Per Person: Creating Personalized Chatbots based on Implicit User Profiles, Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, and Ji-Rong Wen.
- [SIGIR 2021 Resource] Pchatbot: A Large-Scale Dataset for Personalized Chatbot, Hongjin Qian, Xiaohe Li, Hanxun Zhong, Yu Guo, Yueyuan Ma, Yutao Zhu, Zhanliang Liu, Zhicheng Dou, and Ji-Rong Wen.
- [ECIR 2021] Content Selection Network for Document-grounded Retrieval-based Chatbots, Yutao Zhu, Jian-Yun Nie, Kun Zhou, Pan Du, and Zhicheng Dou.
- [AAAI 2021] Neural Sentence Ordering Based on Constraint Graphs, Yutao Zhu, Kun Zhou, Jian-Yun Nie, Shengchao Liu, and Zhicheng Dou.
- [arXiv] Emotion Eliciting Machine: Emotion Eliciting Conversation Generation based on Dual Generator, Hao Jiang, Yutao Zhu, Xinyu Zhang, Zhicheng Dou, Pan Du, Te Pi, and Yantao Jia.
- [arXiv] BERT4SO: Neural Sentence Ordering by Fine-tuning BERT, Yutao Zhu, Jian-Yun Nie, Kun Zhou, Shengchao Liu, Yabo Ling, and Pan Du.
2020
- [CIKM 2020] S^3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization, Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, and Ji-Rong Wen.
- [ACL 2020] ScriptWriter: Narrative-Guided Script Generation, Yutao Zhu, Ruihua Song, Zhicheng Dou, Jian-Yun Nie, and Jin Zhou.
- [PAKDD 2020] Improving Multi-Turn Response Selection Models with Complementary Last-Utterance Selection by Instance Weighting, Kun Zhou, Wayne Xin Zhao, Yutao Zhu, Ji-Rong Wen, and Jingsong Yu.
2019
- [IRJ] ReBoost: A Retrieval-Boosted Sequence-to-Sequence Model for Neural Response Generation, Yutao Zhu, Zhicheng Dou, Jian-Yun Nie, and Ji-Rong Wen.
- [IRJ] Deep Cross-platform Product Matching in E-commerce, Juan Li, Zhicheng Dou, Yutao Zhu, Xiaochen Zuo, and Ji-Rong Wen.
- [NTCIR 2019] A Hybrid Framework of Emotion-Aware Seq2Seq Model for Emotional Conversation Generation, Xiaohe Li, Jiaqing Liu, Weihao Zheng, Xiangbo Wang, Yutao Zhu, and Zhicheng Dou.
2018
- [SIGIR 2018 Short] An Attribute-aware Neural Attentive Model for Next Basket Recommendation, Ting Bai, Jian-Yun Nie, Wayne Xin Zhao, Yutao Zhu, Pan Du, and Ji-Rong Wen.
Experiences
- 2021.12 - 2022.12, Research Intern, Poisson Lab, Huawei. Supervised by Xinyu Zhang
- 2018.8 - 2019.6, Research Intern, XiaoIce, Microsoft Asia. Supervised by Ruihua Song
- 2016.9 - 2019.6, Research Assistant, Beijing Key Lab of Big Data Management and Analysis Methods. Supervised by Zhicheng Dou and Ji-Rong Wen
- 2016.6 - 2016.9, Software Engineer, Infosys Technology Limited. Supervised by Anjaneyulu Pasala
Academic Services
- AC/SPC: ACL Rolling Review
- PC Member: ACL, SIGIR, NeurIPS, ICLR, ICML, WWW, SIGKDD, AAAI, MM, EMNLP, CIKM, WSDM, COLING, COLM
- Journal Reviewer: PNAS, TOIS, JASIST, KAIS, TALLIP, Computing Surveys