Publications
A collection of my research work.
CAGS: Context-Aware Document Ranking With Contrastive Graph Sampling
Zhaoheng Huang, Yutao Zhu†, Zhicheng Dou, Ji-Rong Wen
IEEE Transactions on Knowledge and Data Engineering, TKDE 2025
Embedding Prior Task-specific Knowledge into Language Models for Context-aware Document Ranking
Shuting Wang, Yutao Zhu, Zhicheng Dou
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025
RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation
Shuting Wang, Xin Yu, Mang Wang, Weipeng Chen, Yutao Zhu, Zhicheng Dou
Proceedings of the 31st International Conference on Computational Linguistics, COLING 2025
Descriptive and Discriminative Document Identifiers for Generative Retrieval
Jiehan Cheng, Zhicheng Dou, Yutao Zhu, Xiaoxi Li
Thirty-Ninth AAAI Conference on Artificial Intelligence, AAAI 2025
Little Giants: Synthesizing High-Quality Embedding Data at Scale
Haonan Chen, Liang Wang, Nan Yang, Yutao Zhu, Ziliang Zhao, Furu Wei, Zhicheng Dou
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2025
mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data
Haonan Chen, Liang Wang, Nan Yang, Yutao Zhu, Ziliang Zhao, Furu Wei, Zhicheng Dou
Findings of the Association for Computational Linguistics: ACL 2025
Towards Effective and Efficient Continual Pre-training of Large Language Models
Jie Chen, Zhipeng Chen, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Yingqian Min, Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ji-Rong Wen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
YuLan-Mini: Pushing the Limits of Open Data-efficient Language Model
Yiwen Hu, Huatong Song, Jie Chen, Jia Deng, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Zican Dong, Yang Lu, Xu Miao, Xin Zhao, Ji-Rong Wen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Progressive Multimodal Reasoning via Active Retrieval
Guanting Dong, Chenghao Zhang, Mengjie Deng, Yutao Zhu, Zhicheng Dou, Ji-Rong Wen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
RAG-Critic: Leveraging Automated Critic-Guided Agentic Workflow for Retrieval Augmented Generation
Guanting Dong, Jiajie Jin, Xiaoxi Li, Yutao Zhu, Zhicheng Dou, Ji-Rong Wen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models
Wenhan Liu, Xinyu Ma, Yutao Zhu, Ziliang Zhao, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Hierarchical Document Refinement for Long-context Retrieval-augmented Generation
Jiajie Jin, Xiaoxi Li, Guanting Dong, Yuyao Zhang, Yutao Zhu†, Yongkang Wu, Zhonghua Li, Ye Qi, Zhicheng Dou
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
LLMs + Persona-Plug = Personalized LLMs
Jiongnan Liu, Yutao Zhu†, Shuting Wang, Xiaochi Wei, Erxue Min, Yu Lu, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
A Model-agnostic Pre-training Framework for Search Result Diversification
Zhirui Deng, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen
ACM Transactions on Information Systems, TOIS 2025
Social Cognitive Theory Enhanced Diversified Recommendation
Zhirui Deng, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen
ACM Transactions on Information Systems, TOIS 2025
CoRanking: Collaborative Ranking with Small and Large Ranking Agents
Wenhan Liu, Xinyu Ma, Yutao Zhu, Lixin Su, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou
Findings of the Association for Computational Linguistics: EMNLP 2025
Enhancing LLM Text Detection with Retrieved Contexts and Logits Distribution Consistency
Zhaoheng Huang, Yutao Zhu†, Ji-Rong Wen, Zhicheng Dou
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, EMNLP 2025
Single LLM, Multiple Roles: A Unified Retrieval-Augmented Generation Framework Using Role-Specific Token Optimization
Yutao Zhu, Jiajie Jin, Hongjin Qian, Zheng Liu, Zhicheng Dou, Ji-Rong Wen
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, EMNLP 2025
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Xiaoxi Li, Jiajie Jin, Guanting Dong, Hongjin Qian, Yongkang Wu, Ji-Rong Wen, Yutao Zhu†, Zhicheng Dou
Advances in Neural Information Processing Systems 39: Annual Conference on Neural Information Processing Systems, NeurIPS 2025
YuLan: An Open-source Large Language Model
Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, Lei Zhang, Junyi Li, Xiaolei Wang, Lei Wang, Beichen Zhang, Zican Dong, Xiaoxue Cheng, Yuhan Chen, Xinyu Tang, Yupeng Hou, Qiangqiang Ren, Xincheng Pang, Shufang Xie, Wayne Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ze-Feng Gao, Yueguo Chen, Weizheng Lu, Ji-Rong Wen
CoRR 2024
Mining Exploratory Queries for Conversational Search
Wenhan Liu, Ziliang Zhao, Yutao Zhu, Zhicheng Dou
Proceedings of the ACM on Web Conference, WWW 2024
JDivPS: A Diversified Product Search Dataset
Zhirui Deng, Zhicheng Dou, Yutao Zhu, Xubo Qin, Pengchao Cheng, Jiangxu Wu, Hao Wang
Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024
How to personalize and whether to personalize? Candidate documents decide
Wenhan Liu, Yujia Zhou, Yutao Zhu, Zhicheng Dou
Knowledge and Information Systems, KAIS 2024
BIDER: Bridging Knowledge Inconsistency for Efficient Retrieval-Augmented LLMs via Key Supporting Evidence
Jiajie Jin, Yutao Zhu, Yujia Zhou, Zhicheng Dou
Findings of the Association for Computational Linguistics: ACL 2024
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs
Jiejun Tan, Zhicheng Dou, Yutao Zhu, Peidong Guo, Kun Fang, Ji-Rong Wen
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning
Yutao Zhu, Peitian Zhang, Chenghao Zhang, Yifei Chen, Binyu Xie, Zheng Liu, Ji-Rong Wen, Zhicheng Dou
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
Query-Oriented Data Augmentation for Session Search
Haonan Chen, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen
IEEE Transactions on Knowledge and Data Engineering, TKDE 2024
Learning from the Wisdom of Crowds: Exploiting Similar Sessions for Session Search
Yuhang Ye, Zhonghua Li, Zhicheng Dou, Yutao Zhu, Changwang Zhang, Shangquan Wu, Zhao Cao
Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023
Contrastive Learning for Legal Judgment Prediction
Han Zhang, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen
ACM Transactions on Information Systems, TOIS 2023
Hence, Socrates is mortal: A Benchmark for Natural Language Syllogistic Reasoning
Yongkang Wu, Meng Han, Yutao Zhu, Lei Li, Xinyu Zhang, Ruofei Lai, Xiaoguang Li, Yuanhang Ren, Zhicheng Dou, Zhao Cao
Findings of the Association for Computational Linguistics: ACL 2023
Joint Semantic and Strategy Matching for Persuasive Dialogue
Chuhao Jin, Yutao Zhu, Lingzhen Kong, Shijie Li, Xiao Zhang, Ruihua Song, Xu Chen, Huan Chen, Yuchong Sun, Yu Chen, Jun Xu
Findings of the Association for Computational Linguistics: EMNLP 2023
Graph Neural Collaborative Topic Model for Citation Recommendation
Qianqian Xie, Yutao Zhu, Jimin Huang, Pan Du, Jian-Yun Nie
ACM Transactions on Information Systems, TOIS 2022
Less is More: Learning to Refine Dialogue History for Personalized Dialogue Generation
Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-Rong Wen
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022
Knowledge Enhanced Search Result Diversification
Zhan Su, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen
The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2022
Coarse-to-Fine: Hierarchical Multi-task Learning for Natural Language Understanding
Zhaoye Fei, Yu Tian, Yongkang Wu, Xinyu Zhang, Yutao Zhu, Zheng Liu, Jiawen Wu, Dejiang Kong, Ruofei Lai, Zhao Cao, Zhicheng Dou, Xipeng Qiu
Proceedings of the 29th International Conference on Computational Linguistics, COLING 2022
MCP: Self-supervised Pre-training for Personalized Chatbots with Multi-level Contrastive Sampling
Zhaoheng Huang, Zhicheng Dou, Yutao Zhu, Zhengyi Ma
Findings of the Association for Computational Linguistics: EMNLP 2022
BERT4SO: Neural Sentence Ordering by Fine-tuning BERT
Yutao Zhu, Jian-Yun Nie, Kun Zhou, Shengchao Liu, Pan Du
CoRR 2021
Emotion Eliciting Machine: Emotion Eliciting Conversation Generation based on Dual Generator
Hao Jiang, Yutao Zhu, Xinyu Zhang, Zhicheng Dou, Pan Du, Te Pi, Yantao Jia
CoRR 2021
Few-Shot Charge Prediction with Multi-grained Features and Mutual Information
Han Zhang, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen
Chinese Computational Linguistics - 20th China National Conference, CCL 2021
Interaction-Based Document Matching for Implicit Search Result Diversification
Xubo Qin, Zhicheng Dou, Yutao Zhu, Ji-Rong Wen
Information Retrieval - 27th China Conference, CCIR 2021
Deep cross-platform product matching in e-commerce
Juan Li, Zhicheng Dou, Yutao Zhu, Xiaochen Zuo, Ji-Rong Wen
Information Retrieval Journal 2020
S3-Rec: Self-Supervised Learning for Sequential Recommendation with Mutual Information Maximization
Kun Zhou, Hui Wang, Wayne Xin Zhao, Yutao Zhu, Sirui Wang, Fuzheng Zhang, Zhongyuan Wang, Ji-Rong Wen
The 29th ACM International Conference on Information and Knowledge Management, CIKM 2020
A Hybrid Framework of Emotion-Aware Seq2Seq Model for Emotional Conversation Generation
Xiaohe Li, Jiaqing Liu, Weihao Zheng, Xiangbo Wang, Yutao Zhu, Zhicheng Dou
NII Testbeds and Community for Information Access Research - 14th International Conference, NTCIR 2019
An Attribute-aware Neural Attentive Model for Next Basket Recommendation
Ting Bai, Jian-Yun Nie, Wayne Xin Zhao, Yutao Zhu, Pan Du, Ji-Rong Wen
The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018