Hello! I am a fourth-year PhD student in the Language Analysis Group at HIT-SCIR, under the supervision of Prof. Wanxiang Che and Assoc. Prof. Qingfu Zhu. Currently, I am a KStar Research Intern at Kuaishou Technology.
My primary research interest is Code Intelligence. I focus on identifying and addressing bottlenecks across the full pipeline: pretraining, post-training, application, and inference acceleration.
If you are interested in my research or potential collaborations, please feel free to reach out to me at xzluo@ir.hit.edu.cn.
I am interested in algorithm competitions. During my undergraduate years, I participated in various programming contests and served as the president of the Programming and Algorithms Association and vice president of the Federation of Student Associations.
News
- 2025.12: Our survey A Practical Guide to Code Intelligence is publicly available! Honored to have participated as a core contributor.
- 2025.07: Our Token Recycling is selected as an ACL 2025 Outstanding Paper!
- 2025.06: Our Token Recycling and OpenCoder are selected for oral presentation at ACL 2025! See you in Vienna!
- 2025.05: Our Token Recycling, ChartCoder, and OpenCoder are accepted by ACL 2025! Our ChartEdit is accepted by Findings of ACL 2025, and Tool-MVRL is accepted by KDD 2025! Congratulations to all our collaborators!
- 2024.09: Our MultiPoT and Make Some Noise are accepted by EMNLP 2024! Congratulations to all our collaborators!
- 2024.09: We release Abacus, a 2.7B Code LLM, complete with open weights and detailed training documentation!
Publications
Pretrain
- ACL 2025 Oral OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models, Siming Huang, Tianhao Cheng, Jason Klein Liu, Weidi Xu, JIARAN HAO, Liuyihan Song, Yang Xu, Jian Yang, Jiaheng Liu, Chenchen Zhang, Linzheng Chai, Ruifeng Yuan, Xianzhen Luo, Qiufeng Wang, YuanTao Fan, Qingfu Zhu, Zhaoxiang Zhang, Yang Gao, Jie Fu, Qian Liu, Houyi Li, Ge Zhang, Yuan Qi, Xu Yinghui, Wei Chu, Zili Wang.
- Tech Report Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding, Core Contributor.
- Arxiv 2025 Scaling Laws for Code: A More Data-Hungry Regime, Xianzhen Luo†, Wenzhen Zheng†, Qingfu Zhu, Rongyi Zhang, Houyi Li, Siming Huang, Yuantao Fan, Wanxiang Che.
- Arxiv 2025 Is Compression Really Linear with Code Intelligence?, Xianzhen Luo†, Shijie Xuyang†, Tianhao Cheng, Zheng Chu, Houyi Li, Ziqi Wang, Siming Huang, Qingfu Zhu, Qiufeng Wang, Xiangyu Zhang, Shuigeng Zhou, Wanxiang Che.
Post-Train
- ACL 2025 ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation, Xuanle Zhao†, Xianzhen Luo†, Qi Shi, Chi Chen, Shuo Wang, Zhiyuan Liu, Maosong Sun.
- KDD 2025 Advancing Tool-Augmented Large Language Models via Meta-Verification and Reflection Learning, Zhiyuan Ma, Jiayu Liu, Xianzhen Luo, Zhenya Huang, Qingfu Zhu, Wanxiang Che.
- EMNLP 2024 Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training, Yixuan Wang†, Xianzhen Luo†, Fuxuan Wei, Yijun Liu, Qingfu Zhu, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che.
- Arxiv 2025 Success is in the Details: Evaluate and Enhance Details Sensitivity of Code LLMs through Counterfactuals, Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Mingzheng Xu, Tianhao Cheng, Yixuan Wang, Zheng Chu, Shijie Xuyang, Zhiyuan Ma, YuanTao Fan, Wanxiang Che.
- Arxiv 2025 Automated Snippet-Alignment Data Augmentation for Code Translation, Zhiming Zhang, Qingfu Zhu, Xianzhen Luo, Yixuan Wang, Bohan Li, Wanxiang Che.
- Arxiv 2024 Semi-Instruct: Bridging Natural-Instruct and Self-Instruct for Code Large Language Models, Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Xu Wang, Qing Yang, Dongliang Xu, Wanxiang Che.
Inference
- ACL 2025 Outstanding Paper Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling, Xianzhen Luo, Yixuan Wang, Qingfu Zhu, Zhiming Zhang, Xuanyu Zhang, Qing Yang, Dongliang Xu.
- ACL 2025 (Findings) ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs' Capability via Chart Editing, Xuanle Zhao†, Xuexin Liu†, Yang Haoyue†, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen.
- EMNLP 2024 Python is Not Always the Best Choice: Embracing Multilingual Program of Thoughts, Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Libo Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che.
- Arxiv 2025 How Many Code and Test Cases Are Enough? Evaluating Test Cases Generation from a Binary-Matrix Perspective, Xianzhen Luo†, Jinyang Huang†, Wenzhen Zheng, Qingfu Zhu, Mingzheng Xu, Yiheng Xu, Yuantao Fan, Libo Qin, Wanxiang Che.
- Arxiv 2025 Format-Adapter: Improving Reasoning Capability of LLMs by Adapting Suitable Format, Dingzirui Wang, Xuanliang Zhang, Rongyu Cao, Longxu Dou, Xianzhen Luo, Yingwei Ma, Qingfu Zhu, Wanxiang Che, Binhua Li, Fei Huang, Yongbin Li.
Survey
- LREC-COLING 2024 A Survey on Natural Language Processing for Programming, Qingfu Zhu, Xianzhen Luo, Fang Liu, Cuiyun Gao, Wanxiang Che.
- Arxiv 2025 From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence, Core Contributor.
Others
- ACL 2022 (Findings) Inverse is Better! Fast and Accurate Prompt for Few-shot Slot Tagging, Yutai Hou, Cheng Chen, Xianzhen Luo, Bohan Li, Wanxiang Che.
- AI Open, 2022 Augmented and challenging datasets with multi-step reasoning and multi-span questions for Chinese judicial reading comprehension, Qingye Meng, Ziyue Wang, Hang Chen, Xianzhen Luo, Baoxin Wang, Zhipeng Chen, Yiming Cui, Dayong Wu, Zhigang Chen, Shijin Wang.
† indicates equal contribution.
Honors and Awards
- 2025.10 Merit Student (三好学生) of Heilongjiang Province.
- 2025.10 (PhD Student) National Scholarship.
- 2025.07 ACL Outstanding Paper.
- 2022.06 Outstanding Graduate.
- 2021.04 International Collegiate Programming Contest Asia-East Continent Final Contest: Bronze Medal.
- 2020.12 National Encouragement Scholarship.
- 2020.12 International Collegiate Programming Contest Asia Shanghai Regional Contest: Silver Medal.
- 2020.11 China Collegiate Programming Contest Mianyang Site: Silver Medal.
- 2020.10 Northeast Collegiate Programming Contest: First Prize.
- 2019.12 (Undergraduate) National Scholarship.
- 2019.12 International Collegiate Programming Contest Asia-East Continent Final Contest: Bronze Medal.
- 2019.11 International Collegiate Programming Contest Asia Shenyang Regional Contest: Silver Medal.
Education
- 2022.09 - Present, Ph.D. student, Harbin Institute of Technology.
- 2018.09 - 2022.07, Undergraduate, Harbin Engineering University.
Invited Talks
- 2025.08, I was invited to give a talk at Alibaba International Consumer Business Unit to share and discuss our paper Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling.
- 2024.03, I was invited to give a talk at Qiyuan Lab about the Training and Application of Code Large Language Models.
Internships
- 2025.08 - Present, KStar Research Intern, Kuaishou Technology, China.
  - Adviser: Jingyuan Zhang
  - Research Focus: Terminal agent development and research.
- 2024.12 - 2025.07, Research Intern, StepFun AI, China.
  - Adviser: Xiangyu Zhang
  - Research Focus: The code aspects of LLM pretraining.
  - Key Contributions: Developed code data cleaning, training, and evaluation pipelines. Provided core code pretraining data for the Step-3 LLM. Implemented several specialized pretraining tasks and strategies on code.
- 2023.11 - 2024.09, University-Industry Collaboration Researcher, Du Xiaoman (Beijing) Science Technology Co., Ltd., China.
- 2022.03 - 2022.08, Research Intern, Joint Laboratory of HIT and iFLYTEK Research (HFL), China.