About Me

I am a tenure-track Assistant Professor and PI at Westlake University, where I lead the AGI Lab. Before joining Westlake University, I worked as a scientist at Tencent.

I obtained my Ph.D. from the School of Computer Science and Engineering, Nanyang Technological University (NTU), Singapore, where I worked under the supervision of Prof. Guosheng Lin. I also work closely with Prof. Chunhua Shen and Prof. Rui Yao on research. I was recognized among the World's Top 2% Scientists by Stanford University in 2023 and 2024.

[2025] 📢 We're hiring! Multiple positions for PhD students, postdocs, visiting students, and research assistants are available! See Admissions Information →

To learn more about our lab, feel free to contact me or any member of our team. 😊

Research Interests

My current research focuses on Generative AI, including theoretical foundations of generative models, multimodal generative modeling, and multimodal intelligent agents. In the past, my work has also spanned broader areas in machine learning and computer vision.

News

May 2026
🎉 EPD-Solver V2 is accepted by T-PAMI.
May 2026
🎉 One paper is accepted by ICML 2026.
Apr 2026
🎉 3 papers (1 main + 2 findings) are accepted by ACL 2026.
Feb 2026
🎉🎉🎉 12 papers (10 main + 2 findings) are accepted by CVPR 2026. 7 of them are from undergraduate students. Congratulations to the students!!
Jan 2026
🎉 PMI is accepted by ICLR 2026.
Dec 2025
EPD-Solver V2 is released.
Dec 2025
🎉 AdaSDE is accepted by NeurIPS 2025.
Nov 2025
Serving as Area Chair for ICML 2026.
Sep 2025
Presenting a tutorial on reasoning in GUI agents at ICCV 2025.
Sep 2025
🚀 We present WorldForge, a training-free world model built on video diffusion.
Sep 2025
Serving as Area Chair for CVPR 2026.
Aug 2025
Serving as Area Chair for ICLR 2026.
Jul 2025
🚀 We’ve launched Ultra3D, enabling efficient high-resolution 3D generation.
Jul 2025
🎉 3 papers are accepted by ICCV 2025. See you guys in Hawaii!
Jun 2025
🎉 5 papers are accepted by CVPR 2025.
Jun 2025
🎬 We present FlowDirector, a training-free video editing model with SOTA performance.
Mar 2025
Serving as Area Chair for ACM MM 2025 and IJCNN 2025.
Mar 2025
🔥🔥🔥 AppAgentX is released! The next generation of AppAgent supports self-evolution.
Feb 2025
🔥 We present Distill Any Depth, setting new SOTA for monocular depth estimation.
Feb 2025
🎉 MeshAnything is accepted by ICLR 2025.
Dec 2024
Excited to join the Editorial Board of IEEE T-CSVT as an Associate Editor!
Nov 2024
🔥 We present StyleStudio, setting new SOTA for text-driven image style transfer tasks.
Jul 2024
🎉 I joined Westlake University as an Assistant Professor (PI) and established the AGI Lab.

Academic Service

Hobbies

I like singing and playing football. I am a loyal fan of FC Barcelona PSG Inter Miami.

My favorite singers are Jacky Cheung and Freddie Mercury.

Selected Projects

AppAgent

AppAgent: Multimodal Agents as Smartphone Users

Chi Zhang, Zhao Yang, Jiaxuan Liu, Yuchen Han, Xin Chen, Zebiao Huang, Bin Fu, Gang Yu
CHI 2025
MeshAnything

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang
ICLR 2025
MeshAnything V2

MeshAnything V2: Artist-Created Mesh Generation with Adjacent Mesh Tokenization

Yiwen Chen, Yikai Wang*, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang*, Guosheng Lin*
ICCV 2025
GaussianEditor

GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting

Yiwen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xiaofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, Guosheng Lin
CVPR 2024
Metric3D

Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image

Wei Yin*, Chi Zhang*, Hao Chen, Zhipeng Cai, Gang Yu, Kaixuan Wang, Xiaozhi Chen, Chunhua Shen
ICCV 2023
EPD-Solver

Parallel Diffusion Solver via Residual Dirichlet Policy Optimization

Ruoyu Wang, Ziyu Li, Beier Zhu, Liangyu Yuan, Hanwang Zhang, Xun Yang, Xiaojun Chang, Chi Zhang
IEEE TPAMI 2026
StyleAvatar3D

StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation

Chi Zhang, Yiwen Chen, Yijun Fu, Zhenglin Zhou, Gang Yu, Billzb Wang, Bin Fu, Tao Chen, Guosheng Lin, Chunhua Shen
IEEE JSTSP 2026
SwitchCraft

SwitchCraft: Training-Free Multi-Event Video Generation with Attention Controls

Qianxun Xu, Chenxi Song, Yujun Cai, Chi Zhang
CVPR 2026
CRAFT-LoRA

CRAFT-LoRA: Content-Style Personalization via Rank-Constrained Adaptation and Training-Free Fusion

Yu Li, Yujun Cai, Chi Zhang
CVPR 2026
Free Lunch for Stabilizing Rectified Flow Inversion

Free Lunch for Stabilizing Rectified Flow Inversion

Chenru Wang, Beier Zhu, Chi Zhang
ICLR 2026
FreeLOC

Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction

Jiahao Tian, Chenxi Song, Wei Cheng, Chi Zhang
CVPR 2026
Instance-Aware Discretizations

Few-Step Diffusion Sampling Through Instance-Aware Discretizations

Liangyu Yuan, Ruoyu Wang, Tong Zhao, Dingwen Fu, Mingkun Lei, Beier Zhu, Chi Zhang
CVPR 2026
Weak-to-Strong Segmented Guidance

Improving Diffusion Generalization with Weak-to-Strong Segmented Guidance

Liangyu Yuan, Yufei Huang, Mingkun Lei, Tong Zhao, Ruoyu Wang, Changxi Chi, Yiwei Wang, Chi Zhang
CVPR 2026
SQLAgent

SQLAgent: Learning to Explore Before Generating as a Data Engineer

Wenjia Jiang, Yiwei Wang, Boyan Han, Joey Tianyi Zhou, Chi Zhang
Findings of ACL 2026
Fast3Dcache

Fast3Dcache: Training-Free 3D Geometry Synthesis Acceleration

Mengyu Yang, Yanming Yang, Chenyi Xu, Chenxi Song, Yufan Zuo, Tong Zhao, Ruibo Li, Chi Zhang
CVPR 2026
Auto-Slides

Auto-Slides: An Interactive Multi-Agent System for Creating and Customizing Research Presentations

Yuheng Yang, Wenjia Jiang, Yang Wang, Yi Song, Yiwei Wang, Chi Zhang
ICME 2026
ViStoryBench

ViStoryBench: Comprehensive Benchmark Suite for Story Visualization

Cailin Zhuang, Ailin Huang, Yaoqi Hu, Jingwei Wu, Wei Cheng, Jiaqi Liao, Hongyuan Wang, Xinyao Liao, Weiwei Cai, Hengyuan Xu, Xuanyang Zhang, Xianfang Zeng, Zhewei Huang, Gang Yu, Chi Zhang
CVPR 2026
Distill Any Depth

Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator

Xiankang He, Dongyan Guo, Hongji Li, Ruibo Li, Ying Cui, Chi Zhang
CVPR 2026 Findings
WorldForge

Taming Video Models for 3D and 4D Generation via Zero-Shot Camera Control

Chenxi Song, Yanming Yang, Tong Zhao, Ruibo Li, Chi Zhang
CVPR 2026
FlowDirector

FlowDirector: Training-Free Flow Steering for Precise Text-to-Video Editing

Guangzhao Li, Yanming Yang, Chenxi Song, Chi Zhang
CVPR 2026
AdaSDE

Adaptive Stochastic Coefficients for Accelerating Diffusion Sampling

Ruoyu Wang, Beier Zhu, Junzhi Li, Liangyu Yuan, Chi Zhang
NeurIPS 2025
EPD

Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models

Beier Zhu, Ruoyu Wang, Tong Zhao, Hanwang Zhang, Chi Zhang
ICCV 2025
MotionAgent

MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent

Xinyao Liao, Xianfang Zeng, Liao Wang, Gang Yu, Guosheng Lin, Chi Zhang
ICCV 2025
Learning to Be A Doctor

Learning to Be A Doctor: Searching for Effective Medical Agent Architectures

Yangyang Zhuang, Wenjia Jiang, Jiayu Zhang, Ze Yang, Joey Tianyi Zhou, Chi Zhang
ACM MM 2025
Video-Bench

Video-Bench: Human-Aligned Video Generation Benchmark

Hui Han, Siyuan Li, Jiaqi Chen, Yiwen Yuan, Yuling Wu, Chak Tou Leong, Hanwen Du, Junchen Fu, Youhua Li, Jie Zhang, Chi Zhang, Li-jia Li, Yongxin Ni
CVPR 2025
Project-Probe-Aggregate

Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness

Beier Zhu, Jiequan Cui, Hanwang Zhang, Chi Zhang
CVPR 2025 Highlight
ShapeGPT

ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model

Fukun Yin, Xin Chen, Chi Zhang, Biao Jiang, Zibo Zhao, Jiayuan Fan, Gang Yu, Taihao Li, Tao Chen
IEEE Transactions on Multimedia 2025
DreamFrame

DreamFrame: Enhancing Video Understanding via Automatically Generated QA and Style-Consistent Keyframes

Zhende Song, Chenchen Wang, Jiamu Sheng, Chi Zhang, Shengji Tang, Jiayuan Fan, Tao Chen
ACM MM 2025
CADCrafter

CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images

Cheng Chen, Jiacheng Wei, Tianrun Chen, Chi Zhang, Xiaofeng Yang, Shangzhan Zhang, Bingchen Yang, Chuan-Sheng Foo, Guosheng Lin, Qixing Huang, Fayao Liu
CVPR 2025
MVPaint

MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D

Wei Cheng, Juncheng Mu, Xianfang Zeng, Xin Chen, Anqi Pang, Chi Zhang, Zhibin Wang, Bin Fu, Gang Yu, Ziwei Liu, Liang Pan
CVPR 2025
StyleStudio

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

Mingkun Lei, Xue Song, Beier Zhu, Hao Wang, Chi Zhang
CVPR 2025
AppAgentX

AppAgentX: Evolving GUI Agents as Proficient Smartphone Users

Wenjia Jiang, Yangyang Zhuang, Chenxi Song, Xu Yang, Joey Tianyi Zhou, Chi Zhang
arXiv 2025
Ultra3D

Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention

Yiwen Chen, Zhihao Li, Yikai Wang, Hu Zhang, Qin Li, Chi Zhang, Guosheng Lin
arXiv 2025
StableLLaVA

StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data

Yanda Li*, Chi Zhang*, Gang Yu, Zhibin Wang, Bin Fu, Guosheng Lin, Chunhua Shen, Ling Chen, Yunchao Wei
ACL 2024
M3DBench

M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts

Mingsheng Li, Xin Chen, Chi Zhang, Sijin Chen, Hongyuan Zhu, Fukun Yin, Gang Yu, Tao Chen
ECCV 2024
MotionChain

MotionChain: Conversational Motion Controllers via Multimodal Prompts

Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan
ECCV 2024
LL3DA

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen
CVPR 2024
IT3D

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis

Yiwen Chen, Chi Zhang*, Xiaofeng Yang, Zhongang Cai, Gang Yu, Lei Yang, Guosheng Lin
AAAI 2024
EMMA

EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Yucheng Han, Rui Wang, Chi Zhang*, Juntao Hu, Pei Cheng, Bin Fu, Hanwang Zhang
arXiv 2024
ChartLlama

ChartLlama: A Multimodal LLM for Chart Understanding and Generation

Yucheng Han*, Chi Zhang*, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang
arXiv 2023
FaceStudio

FaceStudio: Put Your Face Everywhere in Seconds

Yuxuan Yan*, Chi Zhang*, Rui Wang, Yichao Zhou, Gege Zhang, Pei Cheng, Bin Fu, Gang Yu
arXiv 2023
Robust Depth

Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering

Chi Zhang, Wei Yin, Gang Yu, Zhibin Wang, Tao Chen, Bin Fu, Joey Tianyi Zhou, Chunhua Shen
ICCV 2023

Lab Gallery