About Me

I am a tenure-track Assistant Professor and PI at Westlake University, where I lead the AGI Lab. Before joining Westlake University, I worked as a scientist at Tencent.

I obtained my Ph.D. from the School of Computer Science and Engineering, Nanyang Technological University (NTU), Singapore, under the supervision of Prof. Guosheng Lin. I also collaborate closely with Prof. Chunhua Shen and Prof. Rui Yao. I was named among the World's Top 2% Scientists by Stanford University in 2023 and 2024.

[2025] 📢 We're hiring! Multiple positions for PhD students, postdocs, visiting students, and research assistants are available! See admissions information →

Research Interests

My current research focuses on Generative AI, including theoretical foundations of generative models, multimodal generative modeling, and multimodal intelligent agents. In the past, my work has also spanned broader areas in machine learning and computer vision.

News

Dec 2025
AdaSDE is accepted by NeurIPS 2025.
Nov 2025
Serving as Area Chair for ICML 2026.
Sep 2025
Presenting a tutorial on reasoning in GUI agents at ICCV 2025.
Sep 2025
🚀 We present WorldForge, a training-free world model built on video diffusion.
Sep 2025
Serving as Area Chair for CVPR 2026.
Aug 2025
Serving as Area Chair for ICLR 2026.
Jul 2025
🚀 We’ve launched Ultra3D, enabling efficient high-resolution 3D generation.
Jul 2025
🎉 3 papers are accepted by ICCV 2025. See you guys in Hawaii!
Jun 2025
🎉 5 papers are accepted by CVPR 2025.
Jun 2025
🎬 We present FlowDirector, a training-free video editing model with SOTA performance.
Mar 2025
Serving as Area Chair for ACM MM 2025 and IJCNN 2025.
Mar 2025
🔥🔥🔥 AppAgentX is released! The next generation of AppAgent supports self-evolution.
Feb 2025
🔥 We present Distill Any Depth, setting new SOTA for monocular depth estimation.
Feb 2025
🎉 MeshAnything is accepted by ICLR 2025.
Dec 2024
Excited to join the Editorial Board of IEEE T-CSVT as an Associate Editor!
Nov 2024
🔥 We present StyleStudio, setting new SOTA for text-driven image style transfer tasks.
Jul 2024
🎉 I joined Westlake University as an Assistant Professor (PI) and established the AGI Lab.

Academic Service

Hobbies

I like singing and playing football. I am a loyal fan of FC Barcelona, PSG, and Inter Miami.

My favorite singers are Jacky Cheung and Freddie Mercury.

Selected Projects

AppAgent

AppAgent: Multimodal Agents as Smartphone Users

Chi Zhang, Zhao Yang, Jiaxuan Liu, Yucheng Han, Xin Chen, Zebiao Huang, Bin Fu, Gang Yu
CHI 2025
MeshAnything

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang
ICLR 2025
MeshAnything V2

MeshAnything V2: Artist-Created Mesh Generation with Adjacent Mesh Tokenization

Yiwen Chen, Yikai Wang*, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang*, Guosheng Lin*
ICCV 2025
GaussianEditor

GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting

Yiwen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xiaofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, Guosheng Lin
CVPR 2024
Metric3D

Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image

Wei Yin*, Chi Zhang*, Hao Chen, Zhipeng Cai, Gang Yu, Kaixuan Wang, Xiaozhi Chen, Chunhua Shen
ICCV 2023
StyleStudio

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

Mingkun Lei, Xue Song, Beier Zhu, Hao Wang, Chi Zhang
CVPR 2025
AdaSDE

Adaptive Stochastic Coefficients for Accelerating Diffusion Sampling

Ruoyu Wang, Beier Zhu, Junzhi Li, Liangyu Yuan, Chi Zhang
NeurIPS 2025
EPD

Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models

Beier Zhu, Ruoyu Wang, Tong Zhao, Hanwang Zhang, Chi Zhang
ICCV 2025
MotionAgent

MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent

Xinyao Liao, Xianfang Zeng, Liao Wang, Gang Yu, Guosheng Lin, Chi Zhang
ICCV 2025
Learning to Be A Doctor

Learning to Be A Doctor: Searching for Effective Medical Agent Architectures

Yangyang Zhuang, Wenjia Jiang, Jiayu Zhang, Ze Yang, Joey Tianyi Zhou, Chi Zhang
ACM MM 2025
AppAgentX

AppAgentX: Evolving GUI Agents as Proficient Smartphone Users

Wenjia Jiang, Yangyang Zhuang, Chenxi Song, Xu Yang, Joey Tianyi Zhou, Chi Zhang
arXiv 2025
Ultra3D

Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention

Yiwen Chen, Zhihao Li, Yikai Wang, Hu Zhang, Qin Li, Chi Zhang, Guosheng Lin
arXiv 2025
WorldForge

WorldForge: Unlocking Emergent 3D/4D Generation in Video Diffusion Model via Training-Free Guidance

Chenxi Song, Yanming Yang, Tong Zhao, Ruibo Li, Chi Zhang
arXiv 2025
FlowDirector

FlowDirector: Training-Free Flow Steering for Precise Text-to-Video Editing

Guangzhao Li, Yanming Yang, Chenxi Song, Chi Zhang
arXiv 2025
StableLLaVA

StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data

Yanda Li*, Chi Zhang*, Gang Yu, Zhibin Wang, Bin Fu, Guosheng Lin, Chunhua Shen, Ling Chen, Yunchao Wei
ACL 2024
M3DBench

M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts

Mingsheng Li, Xin Chen, Chi Zhang, Sijin Chen, Hongyuan Zhu, Fukun Yin, Gang Yu, Tao Chen
ECCV 2024
MotionChain

MotionChain: Conversational Motion Controllers via Multimodal Prompts

Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan
ECCV 2024
LL3DA

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen
CVPR 2024
IT3D

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis

Yiwen Chen, Chi Zhang*, Xiaofeng Yang, Zhongang Cai, Gang Yu, Lei Yang, Guosheng Lin
AAAI 2024
ChartLlama

ChartLlama: A Multimodal LLM for Chart Understanding and Generation

Yucheng Han*, Chi Zhang*, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang
arXiv 2023
EMMA

EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Yucheng Han, Rui Wang, Chi Zhang*, Juntao Hu, Pei Cheng, Bin Fu, Hanwang Zhang
arXiv 2024
StyleAvatar3D

StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation

Chi Zhang, Yiwen Chen, Yijun Fu, Zhenglin Zhou, Gang Yu, Billzb Wang, Bin Fu, Tao Chen, Guosheng Lin, Chunhua Shen
arXiv 2023
FaceStudio

FaceStudio: Put Your Face Everywhere in Seconds

Yuxuan Yan*, Chi Zhang*, Rui Wang, Yichao Zhou, Gege Zhang, Pei Cheng, Bin Fu, Gang Yu
arXiv 2023
Robust Depth

Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering

Chi Zhang, Wei Yin, Gang Yu, Zhibin Wang, Tao Chen, Bin Fu, Joey Tianyi Zhou, Chunhua Shen
ICCV 2023

Lab Gallery