I am a tenure-track Assistant Professor and PI at Westlake University, where I lead the AGI Lab. Before joining Westlake University, I worked as a scientist at Tencent. I obtained my Ph.D. from the School of Computer Science and Engineering, Nanyang Technological University (NTU), Singapore, under the supervision of Prof. Guosheng Lin. I also collaborate closely with Prof. Chunhua Shen and Prof. Rui Yao. I was recognized among the World’s Top 2% Scientists by Stanford University in 2023 and 2024.
We have numerous positions available for PhD students, postdoctoral researchers, visiting students, and research assistants. Interested candidates are welcome to email me with inquiries. See Recruitment Information (招生信息).
My research primarily revolves around vision and learning. At present, I am focusing on the development of large models to solve AI problems. Recent endeavors include large vision foundation models, multimodal models, and generative AI models.
I like singing and placed in the Top 8 of Good Voice of Universities 2015 at CUMT. I play football regularly. I have been a loyal fan of FC Barcelona, PSG, and Inter Miami for years. My favorite singers are Jacky Cheung (张学友) and Freddie Mercury.
AppAgent: Multimodal Agents as Smartphone Users
Chi Zhang*, Zhao Yang*, Jiaxuan Liu*, Yuchen Han, Xin Chen, Zebiao Huang, Bin Fu, Gang Yu
Arxiv Preprint 2023.
[Project Page][PDF][Code]
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang*
Arxiv Preprint 2024.
[Project Page][PDF][Code]
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts
Yucheng Han, Rui Wang, Chi Zhang*, Juntao Hu, Pei Cheng, Bin Fu, Hanwang Zhang
Arxiv Preprint 2024.
[Project Page][PDF][Code]
GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting
Yiwen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xiaofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, Guosheng Lin
IEEE Conference on Computer Vision and Pattern Recognition CVPR 2024.
[Project Page][PDF][Code]
Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image
Wei Yin*, Chi Zhang*, Hao Chen, Zhipeng Cai, Gang Yu, Kaixuan Wang, Xiaozhi Chen, Chunhua Shen
IEEE International Conference on Computer Vision ICCV 2023.
[PDF][Code]
MeshAnything V2: Artist-Created Mesh Generation with Adjacent Mesh Tokenization
Yiwen Chen, Yikai Wang*, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang*, Guosheng Lin*
Arxiv Preprint 2024.
[Project Page][PDF][Code]
StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data
Yanda Li*, Chi Zhang*, Gang Yu, Zhibin Wang, Bin Fu, Guosheng Lin, Chunhua Shen, Ling Chen, Yunchao Wei
Findings of the Association for Computational Linguistics ACL 2024.
[Project Page][PDF][Code]
M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts
Mingsheng Li, Xin Chen, Chi Zhang, Sijin Chen, Hongyuan Zhu, Fukun Yin, Gang Yu, Tao Chen
The European Conference on Computer Vision ECCV 2024.
[Project Page][PDF][Code]
MotionChain: Conversational Motion Controllers via Multimodal Prompts
Biao Jiang, Xin Chen, Chi Zhang, Fukun Yin, Zhuoyuan Li, Gang Yu, Jiayuan Fan
The European Conference on Computer Vision ECCV 2024.
[Project Page][PDF][Code]
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen
IEEE Conference on Computer Vision and Pattern Recognition CVPR 2024.
[Project Page][PDF][Code]
IT3D: Improved Text-to-3D Generation with Explicit View Synthesis
Yiwen Chen, Chi Zhang*, Xiaofeng Yang, Zhongang Cai, Gang Yu, Lei Yang, Guosheng Lin
AAAI Conference on Artificial Intelligence AAAI 2024.
[PDF][Code]
ChartLlama: A Multimodal LLM for Chart Understanding and Generation
Yucheng Han*, Chi Zhang*, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang
Arxiv Preprint 2023.
[Project Page][PDF][Code]
StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
Chi Zhang, Yiwen Chen, Yijun Fu, Zhenglin Zhou, Gang Yu, Billzb Wang, Bin Fu, Tao Chen, Guosheng Lin, Chunhua Shen
Arxiv Preprint 2023.
[PDF][Code]
FaceStudio: Put Your Face Everywhere in Seconds
Yuxuan Yan*, Chi Zhang*, Rui Wang, Yichao Zhou, Gege Zhang, Pei Cheng, Bin Fu, Gang Yu
Arxiv Preprint 2023.
[Project Page][PDF][Code]
Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering
Chi Zhang, Wei Yin, Gang Yu, Zhibin Wang, Tao Chen, Bin Fu, Joey Tianyi Zhou, Chunhua Shen
IEEE International Conference on Computer Vision ICCV 2023.
[PDF]
ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model
Fukun Yin, Xin Chen, Chi Zhang, Biao Jiang, Zibo Zhao, Jiayuan Fan, Gang Yu, Taihao Li, Tao Chen
Arxiv Preprint 2023.
[Project Page][PDF][Code]