
Chi Zhang

AI Scientist
Tencent
dr(dot)zhang(dot)chi@outlook.com

About Me

I am a scientist at Tencent (Shanghai), working with Dr. Gang Yu. I obtained my Ph.D. from the School of Computer Science and Engineering, Nanyang Technological University (NTU), Singapore, where I was supervised by Prof. Guosheng Lin. After graduation, I joined Tencent as a T10 scientist. I also collaborate closely with Prof. Chunhua Shen and Prof. Rui Yao on research.

I welcome discussions from academia and industry, especially regarding technology implementation and real-world impact. Feel free to reach out via email.

Research Interests

My research primarily revolves around vision and learning.

At present, I am focusing on the development of large models to solve AI problems. Recent endeavors include large zero-shot depth estimation models (HDN, Robust Depth, and Metric3D), multimodal large language models (AppAgent, StableLLaVA, ChartLlama, ShapeGPT, LL3DA), and 2D/3D generative models (StyleAvatar3D, IT3D, GaussianEditor, FaceStudio).

Hobbies

I like singing and reached the Top 8 of the Good Voice of Universities 2015 contest at CUMT. I play football regularly. I am a loyal fan of FC Barcelona, PSG, and Inter Miami. My favorite singers are 张学友 and Freddie Mercury.

Recent Projects

AppAgent: Multimodal Agents as Smartphone Users
Chi Zhang*, Zhao Yang*, Jiaxuan Liu*, Yucheng Han, Xin Chen, Zebiao Huang, Bin Fu, Gang Yu
arXiv preprint, 2023.
[Project Page][PDF][Code]


ChartLlama: A Multimodal LLM for Chart Understanding and Generation
Yucheng Han*, Chi Zhang*, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang
arXiv preprint, 2023.
[Project Page][PDF][Code]


StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data
Yanda Li*, Chi Zhang*, Gang Yu, Zhibin Wang, Bin Fu, Guosheng Lin, Chunhua Shen, Ling Chen, Yunchao Wei
arXiv preprint, 2023.
[Project Page][PDF][Code]


StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
Chi Zhang, Yiwen Chen, Yijun Fu, Zhenglin Zhou, Gang Yu, Billzb Wang, Bin Fu, Tao Chen, Guosheng Lin, Chunhua Shen
arXiv preprint, 2023.
[PDF][Code]


FaceStudio: Put Your Face Everywhere in Seconds
Yuxuan Yan*, Chi Zhang*, Rui Wang, Yichao Zhou, Gege Zhang, Pei Cheng, Bin Fu, Gang Yu
arXiv preprint, 2023.
[Project Page][PDF][Code]


IT3D: Improved Text-to-3D Generation with Explicit View Synthesis
Yiwen Chen, Chi Zhang*, Xiaofeng Yang, Zhongang Cai, Gang Yu, Lei Yang, Guosheng Lin
AAAI Conference on Artificial Intelligence (AAAI), 2024.
[PDF][Code]


GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting
Yiwen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xiaofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, Guosheng Lin
arXiv preprint, 2023.
[Project Page][PDF][Code]


Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image
Wei Yin*, Chi Zhang*, Hao Chen, Zhipeng Cai, Gang Yu, Kaixuan Wang, Xiaozhi Chen, Chunhua Shen
IEEE International Conference on Computer Vision (ICCV), 2023.
[PDF][Code]


Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering
Chi Zhang, Wei Yin, Gang Yu, Zhibin Wang, Tao Chen, Bin Fu, Joey Tianyi Zhou, Chunhua Shen
IEEE International Conference on Computer Vision (ICCV), 2023.
[PDF]


ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model
Fukun Yin, Xin Chen, Chi Zhang, Biao Jiang, Zibo Zhao, Jiayuan Fan, Gang Yu, Taihao Li, Tao Chen
arXiv preprint, 2023.
[Project Page][PDF][Code]


LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Sijin Chen, Xin Chen, Chi Zhang, Mingsheng Li, Gang Yu, Hao Fei, Hongyuan Zhu, Jiayuan Fan, Tao Chen
arXiv preprint, 2023.
[Project Page][PDF][Code]





