Research
I do research on post-training of (multimodal) large language models, including reasoning, efficiency, and safety.
LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models
Dachuan Shi, Yonggan Fu, Xiangchi Yuan, Zhongzhi Yu, Haoran You, Sixu Li, Xin Dong, Jan Kautz, Pavlo Molchanov, Yingyan Celine Lin
ICML, 2025
[Paper] [ArXiv] [Code]
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
Dachuan Shi, Chaofan Tao, Anyi Rao, Zhendong Yang, Chun Yuan, Jiaqi Wang
ICML, 2024
[Paper] [ArXiv] [Code]
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
Dachuan Shi, Chaofan Tao, Ying Jin, Zhendong Yang, Chun Yuan, Jiaqi Wang
ICML, 2023
[Paper] [ArXiv] [Code] [Website]
Experience
- Research Intern @ Microsoft, Redmond, WA, USA, 2025
- Research Intern @ Shanghai AI Lab, Shanghai, China, 2022-24
Services
- TA @ CS4476 Computer Vision, Georgia Tech, Spring 2025
- Reviewer @ ICML 2024-25, NeurIPS 2023-24, ICLR 2024-25, CVPR 2024-25, ICCV 2025, ECCV 2024, and ACL ARR 2025