Research

I have been working on LLM/MLLM reasoning and inference, with a focus on efficiency and alignment.
SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs
Dachuan Shi, Abedelkadir Asi, Keying Li, Xiangchi Yuan, Leyan Pan, Wenke Lee, Wen Xiao
Preprint, 2025
[Paper] [Code] [Website]
LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models
Dachuan Shi, Yonggan Fu, Xiangchi Yuan, Zhongzhi Yu, Haoran You, Sixu Li, Xin Dong, Jan Kautz, Pavlo Molchanov, Yingyan Celine Lin
ICML, 2025
[Paper] [Code]
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
Dachuan Shi, Chaofan Tao, Anyi Rao, Zhendong Yang, Chun Yuan, Jiaqi Wang
ICML, 2024
[Paper] [Code]
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
Dachuan Shi, Chaofan Tao, Ying Jin, Zhendong Yang, Chun Yuan, Jiaqi Wang
ICML, 2023
[Paper] [Code] [Website]
Mitigating Forgetting Between Supervised and Reinforcement Learning Yields Stronger Reasoners
Xiangchi Yuan, Xiang Chen, Tong Yu, Dachuan Shi, Can Jin, Wenke Lee, Saayan Mitra
Preprint, 2025
[Paper]
Superficial Self-Improved Reasoners Benefit from Model Merging
Xiangchi Yuan, Chunhui Zhang, Zheyuan Liu, Dachuan Shi, Leyan Pan, Soroush Vosoughi, Wenke Lee
EMNLP, 2025
[Paper] [Code]
Supervised Fine-tuning in turn Improves Visual Foundation Models
Xiaohu Jiang, Yixiao Ge, Yuying Ge, Dachuan Shi, Chun Yuan, Ying Shan
Preprint, 2024
[Paper] [Code]
Masked Generative Distillation
Zhendong Yang, Zhe Li, Mingqi Shao, Dachuan Shi, Zehuan Yuan, Chun Yuan
ECCV, 2022
[Paper] [Code]
 
        
Experience

Research Intern @ Microsoft, Redmond, WA, USA, 2025
Research Intern @ Shanghai AI Lab, Shanghai, CN, 2022–24
        
Honors

Tsinghua Outstanding Master's Thesis, 2024
Tsinghua Outstanding Bachelor's Thesis, 2021
        
Services

TA @ CS4476 Computer Vision, Georgia Tech, Spring 2025
Reviewer @ ICML 2024–2025, NeurIPS 2023–2025, ICLR 2024–2025, CVPR 2024–2025, ICCV 2025, ECCV 2024, ACL ARR 2025, and TPAMI