Dachuan Shi

I am currently a master's student at Tsinghua University, where I am advised by Prof. Chun Yuan. Previously, I received my B.Eng. in Computer Science and Technology, also from Tsinghua University, where I was advised by Prof. Linmi Tao. I also work closely with Dr. Jiaqi Wang at Shanghai AI Laboratory.

My research focuses on the combination of efficient deep learning and foundation models (multimodality, vision, and LLMs). Over the past year, I have been working on compression and acceleration frameworks for efficient vision-language and unimodal Transformers. Prior to that, I built efficient and lightweight structures for medical image analysis.

Contact: sdc21@mails.tsinghua.edu.cn

Homepage  /  Curriculum Vitae  /  Github  /  Google Scholar

Publications and Manuscripts
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers [Efficient Deep Learning, Multimodal Learning]
Dachuan Shi, Chaofan Tao, Anyi Rao, Zhendong Yang, Chun Yuan, Jiaqi Wang
In arXiv, 2023
[Paper] [ArXiv] [Code] [Bibtex]

TL;DR: Proposed a universal token ensemble framework CrossGET for accelerating various vision-language Transformers.

UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers [Efficient Deep Learning, Multimodal Learning]
Dachuan Shi, Chaofan Tao, Ying Jin, Zhendong Yang, Chun Yuan, Jiaqi Wang
In ICML, 2023
[Paper] [ArXiv] [Code] [Website] [Poster] [Blog] [Blog(in Chinese)] [Bibtex]

TL;DR: Proposed a universal structured pruning framework UPop for compressing various vision-language and unimodal Transformers.

Heuristic Dropout: An Efficient Regularization Method for Medical Image Segmentation Models [Efficient Deep Learning, Medical Image Analysis]
Dachuan Shi, Ruiyang Liu, Linmi Tao, Chun Yuan
In ICASSP, 2022
[Paper] [Code] [Poster] [Bibtex]

TL;DR: Proposed Heuristic Dropout to more efficiently drop features suffering from the co-adaptation problem for medical image segmentation tasks.

Multi-Encoder Parse-Decoder Network for Sequential Medical Image Segmentation [Efficient Deep Learning, Medical Image Analysis]
Dachuan Shi, Ruiyang Liu, Linmi Tao, Zuoxiang He, Li Huo
In ICIP, 2021
[Paper] [Code] [Poster] [Bibtex]

TL;DR: Proposed MEPDNet that comprises efficient parameter-shared encoders and a lightweight decoder for sequential medical image segmentation.

Masked Generative Distillation [Efficient Deep Learning, Computer Vision]
Zhendong Yang, Zhe Li, Mingqi Shao, Dachuan Shi, Zehuan Yuan, Chun Yuan
In ECCV, 2022
[Paper] [ArXiv] [Code] [Bibtex]

TL;DR: Proposed a generative distillation method MGD that randomly masks pixels of the student's feature and forces it to generate the teacher's full feature.

PTeacher: a Computer-Aided Personalized Pronunciation Training System with Exaggerated Audio-Visual Corrective Feedback [Human-Computer Interaction, Multimodal Learning]
Yaohua Bu, Tianyi Ma, Weijun Li, Hang Zhou, Jia Jia, Shengqi Chen, Kaiyuan Xu, Dachuan Shi, Haozhe Wu, Zhihan Yang, Kun Li, Zhiyong Wu, Yuanchun Shi, Xiaobo Lu, Ziwei Liu
In CHI, 2021
[Paper] [ArXiv] [Bibtex]

TL;DR: Proposed PTeacher as a pronunciation training system that provides personalized exaggerated audio-visual corrective feedback for mispronunciations.

Awards
  • Tsinghua Comprehensive Excellence Award
  • Huiyan Scholarship
  • Tsinghua Outstanding Bachelor Thesis
  • Tsinghua John Ma Cup Taekwondo Competition, 2nd place
  • Beijing Capital University Taekwondo Elite Tournament, 5th place
  • Chinese University Physics Competition in Selected Regions, 3rd Prize
Academic Services
Reviewer for CVPR 2024, ICLR 2024, NeurIPS 2023, and ECCV 2022.

Modified from Jon Barron's.