Publications and Manuscripts
|
|
CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers
[Efficient Deep Learning, Multimodal Learning]
Dachuan Shi, Chaofan Tao, Anyi Rao, Zhendong Yang, Chun Yuan, Jiaqi Wang
In ArXiv, 2023
[Paper] [ArXiv] [Code] [Bibtex]
TL;DR: Proposed CrossGET, a universal token ensemble framework for accelerating various vision-language Transformers.
|
|
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
[Efficient Deep Learning, Multimodal Learning]
Dachuan Shi, Chaofan Tao, Ying Jin, Zhendong Yang, Chun Yuan, Jiaqi Wang
In ICML, 2023
[Paper] [ArXiv] [Code] [Website] [Poster] [Blog] [Blog (in Chinese)] [Bibtex]
TL;DR: Proposed UPop, a universal structured pruning framework for compressing various vision-language and unimodal Transformers.
|
|
Heuristic Dropout: An Efficient Regularization Method for Medical Image Segmentation Models
[Efficient Deep Learning, Medical Image Analysis]
Dachuan Shi, Ruiyang Liu, Linmi Tao, Chun Yuan
In ICASSP, 2022
[Paper] [Code] [Poster] [Bibtex]
TL;DR: Proposed Heuristic Dropout to more efficiently drop features suffering from the co-adaptation problem for medical image segmentation tasks.
|
|
Multi-Encoder Parse-Decoder Network for Sequential Medical Image Segmentation
[Efficient Deep Learning, Medical Image Analysis]
Dachuan Shi, Ruiyang Liu, Linmi Tao, Zuoxiang He, Li Huo
In ICIP, 2021
[Paper] [Code] [Poster] [Bibtex]
TL;DR: Proposed MEPDNet that comprises efficient parameter-shared encoders and a lightweight decoder for sequential medical image segmentation.
|
|
Masked Generative Distillation
[Efficient Deep Learning, Computer Vision]
Zhendong Yang, Zhe Li, Mingqi Shao, Dachuan Shi, Zehuan Yuan, Chun Yuan
In ECCV, 2022
[Paper] [ArXiv] [Code] [Bibtex]
TL;DR: Proposed MGD, a generative distillation method that randomly masks pixels of the student's feature and forces the student to generate the teacher's full feature.
|
|
PTeacher: a Computer-Aided Personalized Pronunciation Training System with Exaggerated Audio-Visual Corrective Feedback
[Human-Computer Interaction, Multimodal Learning]
Yaohua Bu, Tianyi Ma, Weijun Li, Hang Zhou, Jia Jia, Shengqi Chen, Kaiyuan Xu, Dachuan Shi, Haozhe Wu, Zhihan Yang, Kun Li, Zhiyong Wu, Yuanchun Shi, Xiaobo Lu, Ziwei Liu
In CHI, 2021
[Paper] [ArXiv] [Bibtex]
TL;DR: Proposed PTeacher as a pronunciation training system that provides personalized exaggerated audio-visual corrective feedback for mispronunciations.
|
Awards
- Tsinghua Comprehensive Excellence Award
- Huiyan Scholarship
- Tsinghua Outstanding Bachelor Thesis
- Tsinghua John Ma Cup Taekwondo Competition, 2nd place
- Beijing Capital University Taekwondo Elite Tournament, 5th place
- Chinese University Physics Competition in Selected Regions, 3rd Prize
|
Academic Services
|
Reviewer for CVPR 2024, ICLR 2024, NeurIPS 2023, and ECCV 2022.