Using Mixture of Experts to accelerate dataset distillation.
- Source :
- Journal of Visual Communication & Image Representation, Apr 2024, Vol. 100.
- Publication Year :
- 2024
Abstract
- Recently, large datasets have become necessary for most deep learning tasks; however, they also bring problems such as disk storage costs and huge computational expense. Dataset distillation is an emerging field that aims to synthesize a small dataset from the original dataset, such that a randomly initialized model trained on the distilled dataset achieves performance comparable to that of the same architecture trained on the original dataset. Matching Training Trajectories (MTT) achieves leading performance in this field, but it must pre-train 200 expert models before the formal distillation begins; this pre-training stage is called the buffer process. In this paper, we propose a new method to reduce the time consumed by the buffer process. Concretely, we use Mixture of Experts (MoE) to train several expert models in parallel during the buffer process. Experiments show that our method achieves a speedup of approximately 4∼8× in the buffer process while obtaining comparable distillation performance.
- • Apply Mixture of Experts to accelerate dataset distillation.
- • Our method is widely effective for trajectory-matching distillation methods.
- • Experiments show that our method achieves comparable performance. [ABSTRACT FROM AUTHOR]
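The abstract describes a "buffer process" in which many expert models are pre-trained and their parameter trajectories stored, and proposes parallelizing that stage. The sketch below is not the paper's implementation; it only illustrates the general idea with a hypothetical toy objective and thread-based concurrency standing in for real parallel model training. All function names (`train_expert`, `buffer_process`) are assumptions for illustration.

```python
# Hedged sketch: a toy "buffer process" that trains several expert models
# concurrently and collects each one's parameter trajectory, the artifact
# that trajectory-matching methods like MTT later match against.
# Real implementations would train neural networks, typically across GPUs.
from concurrent.futures import ThreadPoolExecutor


def train_expert(seed, steps=50, lr=0.1):
    """Toy 'expert': gradient descent on f(w) = (w - seed)^2.

    Returns the full parameter trajectory, since trajectory-matching
    distillation stores intermediate checkpoints, not just the final model.
    """
    w = 0.0
    trajectory = [w]
    for _ in range(steps):
        grad = 2.0 * (w - seed)  # d/dw of (w - seed)^2
        w -= lr * grad
        trajectory.append(w)
    return trajectory


def buffer_process(num_experts=8, workers=4):
    """Train `num_experts` experts concurrently; return their trajectories."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_expert, range(num_experts)))


if __name__ == "__main__":
    buffers = buffer_process()
    print(f"{len(buffers)} trajectories of {len(buffers[0])} checkpoints each")
```

With `workers` experts training at once, wall-clock time for the buffer stage drops roughly by the degree of parallelism, which is the effect the paper quantifies as an approximate 4∼8× speedup.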
Details
- Language :
- English
- ISSN :
- 10473203
- Volume :
- 100
- Database :
- Academic Search Index
- Journal :
- Journal of Visual Communication & Image Representation
- Publication Type :
- Academic Journal
- Accession number :
- 176784567
- Full Text :
- https://doi.org/10.1016/j.jvcir.2024.104137