1. Closed-Loop Transcription via Convolutional Sparse Coding
- Author
Dai, Xili, Chen, Ke, Tong, Shengbang, Zhang, Jingyuan, Gao, Xingjian, Li, Mingyang, Pai, Druv, Zhai, Yuexiang, Yuan, Xiaojun, Shum, Heung Yeung, Ni, Lionel Ming-shuan, and Ma, Yi
- Abstract
Autoencoders excel at learning generative models for natural images, but often lack structure and interpretability due to their use of generic deep networks. In this work, we make the explicit assumption that the image distribution is generated from a multi-stage sparse deconvolution. The corresponding inverse map, which we use as an encoder, is a multi-stage convolutional sparse coding (CSC) network, with each stage obtained by unrolling an optimization algorithm for solving the corresponding (convexified) sparse coding program. Instead of directly minimizing the distributional gap between actual and generated images, we employ the closed-loop transcription (CTRL) framework to enhance the efficiency of the sparse representations. Our approach achieves comparable results on datasets such as ImageNet-1K while using simpler networks and less computational power. Our method enjoys several side benefits, including more structured and interpretable representations, more stable convergence, and scalability to large datasets. Our method is arguably the first to demonstrate that a concatenation of multiple convolutional sparse coding/decoding layers leads to an interpretable and effective autoencoder for modeling the distribution of large-scale natural image datasets. © 2024 Proceedings of Machine Learning Research
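The abstract describes encoder stages obtained by unrolling an optimization algorithm for a sparse coding program. As a minimal illustrative sketch only (not the paper's actual multi-stage architecture), the snippet below unrolls ISTA iterations for a single-dictionary 1-D convolutional sparse coding problem; the function name `csc_ista`, the step size, and the toy kernel are assumptions introduced for illustration:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the l1 norm (elementwise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def csc_ista(x, d, lam=0.05, step=0.1, n_iter=200):
    """Unrolled ISTA for 1-D convolutional sparse coding:
    minimize 0.5 * ||x - d * z||^2 + lam * ||z||_1 over the code z,
    where * is convolution. Each loop iteration corresponds to one
    'layer' of an unrolled CSC encoder."""
    z = np.zeros_like(x)
    for _ in range(n_iter):
        residual = np.convolve(z, d, mode="same") - x
        # Adjoint of convolution is correlation, i.e. convolution
        # with the reversed kernel (odd-length kernel, 'same' mode).
        grad = np.convolve(residual, d[::-1], mode="same")
        z = soft_threshold(z - step * grad, step * lam)
    return z
```

Unrolling fixes the number of iterations and treats each one as a network layer whose filters can then be learned end to end; stacking several such stages yields the multi-stage encoder referred to in the abstract.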
- Published
- 2024