
GoLDFormer: A global–local deformable window transformer for efficient image restoration.

Authors :
Chen, Quan
Zheng, Bolun
Yan, Chenggang
Zhu, Zunjie
Wang, Tingyu
Slabaugh, Gregory
Yuan, Shanxin
Source :
Journal of Visual Communication & Image Representation. Apr2024, Vol. 100, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

Thanks to the powerful modeling capabilities of multi-head self-attention (MSA), transformers have shown significant performance gains in vision tasks. However, as transformers require heavy computation, more efficient designs are required. In this paper, we present an efficient transformer architecture named GoLDFormer for image restoration. GoLDFormer extends the capability of window-based self-attention through two core designs. First, we propose a globally-enhanced window-based transformer block (G-WTB), which applies transposed attention to a compressed window representation rather than the spatial features, thus establishing connections between all windows with less computational complexity. Second, since the interactions between image content and window attention weights can be interpreted as spatially varying convolution, we introduce an adaptive filter structure into transformer models and propose a deformable filtering block (DFB) to enable cross-window connections. By adjusting the shape of the generated filters in the DFB, we can balance the computational costs and the degree of adjacent window interaction. Extensive experiments on several image restoration tasks demonstrate that GoLDFormer achieves competitive results against recent methods but with optimal computational costs.

• We propose GoLDFormer, including two blocks, to enhance W-MSA globally and locally.
• Globally-enhanced attention block captures global dependencies between all windows.
• Deformable filtering block captures local dependencies by modifying filter size.
• Our GoLDFormer achieves competitive results against recent lightweight methods.

[ABSTRACT FROM AUTHOR]
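The complexity argument in the abstract can be made concrete with a rough count of multiply-accumulate operations (MACs) for the two attention matrix products. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the cost model ignores projections and softmax, and it assumes each window is compressed to a single descriptor, which the abstract does not specify.

```python
def global_attention_macs(num_tokens: int, channels: int) -> int:
    """MACs for Q @ K^T and A @ V when every token attends to every token."""
    return 2 * num_tokens * num_tokens * channels


def window_attention_macs(num_tokens: int, channels: int, window: int) -> int:
    """Same two products, but restricted to non-overlapping window x window tiles."""
    tokens_per_window = window * window
    num_windows = num_tokens // tokens_per_window
    return num_windows * global_attention_macs(tokens_per_window, channels)


def transposed_attention_macs(num_tokens: int, channels: int) -> int:
    """Transposed (channel-wise) attention: the attention map is channels x channels."""
    return 2 * num_tokens * channels * channels


# Hypothetical setting: a 256 x 256 feature map, 48 channels, 8 x 8 windows.
height, width, channels, window = 256, 256, 48, 8
num_tokens = height * width
num_windows = num_tokens // (window * window)

full = global_attention_macs(num_tokens, channels)
windowed = window_attention_macs(num_tokens, channels, window)
# ASSUMPTION: one compressed descriptor per window, so the transposed
# attention runs over num_windows tokens instead of num_tokens.
compressed = transposed_attention_macs(num_windows, channels)

print(f"global spatial attention : {full:,} MACs")
print(f"window attention         : {windowed:,} MACs")
print(f"transposed on compressed : {compressed:,} MACs")
```

Under these assumptions the cross-window step costs far less than either spatial variant, because its attention map scales with the channel count rather than with the number of pixels.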

Details

Language :
English
ISSN :
1047-3203
Volume :
100
Database :
Academic Search Index
Journal :
Journal of Visual Communication & Image Representation
Publication Type :
Academic Journal
Accession number :
176784547
Full Text :
https://doi.org/10.1016/j.jvcir.2024.104117