Align-then-abstract representation learning for low-resource summarization.
- Source :
- Neurocomputing. Sep 2023, Vol. 548, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2023
Abstract
- Generative transformer-based models have achieved state-of-the-art performance in text summarization. Nevertheless, they still struggle in real-world scenarios with long documents when trained in low-resource settings of a few dozen labeled training instances, namely in low-resource summarization (LRS). This paper bridges the gap by addressing two key research challenges when summarizing long documents, i.e., long-input processing and document representation, in one coherent model trained for LRS. Specifically, our novel align-then-abstract representation learning model (Athena) jointly trains a segmenter and a summarizer by maximizing the alignment between the chunk-target pairs in output from the text segmentation. Extensive experiments reveal that Athena outperforms the current state-of-the-art approaches in LRS on multiple long document summarization datasets from different domains. [ABSTRACT FROM AUTHOR]
- Subjects :
- *TEXT summarization
Details
- Language :
- English
- ISSN :
- 0925-2312
- Volume :
- 548
- Database :
- Academic Search Index
- Journal :
- Neurocomputing
- Publication Type :
- Academic Journal
- Accession number :
- 164857472
- Full Text :
- https://doi.org/10.1016/j.neucom.2023.126356