Learning-Free Unsupervised Extractive Summarization Model
- Source :
- IEEE Access, Vol 9, Pp 14358-14368 (2021)
- Publication Year :
- 2021
- Publisher :
- IEEE, 2021.
- Abstract :
- Text summarization is an information condensation technique that reduces a source document to a few representative sentences, with the aim of producing a coherent summary that retains the relevant information in the source. The field has developed rapidly since the advent of deep learning. However, summarization models based on deep neural networks have several critical shortcomings. First, they require a large amount of labeled training data, which is a common problem for low-resource languages, where publicly available labeled data do not exist. Second, training neural models with very large numbers of parameters demands substantial computational resources. In this study, we propose the Learning Free Integer Programming Summarizer (LFIP-SUM), an unsupervised extractive summarization model. The advantage of our approach is that it requires neither parameter training nor labeled training data. To achieve this, we formulate an integer programming problem based on pre-trained sentence embedding vectors. We also use principal component analysis to automatically determine the number of sentences to extract and to evaluate the importance of each sentence. Experimental results show that the proposed model performs acceptably compared with deep learning summarization models, even though it learns no parameters during model construction.
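The pipeline the abstract outlines (sentence embeddings, PCA for summary length and sentence importance, then an integer program for selection) can be illustrated with a minimal sketch. This is not the authors' LFIP-SUM formulation, which the abstract does not spell out. The variance threshold, the importance score (projection length onto the top principal axes), the cosine-similarity redundancy penalty, and the names `pca_sentence_scores` and `select_sentences` are all assumptions made for illustration. An exhaustive search over 0/1 selections stands in for an integer programming solver, so it only suits short documents.

```python
from itertools import combinations

import numpy as np


def pca_sentence_scores(embeddings, variance_threshold=0.8):
    """Return (k, scores) from a PCA of the sentence embeddings.

    k is the number of principal components needed to reach
    variance_threshold (an assumed rule for the summary length);
    scores holds one importance value per sentence (also assumed).
    """
    X = np.asarray(embeddings, dtype=float)
    centered = X - X.mean(axis=0)
    # SVD of the centered matrix: rows of vt are the principal axes.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values ** 2
    ratio = np.cumsum(explained) / explained.sum()
    k = int(np.searchsorted(ratio, variance_threshold) + 1)
    k = min(k, len(X))
    # Importance: length of each sentence's projection onto the top-k axes.
    scores = np.linalg.norm(centered @ vt[:k].T, axis=1)
    return k, scores


def select_sentences(embeddings, scores, k, redundancy_weight=1.0):
    """Pick exactly k sentences maximizing importance minus redundancy.

    This is a 0/1 selection problem; brute-force enumeration replaces
    an integer programming solver in this sketch.
    """
    X = np.asarray(embeddings, dtype=float)
    scores = np.asarray(scores, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.where(norms == 0, 1.0, norms)
    similarity = unit @ unit.T  # pairwise cosine similarity
    best_value, best_subset = -np.inf, ()
    for subset in combinations(range(len(X)), k):
        idx = list(subset)
        importance = scores[idx].sum()
        redundancy = sum(similarity[i, j] for i, j in combinations(idx, 2))
        value = importance - redundancy_weight * redundancy
        if value > best_value:
            best_value, best_subset = value, subset
    return list(best_subset)
```

Given an array of pre-trained sentence embeddings, calling `pca_sentence_scores` and passing its outputs to `select_sentences` returns the indices of the extracted sentences. For longer documents the enumeration would need to be replaced by a real integer programming solver.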
- Subjects :
- General Computer Science
Computer science
Process (engineering)
sentence representation vector
Feature extraction
02 engineering and technology
010501 environmental sciences
Machine learning
01 natural sciences
integer linear programming
0202 electrical engineering, electronic engineering, information engineering
General Materials Science
natural language processing
0105 earth and related environmental sciences
Artificial neural network
Deep learning
General Engineering
Automatic summarization
Text summarization
Task analysis
Embedding
020201 artificial intelligence & image processing
Artificial intelligence
lcsh:Electrical engineering. Electronics. Nuclear engineering
lcsh:TK1-9971
Sentence
Details
- Language :
- English
- ISSN :
- 21693536
- Volume :
- 9
- Database :
- OpenAIRE
- Journal :
- IEEE Access
- Accession number :
- edsair.doi.dedup.....0cf5ab4beba565177eed2d39b7af4591