Descriptor: "TEXT summarization" / Publication Year Range: Last 10 years / Topic: abstractive summarization - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"TEXT summarization"' showing total 76 results

1. Enhancing abstractive summarization of implicit datasets with contrastive attention.

Author: Kwon, Soonki and Lee, Younghoon
Subjects: *TEXT summarization, *AUTOMATIC summarization, *LANGUAGE models, *VANILLA
Abstract: It is important for abstractive summarization models to understand the important parts of the original document and create a natural summary accordingly. Recently, studies have been conducted to incorporate important parts of the original document during learning and have shown good performance. However, these studies are effective for explicit datasets but not implicit datasets which are relatively more abstract. This study addresses the challenge of summarizing implicit datasets, which have a lower deviation in the significance of important sentences compared to explicit datasets. A multi-task learning approach that reflects information about salient and incidental objects during the learning process was proposed. This was achieved by adding a contrastive objective to the fine-tuning process of the encoder-decoder language model. The salient and incidental parts were selected based on the ROUGE-L F1 score and their relationships were learned through triplet loss. The proposed method was evaluated using five benchmark summarization datasets, including two explicit and three implicit. The experimental results showed a greater improvement in implicit datasets, particularly for the highly abstractive XSum dataset, compared to the vanilla fine-tuning method in both the BART-base and T5-small models. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
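
The abstract of record 1 mentions selecting salient and incidental parts by ROUGE-L F1 score and learning their relationship through a triplet loss. The sketch below illustrates those two ingredients in plain Python. It is not the authors' implementation: the whitespace tokenisation, the "best and worst sentence" selection rule, the squared Euclidean distance, the margin value, and all function names are assumptions made for illustration.

```python
def lcs_length(a, b):
    # Dynamic-programming longest common subsequence over two token lists.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    # Assumption: lowercase whitespace tokens; real toolkits also stem and normalise.
    c, r = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)


def pick_salient_and_incidental(sentences, reference):
    # Hypothetical rule: the highest-scoring sentence is treated as salient,
    # the lowest-scoring one as incidental.
    ranked = sorted(sentences, key=lambda s: rouge_l_f1(s, reference))
    return ranked[-1], ranked[0]


def triplet_loss(anchor, positive, negative, margin=1.0):
    # A common triplet margin form on plain float vectors, using squared
    # Euclidean distance (an assumption; the abstract does not name a distance).
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)
```

In a real fine-tuning setup the anchor, positive, and negative vectors would come from the encoder-decoder model's representations, and this loss would be added to the usual generation loss as a second task.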

2. Cross-Domain Document Summarization Model via Two-Stage Curriculum Learning.

Author: Lee, Seungsoo, Kim, Gyunyeop, and Kang, Sangwoo
Subjects: AUTOMATIC summarization, TEXT summarization
Abstract: Generative document summarization is a natural language processing technique that generates short summary sentences while preserving the content of long texts. Various fine-tuned pre-trained document summarization models have been proposed using a specific single text-summarization dataset. However, each text-summarization dataset usually specializes in a particular downstream task. Therefore, it is difficult to treat all cases involving multiple domains using a single dataset. Accordingly, when a generative document summarization model is fine-tuned to a specific dataset, it performs well, whereas the performance is degraded by up to 45% for datasets that are not used during learning. In short, summarization models perform well with in-domain cases, as the dataset domain during training and evaluation is the same but perform poorly with out-domain inputs. In this paper, we propose a new curriculum-learning method using mixed datasets while training a generative summarization model to be more robust on out-domain datasets. Our method performed better than XSum with 10%, 20%, and 10% lower performance degradation in CNN/DM, which comprised one of two test datasets used, compared to baseline model performance. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

3. A survey on the dataset, techniques, and evaluation metric used for abstractive text summarization.

Author: Sharma, Shivani, Aggarwal, Gaurav, and Rai, Bipin Kumar
Subjects: *TEXT summarization, *AUTOMATIC summarization
Abstract: Whenever there is too much information out there, it is desirable to summarize. If humans are trying to create the summary, it will take a lot of time. To make the problem of summarizing information easier and more effortless, one can automate the summarization process, which can reduce the time taken in creating a summary. This is called automatic summarization. The two ways of summarization are extractive summarization and abstractive summarization. Extractive summarization and its applications have been the subject of extensive research and have received state-of-the-art solutions. But abstractive summarization is still a developing field, as it is difficult to create abstractive summaries as humans do. It also remains an open question how to evaluate the quality of a summary. Therefore, this paper is a comprehensive survey on the datasets used, with their details and statistics, an analysis of various abstractive summarization techniques, and important parameters for evaluating the quality of a summary. Deep learning based models have given new direction in this field. The author also focuses on problems and challenges faced in the generation of summaries, which open the future research scope in this domain. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

4. A hierarchical framework based on transformer technology to achieve factual consistent and non-redundant abstractive text summarization.

Author: Swetha, G. and Kumar, S. Phani
Subjects: TEXT summarization, TRANSFORMER models, AUTOMATIC summarization
Abstract: Abstractive summarization is one of the popular topics that has held researchers' attention for several years. This is because of the widespread application frameworks included in this field. Most of the existing summarization frameworks cannot provide effective abstracts as the contextual information of the input is not given importance. To deal with the problem, this work introduces a hierarchical framework using transformer technology to produce effective abstracts. The proposed framework includes preprocessing, extractive summarization, and abstractive summarization as the basic steps of the work. Initially, the input contents are preprocessed to obtain a clean document, and then the contents are provided to the extractive summarization unit. This unit consists of a fine-tuned BERTSum model (FTBS), which is a pre-trained model to produce the required extractive summary. The output is then provided to the proposed convolutional bidirectional gated recurrent unit transformer (CBi-GRUT) model, where an additional encoder model is introduced with the traditional transformer technology to obtain the output. The outcomes of the model are then compared with those of existing models to assess its efficacy, and the evaluations are carried out using the CNN/Daily Mail dataset. The proposed method achieved an average ROUGE-1 score of 0.78, an average ROUGE-2 score of 0.68, and an average ROUGE-L score of 0.77. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

5. Abstractive summarization with deep reinforcement learning using semantic similarity rewards.

Author: Beken Fikri, Figen, Oflazer, Kemal, and Yanıkoğlu, Berrin
Subjects: DEEP reinforcement learning, TEXT summarization, MACHINE learning, AUTOMATIC summarization, REINFORCEMENT learning, NATURAL languages
Abstract: Abstractive summarization is an approach to document summarization that is not limited to selecting sentences from the document but can generate new sentences as well. We address the two main challenges in abstractive summarization: how to evaluate the performance of a summarization model and what is a good training objective. We first introduce new evaluation measures based on the semantic similarity of the input and corresponding summary. The similarity scores are obtained by the fine-tuned BERTurk model using either the cross-encoder or a bi-encoder architecture. The fine-tuning is done on the Turkish Natural Language Inference and Semantic Textual Similarity benchmark datasets. We show that these measures have better correlations with human evaluations compared to Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores and BERTScore. We then introduce a deep reinforcement learning algorithm that uses the proposed semantic similarity measures as rewards, together with a mixed training objective, in order to generate more natural summaries in terms of human readability. We show that training with a mixed training objective function compared to only the maximum-likelihood objective improves similarity scores. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

6. Advancements and Challenges in Text Summarization: An Overview of Methods and Strategies in Brief

Author: Yarlagadda, Madhulika, Nadendla, Hanumantha Rao, Rao, Kongara Srinivasa, Venu Gopal Rao, K., editor, Krishna Prasad, A. V., editor, and Vijaya Bhaskar, Seelam Ch., editor
Published: 2024
Full Text: View/download PDF

7. Enhancing Abstractive Summarization with Pointer Generator Networks and Coverage Mechanisms in NLP

Author: Yarlagadda, Madhulika and Nadendla, Hanumantha Rao
Published: 2024
Full Text: View/download PDF

8. Surveying the landscape of text summarization with deep learning: A comprehensive review.

Author: Wang, Guanghua and Wu, Weili
Subjects: *DEEP learning, *TEXT summarization, *ARTIFICIAL neural networks, *NATURAL language processing, *AUTOMATIC summarization
Abstract: In recent years, deep learning has revolutionized natural language processing (NLP) by enabling the development of models that can learn complex representations of language data, leading to significant improvements in performance across a wide range of NLP tasks. Deep learning models for NLP typically use large amounts of data to train deep neural networks, allowing them to learn the patterns and relationships in language data. This is in contrast to traditional NLP approaches, which rely on hand-engineered features and rules to perform NLP tasks. The ability of deep neural networks to learn hierarchical representations of language data, handle variable-length input sequences, and perform well on large datasets makes them well-suited for NLP applications. Driven by the exponential growth of textual data and the increasing demand for condensed, coherent, and informative summaries, text summarization has been a critical research area in the field of NLP. Applying deep learning to text summarization refers to the use of deep neural networks to perform text summarization tasks. In this survey, we begin with a review of fashionable text summarization tasks in recent years, including extractive, abstractive, multi-document, and so on. Next, we discuss most deep learning-based models and their experimental results on these tasks. The paper also covers datasets and data representation for summarization tasks. Finally, we delve into the opportunities and challenges associated with summarization tasks and their corresponding methodologies, aiming to inspire future research efforts to advance the field further. A goal of our survey is to explain how these methods differ in their requirements as understanding them is essential for choosing a technique suited for a specific setting. This survey aims to provide a comprehensive review of existing techniques, evaluation methodologies, and practical applications of automatic text summarization. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

9. An Efficient Summarisation and Search Tool for Research Articles.

Author: Garg, Shruti, Anand, Pushkar, Chanda, Parnab Kumar, and Payyavula, Srinivasa Rao
Subjects: AUTOMATIC summarization, TEXT summarization, WEB development, SEARCH engines, USER experience, NATURAL language processing, EXPERTISE, DATABASE management
Abstract: Building an efficient summarization and search tool for research articles is a complex task that involves interdisciplinary expertise in NLP, database management, web development, and user experience design. With the rapid growth of scientific content, manually reading and selecting the important content of research articles has become challenging. Thus, there is a need for a summarization tool to help scholars read content quickly, along with a search tool that finds important content and keeps it in an organized way. To summarize the contents of different articles, a summarization tool is proposed in this work that generates extractive and abstractive summaries. Along with summarization, a search engine is also proposed in this work that saves the search results in a comma-separated value (CSV) format, including the search queries and meta information of articles such as keyword, title, author name, URL, year of publication, abstracts and summaries. These CSVs help users to get an idea of article contents in offline mode without reading or searching the whole text. The efficiency of the summarizer tool is evaluated in terms of precision (pr), recall (re) and F-measure (F-m) of Rouge-1 (R1), Rouge-2 (R2), Rouge_sum (R_sum) and Bertscore (BS) measures for ten research articles. The average pr, re and F-m obtained from BS are 42%, 42% and 42% for extractive summarization and 41%, 41% and 41% for abstractive summarization. This tool will be helpful to research scholars in the collection of literature and the preparation of related work for their research. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

10. A Novel Gravity Optimization Algorithm for Extractive Arabic Text Summarization.

Author: Hadi, Mustafa J., Abbas, Ayad R., and Fadhil, Osamah Y.
Subjects: OPTIMIZATION algorithms, TEXT summarization, METAHEURISTIC algorithms, AUTOMATIC summarization, ARABIC language, GRAVITY
Published: 2024
Full Text: View/download PDF

11. Summary-Sentence Level Hierarchical Supervision for Re-Ranking Model of Two-Stage Abstractive Summarization Framework.

Author: Yoo, Eunseok, Kim, Gyunyeop, and Kang, Sangwoo
Subjects: *TEXT summarization, *LANGUAGE models, *AUTOMATIC summarization, *NATURAL language processing, *STOCHASTIC programming, *PREDICATE calculus
Abstract: Fine-tuning a pre-trained sequence-to-sequence-based language model has significantly advanced the field of abstractive summarization. However, the early models of abstractive summarization were limited by the gap between training and inference, and they did not fully utilize the potential of the language model. Recent studies have introduced a two-stage framework that allows the second-stage model to re-rank the candidate summary generated by the first-stage model, to resolve these limitations. In this study, we point out that the supervision method performed in the existing re-ranking model of the two-stage abstractive summarization framework cannot learn detailed and complex information of the data. In addition, we present the problem of positional bias in the existing encoder–decoder-based re-ranking model. To address these two limitations, this study proposes a hierarchical supervision method that jointly performs summary and sentence-level supervision. For sentence-level supervision, we designed two sentence-level loss functions: intra- and inter-intra-sentence ranking losses. Compared to the existing abstractive summarization model, the proposed method exhibited a performance improvement for both the CNN/DM and XSum datasets. The proposed model outperformed the baseline model under a few-shot setting. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

12. A Comprehensive Survey of Deep Learning Models for Legal Document Summarization.

Author: Reddy, Kancharla Bharath and Jayabharathy, J.
Subjects: DEEP learning, TEXT summarization, LEGAL documents, EVIDENCE gaps, LEGAL language, RESEARCH personnel
Abstract: Legal document Summarization poses a difficult challenge because of the complexity and diversity of legal language and concepts. In recent years, deep learning models have shown promising results in summarizing legal documents. The article examines the challenges of legal document summarization, reviews the state-of-the-art deep learning models, and identifies research gaps in this area. Potential future research directions and applications of legal document summarization using deep learning models are also discussed. The paper is a valuable resource for researchers and practitioners interested in legal document summarization using deep learning models. [ABSTRACT FROM AUTHOR]
Published: 2024

13. Abstractive Summarizers Become Emotional on News Summarization.

Author: Ahuir, Vicent, González, José-Ángel, Hurtado, Lluís-F., and Segarra, Encarna
Subjects: TEXT summarization, AUTOMATIC summarization, CORPORA
Abstract: Emotions are central to understanding contemporary journalism; however, they are overlooked in automatic news summarization. Actually, summaries are an entry point to the source article that could favor some emotions to captivate the reader. Nevertheless, the emotional content of summarization corpora and the emotional behavior of summarization models are still unexplored. In this work, we explore the usage of established methodologies to study the emotional content of summarization corpora and the emotional behavior of summarization models. Using these methodologies, we study the emotional content of two widely used summarization corpora: Cnn/Dailymail and Xsum, and the capabilities of three state-of-the-art transformer-based abstractive systems for eliciting emotions in the generated summaries: Bart, Pegasus, and T5. The main significant findings are as follows: (i) emotions are persistent in the two summarization corpora, (ii) summarizers approach moderately well the emotions of the reference summaries, and (iii) more than 75% of the emotions introduced by novel words in generated summaries are present in the reference ones. The combined use of these methodologies has allowed us to conduct a satisfactory study of the emotional content in news summarization. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

14. Survey of Text Summarization Stratification

Author: Jamwal, Arvind, Singh, Pardeep, Kumari, Namrata, Singh, Yashwant, editor, Verma, Chaman, editor, Zoltán, Illés, editor, Chhabra, Jitender Kumar, editor, and Singh, Pradeep Kumar, editor
Published: 2023
Full Text: View/download PDF

15. Automatic text summarization based on extractive-abstractive method

Author: Md. Ahsan Habib, Romana Rahman Ema, Tajul Islam, Md. Yasir Arafat, and Mahedi Hasan
Subjects: text summarization, extractive summarization, abstractive summarization, sentence ranking algorithm, text generation, noun pronoun conversion, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
Abstract: The choice of this study has a significant impact on daily life. In various fields such as journalism, academia, business, and more, large amounts of text need to be processed quickly and efficiently. Text summarization is a technique used to generate a precise and shortened summary of long texts. The generated summary sustains overall meaning without losing any information and focuses on those parts that contain useful information. The goal is to develop a model that converts lengthy articles into concise versions. The task to be solved is to select an effective procedure to develop the model. Although present text summarization models give good results on many recognized datasets such as CNN/Daily Mail, Newsroom, etc., not all problems can be resolved by these models. In this paper, a new text summarization method has been proposed: combining the Extractive and Abstractive Text Summarization techniques. In the extractive-based method, the model generates a summary using a Sentence Ranking Algorithm and passes this generated summary through an abstractive method. When using the sentence ranking algorithm, after rearranging the sentences, the relationship between one sentence and another is destroyed. To overcome this situation, Pronoun to Noun conversion has been proposed with the new system. After generating the extractive summary, the generated summary is passed through the abstractive method. The proposed abstractive model consists of three pre-trained models: google/pegasus-xsum, facebook/bart-large-cnn, and Yale-LILY/brio-cnndm-uncased, and it generates a final summary depending on the maximum final score. The following results were obtained: experimental results on the CNN/Daily Mail dataset show that the proposed model obtained ROUGE-1, ROUGE-2 and ROUGE-L scores of 42.67 %, 19.35 %, and 39.57 %, respectively. Then, the result has been compared with three state-of-the-art methods: JEANS, DEATS and PGAN-ATSMT. The results outperform state-of-the-art models. Experimental results also show that the proposed model is qualitatively readable and can generate abstract summaries. Conclusion: In terms of ROUGE score, the model outperforms some state-of-the-art models for ROUGE-1 and ROUGE-L, but does not achieve a good result in ROUGE-2.
Published: 2023
Full Text: View/download PDF

16. Automatic Short Text Summarization Techniques in Social Media Platforms.

Author: Ghanem, Fahd A., Padma, M. C., and Alkhatib, Ramez
Subjects: TEXT summarization, SOCIAL media, USER-generated content
Abstract: The rapid expansion of social media platforms has resulted in an unprecedented surge of short text content being generated on a daily basis. Extracting valuable insights and patterns from this vast volume of textual data necessitates specialized techniques that can effectively condense information while preserving its core essence. In response to this challenge, automatic short text summarization (ASTS) techniques have emerged as a compelling solution, gaining significant importance in their development. This paper delves into the domain of summarizing short text on social media, exploring various types of short text and the associated challenges they present. It also investigates the approaches employed to generate concise and meaningful summaries. By providing a survey of the latest methods and potential avenues for future research, this paper contributes to the advancement of ASTS in the ever-evolving landscape of social media communication. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

17. Improving Abstractive Dialogue Summarization Using Keyword Extraction.

Author: Yoo, Chongjae and Lee, Hwanhee
Subjects: TEXT summarization
Abstract: Abstractive dialogue summarization aims to generate a short passage that contains important content for a particular dialogue spoken by multiple speakers. In abstractive dialogue summarization systems, capturing the subject in the dialogue is challenging owing to the properties of colloquial texts. Moreover, the system often generates uninformative summaries. In this paper, we propose a novel keyword-aware dialogue summarization system (KADS) that easily captures the subject in the dialogue to alleviate the problem mentioned above through the efficient usage of keywords. Specifically, we first extract the keywords from the input dialogue using a pre-trained keyword extractor. Subsequently, KADS efficiently leverages the keywords information of the dialogue to the transformer-based dialogue system by using the pre-trained keyword extractor. Extensive experiments performed on three benchmark datasets show that the proposed method outperforms the baseline system. Additionally, we demonstrate that the proposed keyword-aware dialogue summarization system exhibits a high-performance gain in low-resource conditions where the number of training examples is highly limited. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

18. Summary-Sentence Level Hierarchical Supervision for Re-Ranking Model of Two-Stage Abstractive Summarization Framework

Author: Eunseok Yoo, Gyunyeop Kim, and Sangwoo Kang
Subjects: abstractive summarization, text summarization, natural language processing, deep learning, Mathematics, QA1-939
Abstract: Fine-tuning a pre-trained sequence-to-sequence-based language model has significantly advanced the field of abstractive summarization. However, the early models of abstractive summarization were limited by the gap between training and inference, and they did not fully utilize the potential of the language model. Recent studies have introduced a two-stage framework that allows the second-stage model to re-rank the candidate summary generated by the first-stage model, to resolve these limitations. In this study, we point out that the supervision method performed in the existing re-ranking model of the two-stage abstractive summarization framework cannot learn detailed and complex information of the data. In addition, we present the problem of positional bias in the existing encoder–decoder-based re-ranking model. To address these two limitations, this study proposes a hierarchical supervision method that jointly performs summary and sentence-level supervision. For sentence-level supervision, we designed two sentence-level loss functions: intra- and inter-intra-sentence ranking losses. Compared to the existing abstractive summarization model, the proposed method exhibited a performance improvement for both the CNN/DM and XSum datasets. The proposed model outperformed the baseline model under a few-shot setting.
Published: 2024
Full Text: View/download PDF

19. Research on Automatic Chinese Summarization Combining Pre-Training and Attention Enhancement.

Author: LI Xujun, WANG Jun, and YU Meng
Subjects: AUTOMATIC summarization, TEXT summarization, SEARCH algorithms
Abstract: Automatic summarization extracts the core content of the text by compressing the source text information. At present, most abstractive summarization tasks use a sequence-to-sequence model based on the attention mechanism, but the model decodes the generated summary with low semantic accuracy and a high content repetition rate. So this paper proposes an automatic text summarization method combining pre-training and attention enhancement to improve the quality of the generated summary. This model is based on the PGN model with coverage mechanism. Firstly, the Transformer encoder pre-trains on the text to acquire the semantic relationship between characters. Then, it uses the attention enhancement mechanism to make the decoder's attention distribution at the current moment refer to its attention distribution at historical moments in the sequence-to-sequence model. Finally, it optimizes the beam search algorithm to discourage the model's decoder from outputting short sentences. The experimental evaluation index uses the ROUGE value. The experimental results on the public datasets of NLPCC2018 and LCSTS indicate that, compared with the PGN model training results with the coverage mechanism, the ROUGE-1, ROUGE-2 and ROUGE-L indicators are all improved, which verifies the advancement and effectiveness of the method proposed in this paper. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

20. Improving Coverage and Novelty of Abstractive Text Summarization Using Transfer Learning and Divide and Conquer Approaches.

Author: Alomari, Ayham, Idris, Norisma, Md Sabri, Aznul Qalid, and Alsmadi, Izzat
Subjects: TEXT summarization
Abstract: Automatic Text Summarization (ATS) models yield outcomes with insufficient coverage of crucial details and poor degrees of novelty. The first issue resulted from the lengthy input, while the second problem resulted from the characteristics of the training dataset itself. This research employs the divide-and-conquer approach to address the first issue by breaking the lengthy input into smaller pieces to be summarized, followed by the conquest of the results in order to cover more significant details. For the second challenge, these chunks are summarized by models trained on datasets with higher novelty levels in order to produce more human-like and concise summaries with more novel words that do not appear in the input article. The results demonstrate an improvement in both coverage and novelty levels. Moreover, we defined a new metric to measure the novelty of the summary. Finally, the findings led us to conclude that the novelty levels are more significantly influenced by the training dataset itself, as in CNN/DM, than by other factors like the training model or its training objective, as in Pegasus. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
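
The abstract of record 20 describes a divide-and-conquer approach: a lengthy input is broken into smaller pieces, each piece is summarized, and the partial results are combined. The sketch below shows that control flow only. It is not the authors' pipeline: the fixed word-count chunking, the space-joined combination step, and the names `split_into_chunks` and `summarize_long` are assumptions, and `summarize` stands in for whatever model would be used.

```python
def split_into_chunks(text, max_words=200):
    # Divide: cut the input into consecutive pieces of at most max_words words.
    # Assumption: fixed-size word windows; sentence or section boundaries are ignored.
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long(text, summarize, max_words=200):
    # Conquer: summarize each chunk independently, then join the partial summaries.
    # `summarize` is any callable that maps a string to a shorter string.
    return " ".join(summarize(chunk) for chunk in split_into_chunks(text, max_words))


def lead_words(chunk, n=5):
    # Placeholder summarizer for demonstration only: keeps the first n words.
    return " ".join(chunk.split()[:n])
```

For example, `summarize_long(article, lead_words, max_words=50)` would return the first five words of every 50-word chunk of `article`; a trained abstractive model would replace `lead_words` in practice.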

21. Enhancing Abstractive Summarization with Extracted Knowledge Graphs and Multi-Source Transformers.

Author: Chen, Tong, Wang, Xuewei, Yue, Tianwei, Bai, Xiaoyu, Le, Cindy X., and Wang, Wenping
Subjects: TEXT summarization, KNOWLEDGE graphs, LANGUAGE models, CHATGPT
Abstract: As the popularity of large language models (LLMs) has risen over the course of the last year, led by GPT-3/4 and especially its productization as ChatGPT, we have witnessed the extensive application of LLMs to text summarization. However, LLMs do not intrinsically have the power to verify the correctness of the information they supply and generate. This research introduces a novel approach to abstractive summarization, aiming to address the limitations of LLMs in that they struggle to understand the truth. The proposed method leverages extracted knowledge graph information and structured semantics as a guide for summarization. Building upon BART, one of the state-of-the-art sequence-to-sequence pre-trained LLMs, multi-source transformer modules are developed as an encoder, which are capable of processing textual and graphical inputs. Decoding is performed based on this enriched encoding to enhance the summary quality. The Wiki-Sum dataset, derived from Wikipedia text dumps, is introduced for evaluation purposes. Comparative experiments with baseline models demonstrate the strengths of the proposed approach in generating informative and relevant summaries. We conclude by presenting our insights into utilizing LLMs with graph external information, which will become a powerful aid towards the goal of factually correct and verified LLMs. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

22. Abstractive vs. Extractive Summarization: An Experimental Review.

Author: Giarelis, Nikolaos, Mastrokostas, Charalampos, and Karacapilidis, Nikos
Subjects: LANGUAGE models, TEXT summarization, COMPUTATIONAL linguistics, COMPARATIVE method, NATURAL language processing, LITERATURE reviews, DEEP learning
Abstract: Text summarization is a subtask of natural language processing referring to the automatic creation of a concise and fluent summary that captures the main ideas and topics from one or multiple documents. Earlier literature surveys focus on extractive approaches, which rank the top-n most important sentences in the input document and then combine them to form a summary. As argued in the literature, the summaries of these approaches do not have the same lexical flow or coherence as summaries that are manually produced by humans. Newer surveys elaborate abstractive approaches, which generate a summary with potentially new phrases and sentences compared to the input document. Generally speaking, contrary to the extractive approaches, the abstractive ones create summaries that are more similar to those produced by humans. However, these approaches still lack the contextual representation needed to form fluent summaries. Recent advancements in deep learning and pretrained language models led to the improvement of many natural language processing tasks, including abstractive summarization. Overall, these surveys do not present a comprehensive evaluation framework that assesses the aforementioned approaches. 
Taking the above into account, the contribution of this survey is fourfold: (i) we provide a comprehensive survey of the state-of-the-art approaches in text summarization; (ii) we conduct a comparative evaluation of these approaches, using well-known datasets from the related literature, as well as popular evaluation scores such as ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-LSUM, BLEU-1, BLEU-2 and SACREBLEU; (iii) we report on insights gained on various aspects of the text summarization process, including existing approaches, datasets and evaluation methods, and we outline a set of open issues and future research directions; (iv) we upload the datasets and the code used in our experiments in a public repository, aiming to increase the reproducibility of this work and facilitate future research in the field. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
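
Note on the ROUGE-1, ROUGE-2 and ROUGE-L scores named in the abstract above: the following is a minimal Python sketch of how such overlap scores are commonly computed. It is not the authors' evaluation code. The function names `rouge_n_f1` and `rouge_l_f1`, the lowercasing, and the whitespace tokenization are assumptions made for illustration; published implementations usually add stemming and other normalization, so their numbers will differ.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """F1 over clipped n-gram overlap (assumed: lowercase, whitespace tokens)."""
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def rouge_l_f1(reference: str, candidate: str) -> float:
    """F1 based on the longest common subsequence (LCS) of tokens."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    # Row-by-row dynamic programme for the LCS length.
    prev = [0] * (len(cand) + 1)
    for r in ref:
        curr = [0]
        for j, c in enumerate(cand, start=1):
            curr.append(prev[j - 1] + 1 if r == c else max(prev[j], curr[j - 1]))
        prev = curr
    lcs = prev[-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_n_f1("the cat sat on the mat", "the cat lay on the mat", n=2)` compares the two sentences on shared word pairs, while `rouge_l_f1` rewards words that appear in the same order even when they are not adjacent.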

23. T5-Based Model for Abstractive Summarization: A Semi-Supervised Learning Approach with Consistency Loss Functions.

Author: Wang, Mingye, Xie, Pan, Du, Yao, and Hu, Xiaohui
Subjects: TEXT summarization, NATURAL language processing, SUPERVISED learning, CHINESE language
Abstract: Text summarization is a prominent task in natural language processing (NLP) that condenses lengthy texts into concise summaries. Despite the success of existing supervised models, they often rely on datasets of well-constructed text pairs, which can be insufficient for languages with limited annotated data, such as Chinese. To address this issue, we propose a semi-supervised learning method for text summarization. Our method is inspired by the cycle-consistent adversarial network (CycleGAN) and considers text summarization as a style transfer task. The model is trained by using a similar procedure and loss function to those of CycleGAN and learns to transfer the style of a document to its summary and vice versa. Our method can be applied to multiple languages, but this paper focuses on its performance on Chinese documents. We trained a T5-based model and evaluated it on two datasets, CSL and LCSTS, and the results demonstrate the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

24. Semantic Hierarchical Indexing for Online Video Lessons Using Natural Language Processing.

Author: Arazzi, Marco, Ferretti, Marco, and Nocera, Antonino
Subjects: STREAMING video & television, NATURAL language processing, TEXT summarization, AUDIO equipment
Abstract: Huge quantities of audio and video material are available at universities and teaching institutions, but their use can be limited by the lack of intelligent search tools. This paper describes a possible way to set up an indexing scheme that offers a smart search modality combining semantic analysis of video/audio transcripts with the exact time positioning of uttered words. The proposal leverages NLP methods for topic modeling with lexical analysis of lessons' transcripts and builds a semantic hierarchical index into the corpus of lessons analyzed. Moreover, using abstractive summarization, the system can offer short summaries on the subject semantically implied by the search carried out. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

25. A Multitask Cross-Lingual Summary Method Based on ABO Mechanism.

Author: Li, Qing, Wan, Weibing, and Zhao, Yuming
Subjects: LANGUAGE models, TEXT summarization
Abstract: Recent cross-lingual summarization research has pursued the use of a unified end-to-end model which has demonstrated a certain level of improvement in performance and effectiveness, but this approach stitches together multiple tasks and makes the computation more complex. Less work has focused on alignment relationships across languages, which has led to persistent problems of summary misordering and loss of key information. For this reason, we first simplify the multitasking by converting the translation task into an equal proportion of cross-lingual summary tasks so that the model can perform only cross-lingual summary tasks when generating cross-lingual summaries. In addition, we splice monolingual and cross-lingual summary sequences as an input so that the model can fully learn the core content of the corpus. Then, we propose a reinforced regularization method based on the model to improve its robustness, and build a targeted ABO mechanism to enhance the semantic relationship alignment and key information retention of the cross-lingual summaries. Ablation experiments are conducted on three datasets of different orders of magnitude to demonstrate the effective enhancement of the model by the optimization approach; they outperform the mainstream approaches on the cross-lingual summarization task and the monolingual summarization task for the full dataset. Finally, we validate the model's capabilities on a cross-lingual summary dataset of professional domains, and the results demonstrate its superior performance and ability to improve cross-lingual sequencing. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

26. Weakly Supervised Abstractive Summarization with Enhancing Factual Consistency for Chinese Complaint Reports.

Author: Ren Tao and Chen Shuang
Subjects: TEXT summarization, AUTOMATIC summarization, CONSUMERS' reviews
Abstract: A large variety of complaint reports reflect subjective information expressed by citizens. A key challenge of text summarization for complaint reports is to ensure the factual consistency of the generated summary. Therefore, in this paper, a simple, weakly supervised framework that considers factual consistency is proposed to generate summaries of city-based complaint reports without pre-labeled sentences or words. Furthermore, it considers the importance of entities in complaint reports to ensure the factual consistency of the summary. Experimental results on customer review datasets (Yelp and Amazon) and a complaint report dataset (complaint reports of Shenyang in China) show that the proposed framework outperforms state-of-the-art approaches in ROUGE scores and human evaluation, which demonstrates the effectiveness of our approach in dealing with complaint reports. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

27. A novel semantic-enhanced generative adversarial network for abstractive text summarization.

Author: Vo, Tham
Subjects: *TEXT summarization, *GENERATIVE adversarial networks, *RECURRENT neural networks, *SEQUENTIAL analysis
Abstract: Recently, most techniques proposed for the abstractive summarization task have adopted a deep recurrent neural network (RNN)-based sequential auto-encoding architecture to learn and generate meaningful summaries for different input documents. However, most recent RNN-based models tend to capture too many high-frequency or repetitive phrases in long documents during training, which leads to trivial and generic summaries. In addition, the lack of thorough analysis of the sequential and long-range dependency relationships between words within different contexts while learning the textual representation makes the resulting summaries unnatural and incoherent. To deal with these challenges, in this paper we propose a novel semantic-enhanced generative adversarial network (GAN)-based approach for the abstractive text summarization task, called SGAN4AbSum. We use an adversarial training strategy that trains the generator and discriminator simultaneously: the generator produces summaries, and the discriminator distinguishes generated summaries from ground-truth ones. The input of the generator is the joint rich-semantic and global structural latent representation of the training documents, obtained by applying a combined BERT and graph convolutional network textual embedding mechanism. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed SGAN4AbSum, which achieves competitive ROUGE-based scores in comparison with state-of-the-art abstractive text summarization baselines. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

28. An Analysis of Abstractive Text Summarization Using Pre-trained Models

Author: Rehman, Tohida, Das, Suchandan, Sanyal, Debarshi Kumar, Chattopadhyay, Samiran, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Mandal, Lopa, editor, Tavares, Joao Manuel R. S., editor, and Balas, Valentina E., editor
Published: 2022
Full Text: View/download PDF

29. A Comparative Analysis of Automatic Extractive and Abstractive Text Summarization

Author: Yadav, Madhuri, Katarya, Rahul, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Goyal, Vishal, editor, Gupta, Manish, editor, Mirjalili, Seyedali, editor, and Trivedi, Aditya, editor
Published: 2022
Full Text: View/download PDF

30. A Survey on Domain-Specific Summarization Techniques

Author: Rajan, Reshmi P., Jose, Deepa V., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Tiwari, Shailesh, editor, Trivedi, Munesh C., editor, Kolhe, Mohan Lal, editor, Mishra, K.K., editor, and Singh, Brajesh Kumar, editor
Published: 2022
Full Text: View/download PDF

31. A Survey on Statistical Approaches for Abstractive Summarization of Low Resource Language Documents

Author: Deshpande, Pranjali, Jahirabadkar, Sunita, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Zhang, Yu-Dong, editor, Senjyu, Tomonobu, editor, So-In, Chakchai, editor, and Joshi, Amit, editor
Published: 2022
Full Text: View/download PDF

32. Automatic Extractive Summarization for English Text: A Brief Survey

Author: Dhankhar, Sunil, Gupta, Mukesh Kumar, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Gupta, Deepak, editor, Khanna, Ashish, editor, Kansal, Vineet, editor, Fortino, Giancarlo, editor, and Hassanien, Aboul Ella, editor
Published: 2022
Full Text: View/download PDF

33. Efficient Memory-Enhanced Transformer for Long-Document Summarization in Low-Resource Regimes.

Author: Moro, Gianluca, Ragazzi, Luca, Valgimigli, Lorenzo, Frisoni, Giacomo, Sartori, Claudio, and Marfia, Gustavo
Subjects: *MNEMONICS, *LANGUAGE models, *TEXT summarization
Abstract: Long document summarization poses obstacles to current generative transformer-based models because of the broad context to process and understand. Indeed, detecting long-range dependencies is still challenging for today's state-of-the-art solutions, usually requiring model expansion at the cost of an unsustainable demand for computing and memory capacities. This paper introduces Emma, a novel efficient memory-enhanced transformer-based architecture. By segmenting a lengthy input into multiple text fragments, our model stores and compares the current chunk with previous ones, gaining the capability to read and comprehend the entire context over the whole document with a fixed amount of GPU memory. This method enables the model to deal with theoretically infinitely long documents, using less than 18 and 13 GB of memory for training and inference, respectively. We conducted extensive performance analyses and demonstrate that Emma achieved competitive results on two datasets of different domains while consuming significantly less GPU memory than competitors do, even in low-resource settings. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

34. AUTOMATIC TEXT SUMMARIZATION BASED ON EXTRACTIVE-ABSTRACTIVE METHOD.

Author: HABIB, Md. Ahsan, EMA, Romana Rahman, ISLAM, Tajul, ARAFAT, Md. Yasir, and HASAN, Mahedi
Subjects: TEXT summarization, DECISION support systems, SEMANTICS, DATA encryption, REINFORCEMENT learning
Abstract: The topic of this study has a significant impact on daily life. In fields such as journalism, academia, and business, large amounts of text need to be processed quickly and efficiently. Text summarization is a technique used to generate a precise, shortened summary of lengthy texts. The generated summary preserves the overall meaning without losing information and focuses on the parts that contain useful information. The goal is to develop a model that converts lengthy articles into concise versions, and the task to be solved is to select an effective procedure for developing that model. Although existing text summarization models give good results on many recognized datasets such as CNN/Daily Mail and Newsroom, they cannot resolve all problems. In this paper, a new text summarization method is proposed that combines extractive and abstractive text summarization techniques. In the extractive stage, the model generates a summary using a sentence ranking algorithm. Because rearranging sentences during ranking destroys the relationships between one sentence and another, pronoun-to-noun conversion is proposed as part of the new system. The extractive summary is then passed through the abstractive stage. The proposed abstractive model consists of three pre-trained models, google/pegasus-xsum, facebook/bart-large-cnn, and Yale-LILY/brio-cnndm-uncased, and generates the final summary according to the maximum final score. Experimental results on the CNN/Daily Mail dataset show that the proposed model obtains ROUGE-1, ROUGE-2, and ROUGE-L scores of 42.67 %, 19.35 %, and 39.57 %, respectively. The results were compared with three state-of-the-art methods: JEANS, DEATS, and PGAN-ATSMT. The results outperform state-of-the-art models. Experimental results also show that the output of the proposed model is qualitatively readable and that it can generate abstractive summaries. Conclusion: In terms of ROUGE score, the model outperforms some state-of-the-art models for ROUGE-1 and ROUGE-L, but does not achieve a good result for ROUGE-2. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

35. An Abstractive Summarization Model Based on Joint-Attention Mechanism and a Priori Knowledge.

Author: Li, Yuanyuan, Huang, Yuan, Huang, Weijian, Yu, Junhao, and Huang, Zheng
Subjects: TEXT summarization
Abstract: An abstractive summarization model based on the joint-attention mechanism and a priori knowledge is proposed to address the problems of the inadequate semantic understanding of text and summaries that do not conform to human language habits in abstractive summary models. Word vectors that are most relevant to the original text should be selected first. Second, the original text is represented in two dimensions—word-level and sentence-level, as word vectors and sentence vectors, respectively. After this processing, there will be not only a relationship between word-level vectors but also a relationship between sentence-level vectors, and the decoder discriminates between word-level and sentence-level vectors based on their relationship with the hidden state of the decoder. Then, the pointer generation network is improved using a priori knowledge. Finally, reinforcement learning is used to improve the quality of the generated summaries. Experiments on two classical datasets, CNN/DailyMail and DUC 2004, show that the model has good performance and effectively improves the quality of generated summaries. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

36. Neural text generation in regulatory medical writing.

Author: Meyer, Claudia, Adkins, Daniel, Pal, Koyena, Galici, Ruggero, Garcia-Agundez, Augusto, and Eickhoff, Carsten
Subjects: MEDICAL writing, TEXT summarization, NATURAL language processing, DRUG labeling, MANUAL labor, LIFTING & carrying (Human mechanics)
Abstract: Background: A steep increase in new drug applications has increased the overhead of writing technical documents such as medication guides. Natural language processing can contribute to reducing this burden. Objective: To generate medication guides from texts that relate to prescription drug labeling information. Materials and Methods: We collected official drug label information from the DailyMed website. We focused on drug labels containing medication guide sections to train and test our model. To construct our training dataset, we aligned “source” text from the document with similar “target” text from the medication guide using three families of alignment techniques: global, manual, and heuristic alignment. The resulting source-target pairs were provided as input to a Pointer Generator Network, an abstractive text summarization model. Results: Global alignment produced the lowest ROUGE scores and relatively poor qualitative results, as running the model frequently resulted in mode collapse. Manual alignment also resulted in mode collapse, albeit higher ROUGE scores than global alignment. Within the family of heuristic alignment approaches, we compared different methods and found BM25-based alignments to produce significantly better summaries (at least 6.8 ROUGE points above the other techniques). This alignment surpassed both the global and manual alignments in terms of ROUGE and qualitative scoring. Conclusion: The results of this study indicate that a heuristic approach to generating inputs for an abstractive summarization model increased ROUGE scores, compared to a global or manual approach when automatically generating biomedical text. Such methods hold the potential to significantly reduce the manual labor burden in medical writing and related disciplines. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF
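
Note on the BM25-based alignment mentioned in the abstract above: the following is a minimal Python sketch of pairing each "target" sentence with its best-scoring "source" sentence under Okapi BM25. It is not the authors' pipeline. The function name `bm25_align`, the whitespace tokenization, the smoothed IDF variant, and the defaults `k1=1.5` and `b=0.75` are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_align(sources, targets, k1=1.5, b=0.75):
    """Return (best_source_index, target) pairs using Okapi BM25 scores.

    Assumes `sources` is non-empty and contains at least one non-empty
    sentence; tokens are lowercased whitespace splits (an assumption).
    """
    docs = [s.lower().split() for s in sources]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of source sentences containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    term_freqs = [Counter(d) for d in docs]

    def idf(term):
        df = doc_freq.get(term, 0)
        return math.log(1 + (n_docs - df + 0.5) / (df + 0.5))

    def score(query, i):
        tf, length = term_freqs[i], len(docs[i])
        total = 0.0
        for term in query:
            f = tf.get(term, 0)
            if f:
                norm = f + k1 * (1 - b + b * length / avg_len)
                total += idf(term) * f * (k1 + 1) / norm
        return total

    pairs = []
    for target in targets:
        query = target.lower().split()
        best = max(range(n_docs), key=lambda i: score(query, i))
        pairs.append((best, target))
    return pairs
```

Each resulting pair could then serve as one source-target training example for a summarization model. Ties, including the case where no term matches, resolve to the lowest source index in this sketch.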

37. Abstractive text summarization model based on an improved Transformer.

Author: 赵　伟, 王文娟, 任彦凝, 刘　群, 胥钟予, and 彭　露
Subjects: TEXT summarization, DATA mining, SCALABILITY, INFORMATION services, RECURRENT neural networks, SEMANTICS
Abstract: Copyright of Journal of Chongqing University of Posts & Telecommunications (Natural Science Edition) is the property of Chongqing University of Posts & Telecommunications and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2023
Full Text: View/download PDF

38. Fine-tuning and multilingual pre-training for abstractive summarization task for the Arabic language.

Author: Kahla, Mram, Novák, Attila, and Zijian Győző Yang
Subjects: *TEXT summarization, *ARABIC language
Abstract: The main task of our research is to train various abstractive summarization models for the Arabic language. The work for abstractive Arabic text summarization has hardly begun so far due to the unavailability of the datasets needed for that. In our previous research, we created the first monolingual corpus in the Arabic language for abstractive text summarization. Based on this corpus, we fine-tuned various transformer models. We tested the PreSumm and multilingual BART models. We achieved a "state of the art" result in this area with the PreSumm method. The present study continues the same series of research. We extended our corpus "AraSum" and managed to reach up to 50 thousand items, each consisting of an article and its corresponding lead. In addition, we pretrained our own monolingual and trilingual BART models for the Arabic language and fine-tuned them in addition to the mT5 model for abstractive text summarization for the same language, using the AraSum corpus. While there is room for improvement in the resources and the infrastructure we possess, the results clearly demonstrate that most of our models surpassed the XL-Sum which is considered to be state of the art for abstractive Arabic text summarization so far. Our corpus "AraSum" will be released to facilitate future work on abstractive Arabic text summarization. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

39. Solving Hungarian natural language processing tasks with multilingual generative models.

Author: Yang, Zijian Győző and Laki, László János
Subjects: *MACHINE translating, *NATURAL language processing, *TEXT summarization, *TRANSFORMER models, *ARTIFICIAL intelligence, *CHATBOTS
Abstract: Generative ability is a crucial need for artificial intelligence applications such as chatbots, virtual assistants, and machine translation systems. In recent years, transformer-based neural architectures have given a huge boost to generating human-like English texts. In our research, we conducted experiments to create pre-trained generative transformer models for the Hungarian language and fine-tune them for multiple types of natural language processing tasks. We focused on training multilingual models. We pre-trained a multilingual BART, then fine-tuned it on various NLP tasks, such as text classification and abstractive summarization. In our experiments, we focused on transfer learning techniques to increase performance. Furthermore, an M2M100 multilingual model was fine-tuned for 12-lingual Hungarian-centric machine translation. Last but not least, a Marian NMT-based machine translation system was also built from scratch for the 12-lingual Hungarian-centric machine translation task. In our results, the cross-lingual transfer method achieved higher performance in all of our tasks. In our machine translation experiment, our fine-tuned M2M100 model outperformed Google Translate, Microsoft Translator, and eTranslation. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

40. Hierarchical Sliding Inference Generator for Question-driven Abstractive Answer Summarization.

Author: BING LI, PENG YANG, HANLIN ZHAO, PENGHUI ZHANG, and ZIJIAN LIU
Subjects: *TEXT summarization, *NATURAL languages
Abstract: Text summarization on non-factoid question answering (NQA) aims at identifying the core information of redundant answer guidance using questions, which can dramatically improve answer readability and comprehensibility. Most existing approaches focus on extracting query-related sentences to construct a summary, where the logical connection of natural language and the hierarchical interpretable semantic association are often neglected, thus degrading performance. To address these issues, we propose a novel question-driven abstractive answer summarization model, called the Hierarchical Sliding Inference Generator (HSIG), to form inferable and interpretable summaries by explicitly introducing hierarchical information reasoning between questions and corresponding answers. Specifically, we first apply an elaborately designed hierarchical sliding fusion inference model to determine the most relevant question sentence-level representation that provides a deeper interpretable basis for sentence selection in summarization, which further increases computational performance on the premise of following the semantic inheritance structure. Additionally, to improve summary fluency, we construct a double-driven selective generator to integrate various semantic information from two mutual question-and-answer perspectives. Experimental results illustrate that, compared with state-of-the-art baselines, our model achieves remarkable improvement on two benchmark datasets and specifically improves ROUGE-1 by 2.46 points on PubMedQA, which demonstrates the superiority of our model on abstractive summarization with hierarchical sequential reasoning. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

41. Follow the Timeline! Generating an Abstractive and Extractive Timeline Summary in Chronological Order.

Author: XIUYING CHEN, MINGZHE LI, SHEN GAO, ZHANGMING CHAN, DONGYAN ZHAO, XIN GAO, XIANGLIANG ZHANG, and RUI YAN
Subjects: *TEXT summarization, *TIME series analysis, *GLOBAL method of teaching
Abstract: Today, timestamped web documents related to a general news query flood the Internet, and timeline summarization targets this concisely by summarizing the evolution trajectory of events along the timeline. Unlike traditional document summarization, timeline summarization needs to model the time series information of the input events and summarize important events in chronological order. To tackle this challenge, in this article we propose our Unified Timeline Summarizer, which can generate abstractive and extractive timeline summaries in time order. Concretely, in the encoder part, we propose a graph-based event encoder that relates multiple events according to their content dependency and learns a global representation of each event. In the decoder part, to ensure the chronological order of the abstractive summary, we propose to extract the feature of event-level attention in its generation process with sequential information retained and use it to simulate the evolutionary attention of the ground truth summary. The event-level attention can also be used to assist in extracting a summary, where the extracted summary also comes in time sequence. We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset. Extensive experiments conducted on these datasets and on the out-of-domain Timeline 17 dataset show that our Unified Timeline Summarizer achieves state-of-the-art performance in terms of both automatic and human evaluations. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

42. A Systematic Survey of Text Summarization Techniques.

Author: Kochrekar, Shivangi, Kale, Neha, Mehta, Darsh, and Ghane, Sunil
Subjects: TEXT summarization
Abstract: Text summarization is a technique in which a lengthy document is converted into a brief document while maintaining the crux of the information in it. A lot of progress has been made in this domain over the past few years, and researchers have been able to perform summarization on single-page as well as multi-page documents. However, the biggest challenge that remains is the correct ordering of sentences and then forming a coherent summary from those sentences. In this paper, we endeavor to provide a detailed comparison of the different summarization techniques put forward by researchers. This paper highlights the different approaches used, along with their results and disadvantages. This will help researchers learn more about the topic in a comprehensive manner, identify gaps in existing approaches, and come up with novel approaches to address them. [ABSTRACT FROM AUTHOR]
Published: 2023

43. Auto Survey of Research Papers.

Author: Mehta, Darsh, Kochrekar, Shivangi, Kale, Neha, and Ghane, Sunil
Subjects: TEXT summarization
Abstract: Researching takes a lot of time and effort, especially for novices. Our system proposes a way to make research easier by first showing various trends in a particular field, shortlisting the papers based on multiple filters, and then providing a coherent summary of those shortlisted papers. We have used Elastic Search and NLP to find trends in the form of bigrams and trigrams. The current research does not provide rational summaries. We aim to use modern summarizing techniques such as Summarizer from BERTSUM and Long Encoder Decoder (LED) to provide coherent summaries, and to use the SciSpacy library, which helps to identify scientific terms easily. In the end, the user will get not only the top papers in the field but also a gist of each of those papers without breaking a sweat, thus saving time and effort. [ABSTRACT FROM AUTHOR]
Published: 2023

44. Abstractive text summarization: State of the art, challenges, and improvements.

Author: Shakil, Hassan, Farooq, Ahmad, and Kalita, Jugal
Subjects: *TEXT summarization, *LANGUAGE models, *AUTOMATIC summarization, *KNOWLEDGE representation (Information theory), *RESEARCH personnel, *REINFORCEMENT learning
Abstract: Specifically focusing on the landscape of abstractive text summarization, as opposed to extractive techniques, this survey presents a comprehensive overview, delving into state-of-the-art techniques, prevailing challenges, and prospective research directions. We categorize the techniques into traditional sequence-to-sequence models, pre-trained large language models, reinforcement learning, hierarchical methods, and multi-modal summarization. Unlike prior works that did not examine complexities, scalability and comparisons of techniques in detail, this review takes a comprehensive approach encompassing state-of-the-art methods, challenges, solutions, comparisons, and limitations, and charts out future improvements, providing researchers an extensive overview to advance abstractive summarization research. We provide vital comparison tables across the categorized techniques, offering insights into model complexity, scalability and appropriate applications. The paper highlights challenges such as inadequate meaning representation, factual consistency, controllable text summarization, cross-lingual summarization, and evaluation metrics, among others. Solutions leveraging knowledge incorporation and other innovative strategies are proposed to address these challenges. The paper concludes by highlighting emerging research areas like factual inconsistency, domain-specific, cross-lingual, multilingual, and long-document summarization, as well as handling noisy data. Our objective is to provide researchers and practitioners with a structured overview of the domain, enabling them to better understand the current landscape and identify potential areas for further research and improvement. • Overview of state-of-the-art techniques in abstractive text summarization. • Comparative analysis of models in abstractive summarization. • Identification of challenges and potential improvements in the field. • Exploration of future research directions and emerging frontiers. • Holistic survey of abstractive text summarization. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

45. Automatic Short Text Summarization Techniques in Social Media Platforms

Author: Fahd A. Ghanem, M. C. Padma, and Ramez Alkhatib
Subjects: text summarization, social media, evaluation metrics, abstractive summarization, extractive summarization, machine learning, Information technology, T58.5-58.64
Abstract: The rapid expansion of social media platforms has resulted in an unprecedented surge of short text content being generated on a daily basis. Extracting valuable insights and patterns from this vast volume of textual data necessitates specialized techniques that can effectively condense information while preserving its core essence. In response to this challenge, automatic short text summarization (ASTS) techniques have emerged as a compelling solution, gaining significant importance in their development. This paper delves into the domain of summarizing short text on social media, exploring various types of short text and the associated challenges they present. It also investigates the approaches employed to generate concise and meaningful summaries. By providing a survey of the latest methods and potential avenues for future research, this paper contributes to the advancement of ASTS in the ever-evolving landscape of social media communication.
Published: 2023
Full Text: View/download PDF

46. Review of Text Summarization in Indian Regional Languages

Author: Thapa, Surendrabikram, Adhikari, Surabhi, and Mishra, Sushruti; Series Editor: Kacprzyk, Janusz; Advisory Editors: Gomide, Fernando, Kaynak, Okyay, Liu, Derong, Pedrycz, Witold, Polycarpou, Marios M., Rudas, Imre J., and Wang, Jun; Editors: Abraham, Ajith, Castillo, Oscar, and Virmani, Deepali
Published: 2021
Full Text: View/download PDF

47. Enhancing N-Gram Based Metrics with Semantics for Better Evaluation of Abstractive Text Summarization.

Author: He, Jia-Wei, Jiang, Wen-Jun, Chen, Guo-Bang, Le, Yu-Quan, and Ding, Xiao-Fei
Subjects: NATURAL language processing, TEXT summarization, DEEP learning, NEW words, SEMANTICS, PROBLEM solving
Abstract: Text summarization is an important task in natural language processing and has been applied in many applications. Recently, abstractive summarization has attracted much attention. However, the traditional evaluation metrics, which consider little semantic information, are unsuitable for evaluating the quality of deep learning based abstractive summarization models, since these models may generate new words that do not exist in the original text. Moreover, the out-of-vocabulary (OOV) problem, which affects the evaluation results, has not yet been solved well. To address these issues, we propose a novel model called ENMS to enhance existing N-gram based evaluation metrics with semantics. Specifically, we present two types of methods, N-gram based Semantic Matching (NSM for short) and N-gram based Semantic Similarity (NSS for short), to improve several widely used evaluation metrics including ROUGE (Recall-Oriented Understudy for Gisting Evaluation), BLEU (Bilingual Evaluation Understudy), etc. NSM and NSS work in different ways. The former calculates the matching degree directly, while the latter mainly improves the similarity measurement. Moreover, we propose an N-gram representation mechanism to explore the vector representation of N-grams (including skip-grams). It serves as the basis of our ENMS model, in which we exploit some simple but effective integration methods to solve the OOV problem efficiently. Experimental results over the TAC AESOP dataset show that the metrics improved by our methods are well correlated with human judgements and can be used to better evaluate abstractive summarization methods. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

48. Abstractive vs. Extractive Summarization: An Experimental Review

Author: Nikolaos Giarelis, Charalampos Mastrokostas, and Nikos Karacapilidis
Subjects: text summarization, deep learning, language models, natural language processing, abstractive summarization, extractive summarization, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Text summarization is a subtask of natural language processing referring to the automatic creation of a concise and fluent summary that captures the main ideas and topics from one or multiple documents. Earlier literature surveys focus on extractive approaches, which rank the top-n most important sentences in the input document and then combine them to form a summary. As argued in the literature, the summaries of these approaches do not have the same lexical flow or coherence as summaries that are manually produced by humans. Newer surveys elaborate on abstractive approaches, which generate a summary with potentially new phrases and sentences compared to the input document. Generally speaking, contrary to the extractive approaches, the abstractive ones create summaries that are more similar to those produced by humans. However, these approaches still lack the contextual representation needed to form fluent summaries. Recent advancements in deep learning and pretrained language models have led to the improvement of many natural language processing tasks, including abstractive summarization. Overall, these surveys do not present a comprehensive evaluation framework that assesses the aforementioned approaches.
Taking the above into account, the contribution of this survey is fourfold: (i) we provide a comprehensive survey of the state-of-the-art approaches in text summarization; (ii) we conduct a comparative evaluation of these approaches, using well-known datasets from the related literature, as well as popular evaluation scores such as ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-LSUM, BLEU-1, BLEU-2 and SACREBLEU; (iii) we report on insights gained on various aspects of the text summarization process, including existing approaches, datasets and evaluation methods, and we outline a set of open issues and future research directions; (iv) we upload the datasets and the code used in our experiments in a public repository, aiming to increase the reproducibility of this work and facilitate future research in the field.
Published: 2023
Full Text: View/download PDF

49. CATS: Customizable Abstractive Topic-based Summarization.

Author: BAHRAINIAN, SEYED ALI, ZERVEAS, GEORGE, CRESTANI, FABIO, and EICKHOFF, CARSTEN
Subjects: *TEXT summarization, *CATS, *MEETING minutes, *COMPUTER science, *NARRATION, *FELIDAE
Abstract: Neural sequence-to-sequence models are the state-of-the-art approach used in abstractive summarization of textual documents, useful for producing condensed versions of source text narratives without being restricted to using only words from the original text. Despite the advances in abstractive summarization, custom generation of summaries (e.g., towards a user’s preference) remains unexplored. In this article, we present CATS, an abstractive neural summarization model that summarizes content in a sequence-to-sequence fashion while also introducing a new mechanism to control the underlying latent topic distribution of the produced summaries. We empirically illustrate the efficacy of our model in producing customized summaries and present findings that facilitate the design of such systems. We use the well-known CNN/DailyMail dataset to evaluate our model. Furthermore, we present a transfer-learning method and demonstrate the effectiveness of our approach in a low-resource setting, i.e., abstractive summarization of meeting minutes, where combining the main available meeting transcript datasets, AMI and International Computer Science Institute (ICSI), results in merely a few hundred training documents. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

50. Attention based Abstractive Summarization of Malayalam Document.

Author: Nambiar, Sindhya K, Peter S, David, and Idicula, Sumam Mary
Subjects: TEXT messages, ATTENTION
Abstract: There are different text summarization processes available in natural language processing. Among them, abstractive text summarization is one of the most challenging problems, and very little research has been done on it in regional languages. Unlike other summarization techniques, which reuse the words and phrases from the source text, abstractive text summarization builds a short and concise precis of a large text document from the underlying message of the text, not necessarily using the same words and phrases as the source. The objective of the proposed work is to create a brief and understandable abstractive summary of a Malayalam document. Malayalam is one of the 22 scheduled languages of India, is spoken by over 34 million people, and is designated as a Classical Language in India. As a Classical Language, Malayalam has unique syntactic and semantic rules, which makes this work more important. The proposed work attempts to create an attention mechanism to generate the summary of the source document. The goal was to compare the efficiency of the attention model with a sequence-to-sequence baseline model on Malayalam text, and thereby to implement a better abstractive text summarizer for Malayalam documents. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF
