217 results for "TEXT summarization"
Search Results
2. A hierarchical framework based on transformer technology to achieve factual consistent and non-redundant abstractive text summarization
- Author
- Swetha, G. and Kumar, S. Phani
- Published
- 2024
- Full Text
- View/download PDF
3. A survey on the dataset, techniques, and evaluation metric used for abstractive text summarization.
- Author
- Sharma, Shivani, Aggarwal, Gaurav, and Rai, Bipin Kumar
- Subjects
- *TEXT summarization, *AUTOMATIC summarization
- Abstract
Whenever there is too much information available, it is desirable to summarize it. If humans create the summary, it takes a lot of time, so the summarization process can be automated to make it easier and faster and to reduce the time taken to create a summary; this is called automatic summarization. The two approaches to summarization are extractive summarization and abstractive summarization. Extractive summarization and its applications have been the subject of extensive research and have received state-of-the-art solutions, but abstractive summarization is still a progressing field, as it is difficult to create an abstractive summary the way humans do. It also remains an open question how to evaluate the quality of a summary. Therefore, this paper is a comprehensive survey of the datasets used, with their details and statistics, an analysis of various abstractive summarization techniques, and the important parameters for evaluating summary quality. Deep learning based models have given new direction to this field. The authors also focus on the problems and challenges faced in summary generation, which open up future research scope in this domain. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. Surveying the landscape of text summarization with deep learning: A comprehensive review.
- Author
- Wang, Guanghua and Wu, Weili
- Subjects
- *DEEP learning, *TEXT summarization, *ARTIFICIAL neural networks, *NATURAL language processing, *AUTOMATIC summarization
- Abstract
In recent years, deep learning has revolutionized natural language processing (NLP) by enabling the development of models that can learn complex representations of language data, leading to significant improvements in performance across a wide range of NLP tasks. Deep learning models for NLP typically use large amounts of data to train deep neural networks, allowing them to learn the patterns and relationships in language data. This is in contrast to traditional NLP approaches, which rely on hand-engineered features and rules to perform NLP tasks. The ability of deep neural networks to learn hierarchical representations of language data, handle variable-length input sequences, and perform well on large datasets makes them well-suited for NLP applications. Driven by the exponential growth of textual data and the increasing demand for condensed, coherent, and informative summaries, text summarization has been a critical research area in the field of NLP. Applying deep learning to text summarization refers to the use of deep neural networks to perform text summarization tasks. In this survey, we begin with a review of popular text summarization tasks in recent years, including extractive, abstractive, and multi-document summarization. Next, we discuss the main deep learning-based models and their experimental results on these tasks. The paper also covers datasets and data representation for summarization tasks. Finally, we delve into the opportunities and challenges associated with summarization tasks and their corresponding methodologies, aiming to inspire future research efforts to advance the field further. A goal of our survey is to explain how these methods differ in their requirements, since understanding them is essential for choosing a technique suited to a specific setting. This survey aims to provide a comprehensive review of existing techniques, evaluation methodologies, and practical applications of automatic text summarization. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. Automatic text summarization based on extractive-abstractive method
- Author
- Md. Ahsan Habib, Romana Rahman Ema, Tajul Islam, Md. Yasir Arafat, and Mahedi Hasan
- Subjects
- text summarization, extractive summarization, abstractive summarization, sentence ranking algorithm, text generation, noun pronoun conversion, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
- Abstract
The subject of this study has a significant impact on daily life. In various fields, such as journalism, academia, and business, large amounts of text need to be processed quickly and efficiently. Text summarization is a technique used to generate a precise, shortened summary of long texts. The generated summary sustains the overall meaning without losing information and focuses on the parts that contain useful information. The goal is to develop a model that converts lengthy articles into concise versions; the task to be solved is to select an effective procedure for developing the model. Although present text summarization models give good results on many recognized datasets, such as CNN/DailyMail and Newsroom, not all problems can be resolved by these models. In this paper, a new text summarization method is proposed that combines the extractive and abstractive text summarization techniques. In the extractive stage, the model generates a summary using a sentence ranking algorithm and passes the generated summary through an abstractive method. When the sentence ranking algorithm rearranges the sentences, the relationships between sentences are destroyed; to overcome this, pronoun-to-noun conversion is proposed as part of the new system. The proposed abstractive model consists of three pre-trained models (google/pegasus-xsum, facebook/bart-large-cnn, and Yale-LILY/brio-cnndm-uncased) and generates a final summary depending on the maximum final score. Experimental results on the CNN/DailyMail dataset show that the proposed model obtained ROUGE-1, ROUGE-2, and ROUGE-L scores of 42.67 %, 19.35 %, and 39.57 %, respectively. The result has been compared with three state-of-the-art methods: JEANS, DEATS, and PGAN-ATSMT, and outperforms them. Experimental results also show that the proposed model is qualitatively readable and can generate abstractive summaries. Conclusion: in terms of ROUGE score, the model outperforms some state-of-the-art models on ROUGE-1 and ROUGE-L, but does not achieve a good result on ROUGE-2.
- Published
- 2023
- Full Text
- View/download PDF
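The extractive stage described in the abstract above ranks sentences and keeps the top ones before handing them to an abstractive model. The paper's actual Sentence Ranking Algorithm is not given here, so the following is only a hypothetical frequency-based sketch of such a ranker:

```python
import re
from collections import Counter

def rank_sentences(text: str, top_k: int = 2) -> list[str]:
    """Score each sentence by the summed document frequency of its words
    and return the top_k sentences in their original document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))
    # Rank sentence indices by score (sorted is stable, so ties keep order).
    scored = sorted(
        range(len(sentences)),
        key=lambda i: sum(freq[w] for w in re.findall(r"[a-z']+", sentences[i].lower())),
        reverse=True,
    )
    keep = sorted(scored[:top_k])  # restore document order
    return [sentences[i] for i in keep]
```

In a pipeline like the one described, the selected sentences would then be passed to pre-trained abstractive models such as those the abstract names.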
6. Automatic Text Summarization Methods: A Comprehensive Review
- Author
- Sharma, Grishma and Sharma, Deepak
- Published
- 2023
- Full Text
- View/download PDF
7. A novel semantic-enhanced generative adversarial network for abstractive text summarization
- Author
- Vo, Tham
- Published
- 2023
- Full Text
- View/download PDF
8. Improving Coverage and Novelty of Abstractive Text Summarization Using Transfer Learning and Divide and Conquer Approaches.
- Author
- Alomari, Ayham, Idris, Norisma, Md Sabri, Aznul Qalid, and Alsmadi, Izzat
- Subjects
- TEXT summarization
- Abstract
Automatic Text Summarization (ATS) models yield outcomes with insufficient coverage of crucial details and poor degrees of novelty. The first issue results from the lengthy input, while the second results from the characteristics of the training dataset itself. This research employs the divide-and-conquer approach to address the first issue by breaking the lengthy input into smaller pieces to be summarized, then combining the results in order to cover more of the significant details. For the second challenge, these chunks are summarized by models trained on datasets with higher novelty levels in order to produce more human-like and concise summaries containing more novel words that do not appear in the input article. The results demonstrate an improvement in both coverage and novelty levels. Moreover, we defined a new metric to measure the novelty of the summary. Finally, the findings led us to conclude that novelty levels are more significantly influenced by the training dataset itself, as in CNN/DM, than by other factors such as the training model or its training objective, as in Pegasus. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
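The notion of novelty in the abstract above (summary words that do not appear in the input article) can be made concrete. The paper defines its own metric, which is not reproduced here; the sketch below is only a common baseline, the fraction of summary tokens absent from the source vocabulary:

```python
import re

def novelty_ratio(source: str, summary: str) -> float:
    """Fraction of summary word tokens that never appear in the source
    article: 0.0 means fully extractive wording, 1.0 means fully novel."""
    tokenize = lambda t: re.findall(r"[a-z']+", t.lower())
    src_vocab = set(tokenize(source))
    summ = tokenize(summary)
    if not summ:
        return 0.0
    return sum(w not in src_vocab for w in summ) / len(summ)
```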
9. Anaphora resolved abstractive text summarization (AR-ATS) system
- Author
- Moratanch, N. and Chitrakala, S.
- Published
- 2023
- Full Text
- View/download PDF
10. Automatic Short Text Summarization Techniques in Social Media Platforms
- Author
- Fahd A. Ghanem, M. C. Padma, and Ramez Alkhatib
- Subjects
- text summarization, social media, evaluation metrics, abstractive summarization, extractive summarization, machine learning, Information technology, T58.5-58.64
- Abstract
The rapid expansion of social media platforms has resulted in an unprecedented surge of short text content being generated on a daily basis. Extracting valuable insights and patterns from this vast volume of textual data necessitates specialized techniques that can effectively condense information while preserving its core essence. In response to this challenge, automatic short text summarization (ASTS) techniques have emerged as a compelling solution, gaining significant importance in their development. This paper delves into the domain of summarizing short text on social media, exploring various types of short text and the associated challenges they present. It also investigates the approaches employed to generate concise and meaningful summaries. By providing a survey of the latest methods and potential avenues for future research, this paper contributes to the advancement of ASTS in the ever-evolving landscape of social media communication.
- Published
- 2023
- Full Text
- View/download PDF
11. Enhancing N-Gram Based Metrics with Semantics for Better Evaluation of Abstractive Text Summarization
- Author
- He, Jia-Wei, Jiang, Wen-Jun, Chen, Guo-Bang, Le, Yu-Quan, and Ding, Xiao-Fei
- Published
- 2022
- Full Text
- View/download PDF
12. Advancements and Challenges in Text Summarization: An Overview of Methods and Strategies in Brief
- Author
- Yarlagadda, Madhulika, Nadendla, Hanumantha Rao, Rao, Kongara Srinivasa, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Venu Gopal Rao, K., editor, Krishna Prasad, A. V., editor, and Vijaya Bhaskar, Seelam Ch., editor
- Published
- 2024
- Full Text
- View/download PDF
13. Abstractive text summarization: State of the art, challenges, and improvements.
- Author
- Shakil, Hassan, Farooq, Ahmad, and Kalita, Jugal
- Subjects
- *TEXT summarization, *LANGUAGE models, *AUTOMATIC summarization, *KNOWLEDGE representation (Information theory), *RESEARCH personnel, *REINFORCEMENT learning
- Abstract
Specifically focusing on the landscape of abstractive text summarization, as opposed to extractive techniques, this survey presents a comprehensive overview, delving into state-of-the-art techniques, prevailing challenges, and prospective research directions. We categorize the techniques into traditional sequence-to-sequence models, pre-trained large language models, reinforcement learning, hierarchical methods, and multi-modal summarization. Unlike prior works that did not examine complexities, scalability and comparisons of techniques in detail, this review takes a comprehensive approach encompassing state-of-the-art methods, challenges, solutions, comparisons, and limitations, and charts out future improvements, providing researchers an extensive overview to advance abstractive summarization research. We provide vital comparison tables across the categorized techniques, offering insights into model complexity, scalability and appropriate applications. The paper highlights challenges such as inadequate meaning representation, factual consistency, controllable text summarization, cross-lingual summarization, and evaluation metrics, among others. Solutions leveraging knowledge incorporation and other innovative strategies are proposed to address these challenges. The paper concludes by highlighting emerging research areas like factual inconsistency, domain-specific, cross-lingual, multilingual, and long-document summarization, as well as handling noisy data. Our objective is to provide researchers and practitioners with a structured overview of the domain, enabling them to better understand the current landscape and identify potential areas for further research and improvement.
• Overview of state-of-the-art techniques in abstractive text summarization.
• Comparative analysis of models in abstractive summarization.
• Identification of challenges and potential improvements in the field.
• Exploration of future research directions and emerging frontiers.
• Holistic survey of abstractive text summarization. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Neural Attention Model for Abstractive Text Summarization Using Linguistic Feature Space
- Author
- Aniqa Dilawari, Muhammad Usman Ghani Khan, Summra Saleem, Zahoor-Ur-Rehman, and Fatema Sabeen Shaikh
- Subjects
- Abstractive summarization, encoder-decoder, extractive summarization, feature rich model, linguistic features, summarization evaluation, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Summarization generates a brief, concise summary that portrays the main idea of the source text. There are two forms of summarization: abstractive and extractive. Extractive summarization chooses important sentences from the text to form a summary, whereas abstractive summarization paraphrases the text, adding novel words or phrases to produce explanations closer to those a human would write. For a human annotator, producing a summary of a document is time-consuming and expensive because it requires going through the long document and composing a short summary. An automatic feature-rich model for text summarization is proposed that can reduce the amount of labor and produce a quick summary by using both extractive and abstractive approaches. A feature-rich extractor highlights the important sentences in the text, and linguistic characteristics are used to enhance results. The extracted summary is then fed to an abstracter that adds further information using features such as named entity tags, part-of-speech tags, and term weights. Furthermore, a loss function is introduced to normalize the inconsistency between word-level and sentence-level attentions. The proposed two-staged network achieved a ROUGE score of 37.76% on the benchmark CNN/DailyMail dataset, outperforming earlier work. Human evaluation is also conducted to measure the comprehensiveness, conciseness and informativeness of the generated summary.
- Published
- 2023
- Full Text
- View/download PDF
15. A Novel Gravity Optimization Algorithm for Extractive Arabic Text Summarization.
- Author
- Hadi, Mustafa J., Abbas, Ayad R., and Fadhil, Osamah Y.
- Subjects
- OPTIMIZATION algorithms, TEXT summarization, METAHEURISTIC algorithms, AUTOMATIC summarization, ARABIC language, GRAVITY
- Abstract
Copyright of Baghdad Science Journal is the property of Republic of Iraq Ministry of Higher Education & Scientific Research (MOHESR) and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
16. Text Summarization Using Document and Sentence Clustering.
- Author
- Pawar, Sumathi, Manjula Gururaj, H, and Chiplunar, Niranajan N
- Subjects
- DOCUMENT clustering, INFORMATION overload, INFORMATION retrieval, INFORMATION needs
- Abstract
Text documents contain important information and can be very large. Retrieving the relevant information from a text document is a major challenge in the field of information retrieval, and it can be addressed with text summarization. A summarizing system compresses a text document into a new form that conveys the core idea of its content. The problem of information overload demands access to reliable, properly crafted summaries, and with such data minimization users can quickly find the information they need. Saving the time and effort of browsing through an entire collection of documents is the main advantage of text summarization. The proposed system focuses on an extractive technique of text summarization using a text clustering and word-graph approach. It uses term frequency-inverse document frequency (TF-IDF), Jaccard similarity, and Euclidean distance, which are important techniques for clustering text. This hybrid approach provides a novel method comprising document and sentence clustering. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
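The clustering features named in the abstract above, TF-IDF weighting and Jaccard similarity, can be illustrated with a minimal stdlib sketch. This is a generic textbook formulation, not the paper's exact system:

```python
import math
import re
from collections import Counter

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two sentences."""
    sa, sb = (set(re.findall(r"[a-z']+", t.lower())) for t in (a, b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """TF-IDF weight per term per document, with smoothed IDF
    log((1 + N) / (1 + df)); terms in every document get weight 0."""
    toks = [re.findall(r"[a-z']+", d.lower()) for d in docs]
    df = Counter(w for t in toks for w in set(t))
    n = len(docs)
    return [
        {w: (c / len(t)) * math.log((1 + n) / (1 + df[w]))
         for w, c in Counter(t).items()}
        for t in toks
    ]
```

A clustering step would then group sentences whose pairwise similarity (Jaccard, or cosine over these TF-IDF vectors) exceeds a threshold.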
17. Feature Based Automatic Text Summarization Methods: A Comprehensive State-of-the-Art Survey
- Author
- Divakar Yadav, Rishabh Katna, Arun Kumar Yadav, and Jorge Morato
- Subjects
- Abstractive summarization, cosine-similarity, deep learning, extractive summarization, graph-based algorithm, neural networks, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
With the advent of the World Wide Web, there are numerous online platforms that generate huge amounts of textual material, including social networks, online blogs, magazines, etc. This textual content contains useful information that can be used to advance humanity. Text summarization has been a significant area of research in natural language processing (NLP). With the expansion of the internet, the amount of data in the world has exploded. Large volumes of data make locating the required and best information time-consuming, and it is impractical to manually summarize petabytes of data; hence, computerized text summarization is rising in popularity. This study presents a comprehensive overview of the current status of text summarizing approaches, techniques, standard datasets, assessment criteria, and future research directions. The summarizing approaches are assessed based on several characteristics, including the approach used, the number of documents, the summarization domain, the document language, the nature of the output summary, etc. This study concludes with a discussion of the many obstacles and research opportunities linked to text summarization research that may be relevant for future researchers in this field.
- Published
- 2022
- Full Text
- View/download PDF
18. Abstractive text summarization of low-resourced languages using deep learning
- Author
- Nida Shafiq, Isma Hamid, Muhammad Asif, Qamar Nawaz, Hanan Aljuaid, and Hamid Ali
- Subjects
- Urdu, Abstractive summarization, LSTM, BERT2BERT, Pars-BERT, Seq-to-Seq, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Background: Humans must be able to cope with the huge amounts of information produced by the information technology revolution. As a result, automatic text summarization is being employed in a range of industries to assist individuals in identifying the most important information. Two approaches to text summarization are mainly considered: extractive and abstractive. The extractive approach selects chunks of sentences from the source documents, while the abstractive approach can generate a summary based on mined keywords. For low-resourced languages, e.g., Urdu, extractive summarization uses various models and algorithms; however, the study of abstractive summarization in Urdu is still a challenging task. Because there are so many literary works in Urdu, producing abstractive summaries demands extensive research. Methodology: This article proposed a deep learning model for the Urdu language using the Urdu 1 Million news dataset and compared its performance with two widely used machine learning methods, support vector machine (SVM) and logistic regression (LR). The results show that the suggested deep learning model performs better than the other two approaches. The summaries produced by extractive methods are processed using the encoder-decoder paradigm to create an abstractive summary. Results: With the help of Urdu language specialists, the system-generated summaries were validated, showing the proposed model's improvement and accuracy.
- Published
- 2023
- Full Text
- View/download PDF
19. Enhancing abstractive summarization of implicit datasets with contrastive attention.
- Author
- Kwon, Soonki and Lee, Younghoon
- Subjects
- *TEXT summarization, *AUTOMATIC summarization, *LANGUAGE models, *VANILLA
- Abstract
It is important for abstractive summarization models to understand the important parts of the original document and create a natural summary accordingly. Recently, studies have been conducted to incorporate important parts of the original document during learning and have shown good performance. However, these studies are effective for explicit datasets but not implicit datasets which are relatively more abstract. This study addresses the challenge of summarizing implicit datasets, which have a lower deviation in the significance of important sentences compared to explicit datasets. A multi-task learning approach that reflects information about salient and incidental objects during the learning process was proposed. This was achieved by adding a contrastive objective to the fine-tuning process of the encoder-decoder language model. The salient and incidental parts were selected based on the ROUGE-L F1 score and their relationships were learned through triplet loss. The proposed method was evaluated using five benchmark summarization datasets, including two explicit and three implicit. The experimental results showed a greater improvement in implicit datasets, particularly for the highly abstractive XSum dataset, compared to the vanilla fine-tuning method in both the BART-base and T5-small models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. AUTOMATIC TEXT SUMMARIZATION BASED ON EXTRACTIVE-ABSTRACTIVE METHOD.
- Author
- HABIB, Md. Ahsan, EMA, Romana Rahman, ISLAM, Tajul, ARAFAT, Md. Yasir, and HASAN, Mahedi
- Subjects
- TEXT summarization, DECISION support systems, SEMANTICS, DATA encryption, REINFORCEMENT learning
- Abstract
The subject of this study has a significant impact on daily life. In various fields, such as journalism, academia, and business, large amounts of text need to be processed quickly and efficiently. Text summarization is a technique used to generate a precise, shortened summary of long texts. The generated summary sustains the overall meaning without losing information and focuses on the parts that contain useful information. The goal is to develop a model that converts lengthy articles into concise versions; the task to be solved is to select an effective procedure for developing the model. Although present text summarization models give good results on many recognized datasets, such as CNN/DailyMail and Newsroom, not all problems can be resolved by these models. In this paper, a new text summarization method is proposed that combines the extractive and abstractive text summarization techniques. In the extractive stage, the model generates a summary using a sentence ranking algorithm and passes the generated summary through an abstractive method. When the sentence ranking algorithm rearranges the sentences, the relationships between sentences are destroyed; to overcome this, pronoun-to-noun conversion is proposed as part of the new system. The proposed abstractive model consists of three pre-trained models (google/pegasus-xsum, facebook/bart-large-cnn, and Yale-LILY/brio-cnndm-uncased) and generates a final summary depending on the maximum final score. Experimental results on the CNN/DailyMail dataset show that the proposed model obtained ROUGE-1, ROUGE-2, and ROUGE-L scores of 42.67 %, 19.35 %, and 39.57 %, respectively. The result has been compared with three state-of-the-art methods: JEANS, DEATS, and PGAN-ATSMT, and outperforms them. Experimental results also show that the proposed model is qualitatively readable and can generate abstractive summaries. Conclusion: in terms of ROUGE score, the model outperforms some state-of-the-art models on ROUGE-1 and ROUGE-L, but does not achieve a good result on ROUGE-2. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
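Several entries in this list report ROUGE-1, ROUGE-2, and ROUGE-L scores. As a reading aid, ROUGE-N is the F1 over clipped n-gram overlap between a candidate summary and a reference; the sketch below is a simplified single-reference version, not the official ROUGE toolkit:

```python
import re
from collections import Counter

def rouge_n(candidate: str, reference: str, n: int = 1) -> float:
    """Clipped n-gram overlap F1 between a candidate summary and one
    reference summary (simplified single-reference ROUGE-N)."""
    def ngrams(text: str) -> Counter:
        toks = re.findall(r"[a-z']+", text.lower())
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    c, r = ngrams(candidate), ngrams(reference)
    if not c or not r:
        return 0.0
    overlap = sum((c & r).values())  # Counter & clips each n-gram count
    prec, rec = overlap / sum(c.values()), overlap / sum(r.values())
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```

ROUGE-L, also quoted above, instead scores the longest common subsequence between candidate and reference rather than fixed-length n-grams.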
21. Automatic Short Text Summarization Techniques in Social Media Platforms.
- Author
- Ghanem, Fahd A., Padma, M. C., and Alkhatib, Ramez
- Subjects
- TEXT summarization, SOCIAL media, USER-generated content
- Abstract
The rapid expansion of social media platforms has resulted in an unprecedented surge of short text content being generated on a daily basis. Extracting valuable insights and patterns from this vast volume of textual data necessitates specialized techniques that can effectively condense information while preserving its core essence. In response to this challenge, automatic short text summarization (ASTS) techniques have emerged as a compelling solution, gaining significant importance in their development. This paper delves into the domain of summarizing short text on social media, exploring various types of short text and the associated challenges they present. It also investigates the approaches employed to generate concise and meaningful summaries. By providing a survey of the latest methods and potential avenues for future research, this paper contributes to the advancement of ASTS in the ever-evolving landscape of social media communication. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
22. Inclusive Review on Extractive and Abstractive Text Summarization: Taxonomy, Datasets, Techniques and Challenges
- Author
- Mishra, Gitanjali, Sethi, Nilambar, Agilandeeswari, L., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Abraham, Ajith, editor, Pllana, Sabri, editor, Casalino, Gabriella, editor, Ma, Kun, editor, and Bajaj, Anu, editor
- Published
- 2023
- Full Text
- View/download PDF
23. Survey of Text Summarization Stratification
- Author
- Jamwal, Arvind, Singh, Pardeep, Kumari, Namrata, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Singh, Yashwant, editor, Verma, Chaman, editor, Zoltán, Illés, editor, Chhabra, Jitender Kumar, editor, and Singh, Pradeep Kumar, editor
- Published
- 2023
- Full Text
- View/download PDF
24. A Systematic Survey of Automatic Text Summarization Using Deep Learning Techniques
- Author
- Yadav, Madhuri, Katarya, Rahul, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Agrawal, Rajeev, editor, Kishore Singh, Chandramani, editor, Goyal, Ayush, editor, and Singh, Dinesh Kumar, editor
- Published
- 2023
- Full Text
- View/download PDF
25. Abstractive Text Summarization for Tamil Language Using m-T5
- Author
- Saraswathi, C., Prinitha, V., Briskilal, J., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Suma, V., editor, Lorenz, Pascal, editor, and Baig, Zubair, editor
- Published
- 2023
- Full Text
- View/download PDF
26. Cross-Domain Document Summarization Model via Two-Stage Curriculum Learning.
- Author
- Lee, Seungsoo, Kim, Gyunyeop, and Kang, Sangwoo
- Subjects
- AUTOMATIC summarization, TEXT summarization
- Abstract
Generative document summarization is a natural language processing technique that generates short summary sentences while preserving the content of long texts. Various fine-tuned pre-trained document summarization models have been proposed using a specific single text-summarization dataset. However, each text-summarization dataset usually specializes in a particular downstream task. Therefore, it is difficult to treat all cases involving multiple domains using a single dataset. Accordingly, when a generative document summarization model is fine-tuned to a specific dataset, it performs well, whereas the performance is degraded by up to 45% for datasets that are not used during learning. In short, summarization models perform well with in-domain cases, as the dataset domain during training and evaluation is the same but perform poorly with out-domain inputs. In this paper, we propose a new curriculum-learning method using mixed datasets while training a generative summarization model to be more robust on out-domain datasets. Our method performed better than XSum with 10%, 20%, and 10% lower performance degradation in CNN/DM, which comprised one of two test datasets used, compared to baseline model performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
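The mixed-dataset curriculum idea in the abstract above can be sketched as follows; the linear mixing schedule, pool names, and 50% cap below are hypothetical illustrations, not the authors' actual method:

```python
import random

def curriculum_mix(in_domain, out_domain, num_steps, seed=0):
    """Yield one training example per step, linearly raising the share of
    out-of-domain examples from 0% to 50% over the run (a hypothetical
    schedule for illustration, not the paper's exact curriculum)."""
    rng = random.Random(seed)
    for step in range(num_steps):
        p_out = 0.5 * step / max(1, num_steps - 1)  # grows from 0.0 to 0.5
        pool = out_domain if rng.random() < p_out else in_domain
        yield rng.choice(pool)

# Toy stand-ins for (document, summary) pairs from two domains
cnn_dm = [f"cnn_{i}" for i in range(100)]
xsum = [f"xsum_{i}" for i in range(100)]

batch = list(curriculum_mix(cnn_dm, xsum, num_steps=1000))
early = sum(x.startswith("xsum") for x in batch[:100])
late = sum(x.startswith("xsum") for x in batch[-100:])
print(early, late)
```

Early training steps draw almost entirely from the in-domain pool, while later steps see roughly an even mixture, so the model is eased into the out-of-domain distribution.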
27. Review on Recent Advances in Text Summarization Techniques
- Author
-
Vinitha, M., Vasundra, S., Powers, David M. W., Series Editor, Leibbrandt, Richard, Series Editor, Kumar, Amit, editor, Ghinea, Gheorghita, editor, and Merugu, Suresh, editor
- Published
- 2023
- Full Text
- View/download PDF
28. Towards End-to-End Speech-to-Text Summarization
- Author
-
Monteiro, Raul, Pernes, Diogo, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Ekštein, Kamil, editor, Pártl, František, editor, and Konopík, Miloslav, editor
- Published
- 2023
- Full Text
- View/download PDF
29. A Systematic Survey of Text Summarization Techniques.
- Author
-
Kochrekar, Shivangi, Kale, Neha, Mehta, Darsh, and Ghane, Sunil
- Subjects
TEXT summarization - Abstract
Text summarization is a technique by which a long document can be converted into a brief one while retaining its core information. A lot of progress has been made in this domain over the past few years: researchers have been able to summarize single-page as well as multi-page documents. However, the biggest remaining challenge is ordering sentences correctly and then forming a coherent summary from them. In this paper, we provide a detailed comparison of the different summarization techniques put forward by researchers, highlighting the approaches used along with their results and disadvantages. This will help researchers learn about the topic in a comprehensive manner, identify gaps in existing approaches, and come up with novel approaches to address them. [ABSTRACT FROM AUTHOR]
- Published
- 2023
30. An Analysis of Abstractive Text Summarization Using Pre-trained Models
- Author
-
Rehman, Tohida, Das, Suchandan, Sanyal, Debarshi Kumar, Chattopadhyay, Samiran, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Mandal, Lopa, editor, Tavares, Joao Manuel R. S., editor, and Balas, Valentina E., editor
- Published
- 2022
- Full Text
- View/download PDF
31. A Comparison Study of Abstractive and Extractive Methods for Text Summarization
- Author
-
Bhargav, Shashank, Choudhury, Abhinav, Kaushik, Shruti, Shukla, Ravindra, Dutt, Varun, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Dua, Mohit, editor, Jain, Ankit Kumar, editor, Yadav, Anupam, editor, Kumar, Nitin, editor, and Siarry, Patrick, editor
- Published
- 2022
- Full Text
- View/download PDF
32. Abstractive Text Summarization Using Hierarchical Reinforcement Learning
- Author
-
Koupaee, Mahnaz
- Subjects
Computer science ,Abstractive summarization ,Hierarchical reinforcement learning ,Reinforcement learning ,Text summarization - Abstract
Sequence-to-sequence models have recently achieved state-of-the-art performance in summarization. However, few large-scale, high-quality datasets are available, and almost all of them consist mainly of news articles with a specific writing style. Moreover, abstractive, human-style systems that describe content at a deeper level require data with higher levels of abstraction. On the other hand, attention-based sequence-to-sequence neural networks, which optimize log-likelihood at the word level or discrete metrics such as ROUGE at the sequence level, have achieved promising results on abstractive text summarization, but they are far from perfect: the first group of models may fail to handle out-of-vocabulary words and often produce repetitive words and incorrect facts, while the latter methods, trained with reinforcement learning, beat state-of-the-art methods in terms of discrete evaluation metrics yet produce unreadable, sometimes irrelevant summaries. We first present WikiHow, a dataset of more than 230,000 article-summary pairs extracted and constructed from an online knowledge base written by different human authors. The articles span a wide range of topics and therefore represent highly diverse styles. We also evaluate the performance of existing methods on WikiHow to present its challenges and set baselines for further improvement. Moreover, to overcome the problems of existing summarization systems, we propose a novel hierarchical reinforcement learning architecture that makes decisions in two steps: a high-level policy decides on the sub-goal for generating the next chunk of the summary, and a low-level policy performs primitive actions to fulfill the specified goal. By reinforcing summarization at different levels, our proposed model outperforms existing approaches in terms of ROUGE and METEOR scores.
- Published
- 2018
33. Creating a tool for automatic text summarization
- Author
-
Mihalić, Danko and Kocijan, Kristina
- Subjects
obrada prirodnog jezika ,automatic text summarization ,abstractive summarization ,sažimanje temeljeno na neuronskoj mreži ,SOCIAL SCIENCES. Information and Communication Sciences ,automatsko sažimanje teksta ,ekstraktivne metode sažimanja ,python ,apstraktivne metode sažimanja ,neural network summarization ,sažimanje temeljeno na grafovima ,natural language processing ,extractive summarization ,graph-based summarization ,DRUŠTVENE ZNANOSTI. Informacijske i komunikacijske znanosti - Abstract
The goal of this thesis is to give an overview of the theory of automatic text summarization and to demonstrate, by example, a practical software implementation of text summarization. The results of these implementations are compared on Croatian and English texts, with a look at their respective advantages and drawbacks.
- Published
- 2022
34. Summary-Sentence Level Hierarchical Supervision for Re-Ranking Model of Two-Stage Abstractive Summarization Framework.
- Author
-
Yoo, Eunseok, Kim, Gyunyeop, and Kang, Sangwoo
- Subjects
- *
TEXT summarization , *LANGUAGE models , *AUTOMATIC summarization , *NATURAL language processing , *STOCHASTIC programming , *PREDICATE calculus - Abstract
Fine-tuning a pre-trained sequence-to-sequence language model has significantly advanced the field of abstractive summarization. However, early abstractive summarization models were limited by the gap between training and inference and did not fully utilize the potential of the language model. Recent studies have introduced a two-stage framework in which a second-stage model re-ranks the candidate summaries generated by the first-stage model, to resolve these limitations. In this study, we point out that the supervision method used in the existing re-ranking model of the two-stage abstractive summarization framework cannot learn the detailed and complex information in the data. In addition, we identify the problem of positional bias in the existing encoder-decoder-based re-ranking model. To address these two limitations, we propose a hierarchical supervision method that jointly performs summary-level and sentence-level supervision. For sentence-level supervision, we design two sentence-level loss functions: intra- and inter-intra-sentence ranking losses. Compared to the existing abstractive summarization model, the proposed method improved performance on both the CNN/DM and XSum datasets, and the proposed model outperformed the baseline model under a few-shot setting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
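A generic candidate-level ranking loss of the kind used in such re-ranking frameworks (in the style of BRIO/SimCLS; the paper's intra- and inter-intra-sentence losses are more elaborate) can be sketched as a pairwise margin objective over candidate summaries sorted best-first:

```python
def ranking_loss(scores, margin=0.01):
    """Pairwise margin ranking loss over candidates sorted best-first:
    candidate i should outscore candidate j (i < j) by at least
    (j - i) * margin. The margin value is an illustrative assumption."""
    loss = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss

# Model scores for three candidate summaries, already sorted best-first
perfect = [0.9, 0.5, 0.1]   # respects the ordering with ample margin
inverted = [0.1, 0.5, 0.9]  # violates every pairwise constraint
print(ranking_loss(perfect), ranking_loss(inverted))
```

The loss is zero when the re-ranker's scores already respect the quality ordering and grows with every violated pair, pushing the model to score better candidates higher.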
35. Enhancing Abstractive Summarization with Pointer Generator Networks and Coverage Mechanisms in NLP
- Author
-
Yarlagadda, Madhulika and Nadendla, Hanumantha Rao
- Published
- 2024
- Full Text
- View/download PDF
36. A Comparative Analysis of Automatic Extractive and Abstractive Text Summarization
- Author
-
Yadav, Madhuri, Katarya, Rahul, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Goyal, Vishal, editor, Gupta, Manish, editor, Mirjalili, Seyedali, editor, and Trivedi, Aditya, editor
- Published
- 2022
- Full Text
- View/download PDF
37. Review of Text Summarization in Indian Regional Languages
- Author
-
Thapa, Surendrabikram, Adhikari, Surabhi, Mishra, Sushruti, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Abraham, Ajith, editor, Castillo, Oscar, editor, and Virmani, Deepali, editor
- Published
- 2021
- Full Text
- View/download PDF
38. Jointly Extractive and Abstractive Training Paradigm for Text Summarization
- Author
-
Gao, Yang, Li, Shasha, Wang, Pancheng, Wang, Ting, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
- Published
- 2024
- Full Text
- View/download PDF
39. AI‐based abstractive text summarization towards AIoT and edge computing.
- Author
-
Ma, Jun, Li, Tong, and Zhang, Yanling
- Abstract
The task of text summarization has attracted wide attention in the fields of news, document indexing, and literature retrieval. Recently, with the rise of mobile Internet devices, natural language processing for AIoT and edge computing has become a research hotspot, and this paper focuses on text summarization for AIoT and edge computing. For a long time, abstractive summarization was limited to academic research because the generated content could not be controlled. Recently, the appearance of the Transformer has changed this situation. The Transformer follows an encoder-decoder architecture, including an attention mechanism and a feed-forward network: the encoder encodes the semantic information of the source text, and the decoder adaptively selects effective context information through the attention mechanism to generate a coherent summary. To extract more semantic information and better control the generated text, this paper proposes the multi-scale semantic information Transformer (MSIT). Specifically, we introduce depth-wise separable convolution into the encoder to extract more local semantic information, so that the attention mechanism can make better use of contextual semantics. Additionally, we combine the encoder's output vector and the target summary as input to the attention layer of the decoder, and we introduce a time-series mechanism so that the decoder can consider context information when generating text. Experiments on the CNN-Daily Mail dataset show that this model is superior to other methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
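Depth-wise separable convolution, which the paper above introduces into the encoder, factors a convolution into a per-channel (depthwise) pass followed by a 1x1 channel-mixing (pointwise) pass. A minimal pure-Python sketch over toy token features, not MSIT's actual layer:

```python
def depthwise_separable_conv1d(x, dw_kernels, pw_weights):
    """Minimal 1D depthwise separable convolution (no padding, stride 1).
    x: list of time steps, each a list of C channel values.
    dw_kernels: per-channel kernels, shape [C][K] (depthwise step).
    pw_weights: pointwise 1x1 mixing matrix, shape [C_out][C]."""
    C, K = len(dw_kernels), len(dw_kernels[0])
    T = len(x)
    # Depthwise: each channel is convolved with its own kernel
    dw = []
    for t in range(T - K + 1):
        dw.append([sum(dw_kernels[c][k] * x[t + k][c] for k in range(K))
                   for c in range(C)])
    # Pointwise: a 1x1 convolution mixes channels at each position
    return [[sum(w[c] * frame[c] for c in range(C)) for w in pw_weights]
            for frame in dw]

# 4 time steps, 2 channels; kernel size 2; 1 output channel
x = [[1, 0], [2, 1], [3, 0], [4, 1]]
dw_kernels = [[1, 1], [1, -1]]  # one kernel per input channel
pw_weights = [[1, 1]]           # sum the two channels
print(depthwise_separable_conv1d(x, dw_kernels, pw_weights))  # [[2], [6], [6]]
```

The factorization uses C*K + C_out*C weights instead of C_out*C*K for a standard convolution, which is why it suits the edge-computing setting the paper targets.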
40. Abstractive summarization with deep reinforcement learning using semantic similarity rewards.
- Author
-
Beken Fikri, Figen, Oflazer, Kemal, and Yanıkoğlu, Berrin
- Subjects
DEEP reinforcement learning ,TEXT summarization ,MACHINE learning ,AUTOMATIC summarization ,REINFORCEMENT learning ,NATURAL languages - Abstract
Abstractive summarization is an approach to document summarization that is not limited to selecting sentences from the document but can generate new sentences as well. We address the two main challenges in abstractive summarization: how to evaluate the performance of a summarization model and what is a good training objective. We first introduce new evaluation measures based on the semantic similarity of the input and corresponding summary. The similarity scores are obtained by the fine-tuned BERTurk model using either the cross-encoder or a bi-encoder architecture. The fine-tuning is done on the Turkish Natural Language Inference and Semantic Textual Similarity benchmark datasets. We show that these measures have better correlations with human evaluations compared to Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scores and BERTScore. We then introduce a deep reinforcement learning algorithm that uses the proposed semantic similarity measures as rewards, together with a mixed training objective, in order to generate more natural summaries in terms of human readability. We show that training with a mixed training objective function compared to only the maximum-likelihood objective improves similarity scores. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
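The semantic-similarity reward idea above can be illustrated with cosine similarity between sentence embeddings folded into a mixed objective; the vectors and the gamma weighting below are toy assumptions, not the fine-tuned BERTurk scores:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def mixed_loss(nll, reward, gamma=0.5):
    """Mixed training objective: interpolate the maximum-likelihood loss
    with a (negated) similarity reward. gamma is a hypothetical weight."""
    return gamma * nll + (1 - gamma) * (1.0 - reward)

# Toy vectors standing in for bi-encoder sentence embeddings
doc_emb = [0.2, 0.8, 0.1]
good_summary_emb = [0.25, 0.75, 0.05]
bad_summary_emb = [0.9, 0.1, 0.4]

r_good = cosine_similarity(doc_emb, good_summary_emb)
r_bad = cosine_similarity(doc_emb, bad_summary_emb)
print(round(r_good, 3), round(r_bad, 3))
```

A summary whose embedding lies close to the document's receives a higher reward and hence a lower mixed loss, which is the signal the reinforcement learner maximizes.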
41. Automatic Text Summarization: Methods, Metrics and Datasets
- Author
-
Varghese, Tiju George, Priya, C. V., Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Mumtaz, Shahid, editor, Rawat, Danda B., editor, and Menon, Varun G., editor
- Published
- 2024
- Full Text
- View/download PDF
42. Text summarization using natural language processing methods with application to news articles
- Author
-
Zrnić, Leon and Šnajder, Jan
- Subjects
enkoder-dekoder struktura ,transformeri ,TECHNICAL SCIENCES. Computing ,TEHNIČKE ZNANOSTI. Računarstvo ,ROUGE metrics ,abstractive summarization ,apstraktivna sumarizacija ,novinski članci ,transformers ,encoder-decoder structure ,ROUGE metrike ,news articles ,BERT - Abstract
Abstractive text summarization is a powerful tool for boiling down textual information in today's world. In this thesis, we implemented a small, fine-tuned sequence-to-sequence BERT model to carry out this task. We compared this model with two baseline models: the Sequential Sentence Selection model, which builds summaries by extracting sentences from the original text (extractive summarization), and a BERT abstractive summarizer without any fine-tuning, which considers the context of the text and builds a coherent summary from it. The models were trained and tested on a subset of the Newsroom dataset, currently one of the largest news-article datasets in the world, encompassing over 1.3 million articles from 38 major publishing houses. Each article is accompanied by a summary written by its author, against which we compared the model-generated summaries. The fine-tuned BERT model achieved a ROUGE-2 score of 18.79 on the test set, significantly better than the two baseline models, which achieved ROUGE-2 scores of 7.21 and 14.12, respectively. This shows that fine-tuning an abstractive summarization model, even on a high-variance dataset such as Newsroom, can achieve better results than those currently reported in papers on small summarization models. This remains far from state-of-the-art models such as GPT-3 or Pegasus, but it shows that even small models can achieve satisfying results.
- Published
- 2022
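ROUGE-2, the metric reported in the thesis above, measures bigram overlap between a candidate and a reference summary. A simplified F1 sketch, without the stemming or multi-reference handling of the official ROUGE toolkit:

```python
from collections import Counter

def rouge_2_f1(candidate, reference):
    """Minimal ROUGE-2 F1: clipped bigram overlap between a candidate
    summary and a reference summary."""
    def bigrams(text):
        toks = text.lower().split()
        return Counter(zip(toks, toks[1:]))
    cand, ref = bigrams(candidate), bigrams(reference)
    if not cand or not ref:
        return 0.0
    overlap = sum((cand & ref).values())  # & clips counts per bigram
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

ref = "the storm caused severe flooding in the city"
cand = "the storm caused flooding in the city"
print(round(rouge_2_f1(cand, ref), 3))  # 0.769
```

Here 5 of the candidate's 6 bigrams appear among the reference's 7, giving precision 5/6, recall 5/7, and F1 = 10/13.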
43. An Intelligent Tree Extractive Text Summarization Deep Learning.
- Author
-
AlArfaj, Abeer Abdulaziz and Hosni Mahmoud, Hanan Ahmed
- Subjects
DEEP learning ,KNOWLEDGE representation (Information theory) ,MACHINE learning ,ENTROPY (Information theory) ,NATURAL languages - Abstract
In recent research, deep learning algorithms have provided effective representation-learning models for natural languages. Deep learning-based models create better data representations than classical models and are capable of automatically extracting distributed representations of texts. In this research, we introduce a new tree-based extractive text summarization model, characterized by fitting the text-structure representation into the knowledge-base training module, which also addresses memory issues not addressed before. The proposed model employs a tree-structured mechanism to generate phrase and text embeddings. The proposed architecture mimics the tree configuration of the texts and provides better feature representation. It also incorporates an attention mechanism that offers an additional information source for better summary extraction. The model treats text summarization as a classification process, calculating the probabilities of phrase and text-summary association. The classification considers multiple features such as information entropy, significance, redundancy, and position. The model was assessed on two datasets, the Multi-Doc Composition Query (MCQ) dataset and the Dual Attention Composition (DAC) dataset. The experimental results show that our proposed model achieves better summarization precision than other models by a considerable margin. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
44. Abstractive Text Summarization Using Deep Learning
- Author
-
Anđelković, Ivan and Šilić, Marin
- Subjects
Pažnja ,Transformer ,Pre training ,TEHNIČKE ZNANOSTI. Računarstvo ,CNN Daily Mail skup ,CNN Daily Mail Dataset ,Deep learning ,BRIO ,Apstraktivno sažimanje ,Huggingface ,Abstractive Summarization ,TECHNICAL SCIENCES. Computing ,Pred treniranje ,Attention ,BART ,ROUGE ,Duboko učenje ,LSTM ,BERT - Abstract
Abstractive summarization is one of the two directions in machine text summarization, alongside extractive summarization. The model architectures used for tasks in this area are based on the idea of the transformer, which consists of an encoder-decoder pair. The currently most successful models rely on unsupervised pre-training on large datasets followed by parameter fine-tuning on smaller datasets using supervised learning. In this thesis, three such models for abstractive text summarization were tested and validated: BERT, BART, and BRIO, evaluated using ROUGE metrics. The goal is to extract the essence of a text while reducing its size. Proper generation of summaries requires knowledge of the entire language so that appropriate words can be selected from the vocabulary. Current models are successful in creating summaries whose content matches, or condenses, the original text: they understand the meaning of the input text and use it as a reference when generating new, shorter text. Work in the field of abstractive text summarization requires large amounts of time and computing power, which still makes such tasks challenging.
- Published
- 2022
45. T5-Based Model for Abstractive Summarization: A Semi-Supervised Learning Approach with Consistency Loss Functions.
- Author
-
Wang, Mingye, Xie, Pan, Du, Yao, and Hu, Xiaohui
- Subjects
TEXT summarization ,NATURAL language processing ,SUPERVISED learning ,CHINESE language - Abstract
Text summarization is a prominent task in natural language processing (NLP) that condenses lengthy texts into concise summaries. Despite the success of existing supervised models, they often rely on datasets of well-constructed text pairs, which can be insufficient for languages with limited annotated data, such as Chinese. To address this issue, we propose a semi-supervised learning method for text summarization. Our method is inspired by the cycle-consistent adversarial network (CycleGAN) and considers text summarization as a style transfer task. The model is trained by using a similar procedure and loss function to those of CycleGAN and learns to transfer the style of a document to its summary and vice versa. Our method can be applied to multiple languages, but this paper focuses on its performance on Chinese documents. We trained a T5-based model and evaluated it on two datasets, CSL and LCSTS, and the results demonstrate the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
46. Fine-tuning and multilingual pre-training for abstractive summarization task for the Arabic language.
- Author
-
Kahla, Mram, Novák, Attila, and Zijian Győző Yang
- Subjects
- *
TEXT summarization , *ARABIC language - Abstract
The main task of our research is to train various abstractive summarization models for the Arabic language. Work on abstractive Arabic text summarization has hardly begun so far due to the unavailability of the necessary datasets. In our previous research, we created the first monolingual Arabic corpus for abstractive text summarization and fine-tuned various transformer models on it. We tested the PreSumm and multilingual BART models and achieved a state-of-the-art result in this area with the PreSumm method. The present study continues the same series of research. We extended our corpus, AraSum, to 50 thousand items, each consisting of an article and its corresponding lead. In addition, we pre-trained our own monolingual and trilingual BART models for Arabic and fine-tuned them, along with the mT5 model, for abstractive text summarization in the same language using the AraSum corpus. While there is room for improvement in the resources and infrastructure we possess, the results clearly demonstrate that most of our models surpassed XL-Sum, which is considered the state of the art for abstractive Arabic text summarization so far. Our corpus AraSum will be released to facilitate future work on abstractive Arabic text summarization. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
47. Text Summarization Based on Several Natural Language Techniques
- Author
-
Abeer Khalid Ahmad
- Subjects
keywords extraction ,morphological rules ,extractive summarization ,abstractive summarization ,Science ,Technology - Abstract
Because of the great amount of information provided by Internet technologies, automatic text summarization has become more important. This paper describes a method for summarizing English text based on extractive summarization. The method combines several statistical and linguistic techniques, especially ones based on morphological rules; the linguistic features also include synonyms, word frequencies, word position, and part of speech. It is shown that merging many statistical and linguistic approaches in one system gives highly accurate results at low threshold values. The system was tested to find the best threshold value, which was 60%.
- Published
- 2014
- Full Text
- View/download PDF
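The statistical side of such a method (word frequency plus sentence position with a score threshold) can be sketched as follows; the stopword list, position bonus, and default ratio are illustrative assumptions, and the paper's morphological and synonym handling is omitted:

```python
import re
from collections import Counter

# A tiny illustrative stopword list, not the paper's
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "it",
             "that", "for"}

def extractive_summary(text, ratio=0.6):
    """Score sentences by normalized word frequency plus a position bonus,
    then keep those scoring above ratio * best_score."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)
    top = max(freq.values())
    scores = []
    for i, sent in enumerate(sentences):
        toks = [w for w in re.findall(r"[a-z']+", sent.lower())
                if w not in STOPWORDS]
        base = sum(freq[w] / top for w in toks) / max(1, len(toks))
        position_bonus = 0.1 if i == 0 else 0.0  # lead sentences matter in news
        scores.append(base + position_bonus)
    threshold = ratio * max(scores)
    return [s for s, sc in zip(sentences, scores) if sc >= threshold]

text = ("Summarization shortens text. "
        "Summarization of text saves time. "
        "Cats sleep.")
summary = extractive_summary(text)
print(summary)
```

The off-topic sentence scores below the 60% threshold and is dropped, while the two sentences built from frequent content words survive.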
48. A Comprehensive Survey of Deep Learning Models for Legal Document Summarization.
- Author
-
Reddy, Kancharla Bharath and Jayabharathy, J.
- Subjects
DEEP learning ,TEXT summarization ,LEGAL documents ,EVIDENCE gaps ,LEGAL language ,RESEARCH personnel - Abstract
Legal document Summarization poses a difficult challenge because of the complexity and diversity of legal language and concepts. In recent years, deep learning models have shown promising results in summarizing legal documents. The article examines the challenges of legal document summarization, reviews the state-of-the-art deep learning models, and identifies research gaps in this area. Potential future research directions and applications of legal document summarization using deep learning models are also discussed. The paper is a valuable resource for researchers and practitioners interested in legal document summarization using deep learning models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
49. Big Data Text Summarization - Hurricane Harvey
- Author
-
Geissinger, Jack, Long, Theo, Jung, James, Parent, Jordan, and Rizzo, Robert
- Subjects
abstractive summarization ,text summarization ,deep learning ,topic summarization ,neural networks ,NLP ,computational linguistics ,big data text summarization ,pointer-generator network ,big data ,template filling ,multi-document summarization ,Hurricane Harvey ,hurricanes ,TextRank ,information extraction ,natural language processing ,extractive summarization ,event summarization - Abstract
Natural language processing (NLP) has advanced in recent years. Accordingly, we present progressively more complex generated text summaries on the topic Hurricane Harvey. We utilized TextRank, which is an unsupervised extractive summarization algorithm. TextRank is computationally expensive, and the sentences generated by the algorithm aren’t always directly related or essential to the topic at hand. When evaluating TextRank, we found that a single sentence interjected and ruined the flow of the summary. We also found that ROUGE evaluation for our TextRank summary was quite low compared to a golden standard that was prepared for us. However, the TextRank summary had high marks for ROUGE evaluation compared to the Wikipedia article lead for Hurricane Harvey. To improve upon the TextRank algorithm, we utilized template summarization with named entities. Template summarization takes less time to run than TextRank but is supervised by the author of the template and script to choose valuable named entities. Thus, it is highly dependent on human intervention to produce reasonable and readable summaries that aren’t error-prone. As expected, the template summary evaluated well compared to the Gold Standard and the Wikipedia article lead. This result is mainly due to our ability to include named entities we thought were pertinent to the summary. Beyond extractive summaries like TextRank and template summarization, we pursued abstractive summarization using pointer-generator networks and multi-document summarization with pointer-generator networks and maximal marginal relevance. The benefit of using abstractive summarization is that it is more in-line with how humans summarize documents. Pointer-generator networks, however, require GPUs to run properly and a large amount of training data. Luckily, we were able to use a pre-trained network to generate summaries. 
The pointer-generator network is the centerpiece of our abstractive methods and allowed us to create summaries in the first place. NLP is at an inflection point due to deep learning, and our generated summaries using a state-of-the-art pointer-generator neural network are filled with details about Hurricane Harvey, including damage incurred, the average amount of rainfall, and the locations it affected the most. The summary is also free of grammatical errors. We also use a novel Python library, written by Logan Lebanoff at the University of Central Florida, for multi-document summarization using deep learning to summarize our Hurricane Harvey dataset of 500 articles and the Wikipedia article for Hurricane Harvey. The summary of the Wikipedia article is our final summary and has the highest ROUGE scores that we could attain. NSF: IIS-1619028
- BDTS_Hurricane_Harvey_final_report.docx: Editable version of the final report
- BDTS_Hurricane_Harvey_final_report.pdf: PDF version of the final report
- BDTS_Hurricane_Harvey_presentation.pptx: Editable version of the presentation slides
- BDTS_Hurricane_Harvey_presentation.pdf: PDF version of the presentation slides
Source files in zip:
- freq_words.py: Finds the most frequent words in a JSON file that contains a sentences field. Requires a file to be passed through the -f option.
- pos_tagging.py: Performs basic part-of-speech tagging on a JSON file that contains a sentences field. Requires a file to be passed through the -f option.
- textrank_summarizer.py: Performs TextRank summarization with a JSON file that contains a sentences field. Requires a file to be passed through the -f option.
- template_summarizer.py: Performs template summarization with a JSON file that contains a sentences field. Requires a file to be passed through the -f option.
- wikipedia_content.py: Extracts content from a Wikipedia page given a topic and formats the information for the pointer-generator network using the "make_datafiles.py" script. Requires a topic to be given in the -t option and an output directory for "make_datafiles.py" to read from with the -o option.
- make_datafiles.py: Called by "wikipedia_content.py" to convert story files to .bin files.
- jusText.py: Used to clean up the large dataset.
- requirements.txt: Used with Anaconda for installing all of the dependencies.
- small_dataset.json: Properly formatted JSON file for use with other files.
- Published
- 2018
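TextRank, the unsupervised extractive algorithm used in the report above, scores sentences by running PageRank-style power iteration on a sentence-similarity graph. A minimal sketch using raw word overlap as the edge weight (the original algorithm normalizes similarity by sentence length, which is omitted here):

```python
def textrank(sentences, damping=0.85, iters=50):
    """Minimal TextRank: build a sentence graph weighted by word overlap,
    then run PageRank-style power iteration over it."""
    sets = [set(s.lower().split()) for s in sentences]
    n = len(sentences)
    # Edge weight = number of shared surface tokens (no self-loops)
    sim = [[len(sets[i] & sets[j]) if i != j else 0 for j in range(n)]
           for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            rank = sum(sim[j][i] / max(1, sum(sim[j])) * scores[j]
                       for j in range(n) if sim[j][i])
            new.append((1 - damping) / n + damping * rank)
        scores = new
    return scores

sents = [
    "Hurricane Harvey caused catastrophic flooding in Houston.",
    "Flooding in Houston displaced thousands of residents.",
    "The hurricane caused flooding across Texas.",
    "I had cereal for breakfast.",
]
scores = textrank(sents)
best = sents[max(range(len(sents)), key=scores.__getitem__)]
print(best)
```

The off-topic sentence shares no words with the others, receives no incoming rank, and settles at the minimum score, which mirrors the report's observation that unrelated sentences can still slip into a TextRank summary only when they share vocabulary with the rest.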
50. Hierarchical Sliding Inference Generator for Question-driven Abstractive Answer Summarization.
- Author
-
BING LI, PENG YANG, HANLIN ZHAO, PENGHUI ZHANG, and ZIJIAN LIU
- Subjects
- *
TEXT summarization , *NATURAL languages - Abstract
Text summarization for non-factoid question answering (NQA) aims at identifying the core information in redundant answers, guided by the questions, which can dramatically improve answer readability and comprehensibility. Most existing approaches focus on extracting query-related sentences to construct a summary, often neglecting the logical connections of natural language and the hierarchical, interpretable semantic associations, thus degrading performance. To address these issues, we propose a novel question-driven abstractive answer summarization model, called the Hierarchical Sliding Inference Generator (HSIG), which forms inferable and interpretable summaries by explicitly introducing hierarchical information reasoning between questions and their corresponding answers. Specifically, we first apply an elaborately designed hierarchical sliding fusion inference model to determine the most relevant sentence-level question representation, which provides a deeper interpretable basis for sentence selection in summarization and further increases computational performance while following the semantic inheritance structure. Additionally, to improve summary fluency, we construct a double-driven selective generator that integrates semantic information from the two mutual question-and-answer perspectives. Experimental results show that, compared with state-of-the-art baselines, our model achieves remarkable improvements on two benchmark datasets, in particular improving ROUGE-1 by 2.46 points on PubMedQA, which demonstrates the superiority of our model for abstractive summarization with hierarchical sequential reasoning. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF