81 results for "TEXT summarization"
Search Results
2. Automatic text summarization using transformer-based language models.
- Author
-
Rao, Ritika, Sharma, Sourabh, and Malik, Nitin
- Abstract
Automatic text summarization is a lucrative field in natural language processing (NLP). The amount of data flow has multiplied with the switch to digital. The massive datasets hold a wealth of knowledge, and this information must be extracted to be useful. This article focuses on creating an automated text summarization system that accepts text as input and outputs a summary using a cutting-edge machine learning model. Advancements in NLP led to the introduction of transformers in the field, and their outstanding performance drew considerable attention. Two transformer-based language models, namely Bidirectional and Auto-Regressive Transformers (BART) and Text-To-Text Transfer Transformer (T5), were implemented on the CNN_dailymail dataset. BART outperforms T5 by 3.02% in ROUGE-1 score. The model provides better performance than the other models introduced in the existing literature for the same task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
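The setup in result 2, comparing BART and T5 on CNN/DailyMail, can be approximated with off-the-shelf Hugging Face checkpoints. A minimal sketch, not the authors' exact configuration: `facebook/bart-large-cnn` is the BART checkpoint fine-tuned on CNN/DailyMail, while `t5-small` stands in for whichever T5 variant the paper used.

```python
from transformers import pipeline

ARTICLE = (
    "The city council approved a new transit plan on Tuesday. The plan adds "
    "three bus lines and extends subway hours on weekends. Officials said "
    "construction will begin next spring and is funded by a state grant."
)

# Run both summarizers on the same article and compare the outputs.
for name in ["facebook/bart-large-cnn", "t5-small"]:
    summarizer = pipeline("summarization", model=name)
    out = summarizer(ARTICLE, max_length=60, min_length=15, do_sample=False)
    print(f"{name}: {out[0]['summary_text']}")
```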
3. Chatgpt in and for second language acquisition: A call for systematic research.
- Author
-
Han, ZhaoHong
- Subjects
CHATGPT ,GENERATIVE pre-trained transformers ,LANGUAGE models ,SECOND language acquisition ,COMPUTER assisted language instruction ,TEXT summarization - Abstract
The article discusses the use of Artificial Intelligence (A.I.) tools, specifically chatbots, in second language acquisition. It highlights the potential benefits and concerns associated with the use of A.I. in language learning. The article calls for systematic research to explore the capabilities of A.I. language models, such as ChatGPT, and their impact on language learning. It suggests three strands of research: examining the language model itself, investigating learner interactions with the model, and exploring the effects of the model on learners. The article emphasizes the need to connect this research with existing studies in Instructed Second Language Acquisition (ISLA) to enhance our understanding of language learning conditions. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
4. Contrastive Learning Penalized Cross-Entropy with Diversity Contrastive Search Decoding for Diagnostic Report Generation of Reduced Token Repetition.
- Author
-
Zhang, Taozheng, Meng, Jiajian, Yang, Yuseng, and Yu, Shaode
- Subjects
LANGUAGE models ,GENERATIVE pre-trained transformers ,TEXT summarization ,COMPUTER-assisted image analysis (Medicine) ,ARTIFICIAL intelligence ,NATURAL language processing - Abstract
Medical imaging description and disease diagnosis are vitally important yet time-consuming. Automated diagnosis report generation (DRG) from medical imaging descriptions can reduce clinicians' workload and improve their routine efficiency. To address this natural language generation task, fine-tuning a pre-trained large language model (LLM) is cost-effective and indispensable, and its success has been witnessed in many downstream applications. However, semantic inconsistency of sentence embeddings has been widely observed in the form of undesirable repetitions or unnaturalness in text generation. To address the underlying issue of the anisotropic distribution of token representations, in this study, a contrastive learning penalized cross-entropy (CLpCE) objective function is implemented to enhance the semantic consistency and accuracy of token representations by guiding the fine-tuning procedure towards a specific task. Furthermore, to improve the diversity of token generation in text summarization and to prevent sampling from the unreliable tail of token distributions, a diversity contrastive search (DCS) decoding method is designed that restricts report generation to a probable candidate set while maintaining semantic coherence. In addition, a novel metric named the maximum of token repetition ratio (maxTRR) is proposed to estimate token diversity and to help determine the candidate output. Based on a Chinese version of the Generative Pre-trained Transformer 2 (GPT-2) LLM, the proposed CLpCE with DCS (CLpCEwDCS) decoding framework is validated on 30,000 desensitized text samples from the "Medical Imaging Diagnosis Report Generation" track of the 2023 Global Artificial Intelligence Technology Innovation Competition. Using four kinds of metrics covering n-gram word matching, semantic relevance, and content similarity, as well as the maxTRR metric, extensive experiments reveal that the proposed framework effectively maintains semantic coherence and accuracy (BLEU-1, 0.4937; BLEU-2, 0.4107; BLEU-3, 0.3461; BLEU-4, 0.2933; METEOR, 0.2612; ROUGE, 0.5182; CIDEr, 1.4339) and improves text generation diversity and naturalness (maxTRR, 0.12). The phenomenon of dull or repetitive text generation is common when fine-tuning pre-trained LLMs for natural language processing applications. This study may shed some light on relieving this issue by developing comprehensive strategies to enhance the semantic coherence, accuracy, and diversity of sentence embeddings. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
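The maxTRR metric in result 4 is defined in the paper itself; one plausible reading, sketched below, is the frequency share of the single most repeated token in a generated report. The function name and definition here are assumptions, not the authors' code.

```python
from collections import Counter

def max_token_repetition_ratio(tokens):
    """Share of the sequence taken up by its single most repeated token."""
    if not tokens:
        return 0.0
    return max(Counter(tokens).values()) / len(tokens)

generated = "the scan shows the lesion near the left lobe".split()
print(max_token_repetition_ratio(generated))  # "the" appears 3 of 9 times -> 0.333...
```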
5. QUICK REVIEW OF PEDAGOGICAL EXPERIENCES USING GPT-3 IN EDUCATION.
- Author
-
Manuel Prieto-Andreu, Joel and Labisa-Palmeira, Antonio
- Subjects
GENERATIVE pre-trained transformers ,LANGUAGE models ,TEXT summarization ,CHATBOTS ,COMPUTER science ,ARTIFICIAL intelligence - Abstract
GPT-3 is a neural language model that performs tasks such as classification, question answering, and text summarization. Although chatbots like BlenderBot-3 work well in a conversational sense, and GPT-3 can assist experts in evaluating questions, they are quantifiably worse than real teachers in several pedagogical dimensions. We present the first systematic literature review that analyzes the main contributions and uses of GPT-3 in the field of education. The protocols suggested in the PRISMA 2020 statement were followed for the drafting of the review. According to the results, 34 significant publications were identified through a systematic search in ISI Web of Science, SCOPUS, and Google Scholar. GPT-3 has been considered in the academic, ethical, and medical fields, in the humanities and in computer science, in the formulation of questions and answers, and through cooperative educational dialogues. GPT-3 has been proven to have valuable applications in education, such as the automation of routine tasks, making quick diagnoses of students' weaknesses, and the automatic generation of questions, but it still faces challenges and limitations that require additional investigation. We discuss the educational possibilities and the limitations of the use of GPT-3. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. From Text to Action: NLP Techniques for Washing Machine Manual Processing.
- Author
-
Biju, Vinai George, Babu, Bibin, Asghar, Ali, Prathap, Boppuru Rudra, and Reddy, Vandana
- Subjects
WASHING machines ,TEXT summarization ,LANGUAGE models ,QUESTION answering systems ,NATURAL language processing ,TEXT mining - Abstract
This scientific research study focuses on the advancements in Natural Language Processing (NLP) driven by large-scale parallel corpora and presents a comprehensive methodology for creating a parallel, multilingual corpus using NLP techniques and semantic technologies, with a particular focus on washing machine manuals. The study highlights the significant progress made in NLP through the utilization of large-scale parallel corpora and advanced NLP techniques. The successful creation of a parallel, multilingual corpus for washing machine manuals, coupled with the integration of semantic technologies and ontology modeling, demonstrates the broad applicability and potential of NLP in diverse domains. The research covers various aspects, including text extraction, segmentation, and the development of specialized pipelines for question-answering, translation, and text summarization tailored for washing machine manuals. Translation experiments using fine-tuned models demonstrated the feasibility of providing washing machine manuals in local languages, expanding accessibility and understanding for users worldwide. Additionally, the study explored text summarization using a powerful transformer-based model, which exhibited remarkable proficiency in generating concise and coherent summaries from complex input texts. The implementation of a question-answering pipeline showcased the effectiveness of various language models in handling question-answering tasks with high accuracy and effectiveness. Additionally, the article discusses the processes of data collection, information preparation, ontology creation, alignment strategies, and text analytics. Furthermore, the study addresses the challenges and potential future developments in this field, offering insights into the promising applications of NLP in the context of washing machine manuals. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
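The question-answering pipeline described in result 6 can be prototyped in a few lines with an off-the-shelf extractive-QA checkpoint. The model choice and manual excerpt below are illustrative assumptions; the paper does not publish its exact stack.

```python
from transformers import pipeline

# A common extractive-QA checkpoint stands in for the paper's models.
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

manual_excerpt = (
    "To start a quick wash, turn the program dial to Quick 30, "
    "add detergent to compartment II, and press the Start button."
)
answer = qa(question="How do I start a quick wash?", context=manual_excerpt)
print(answer["answer"], round(answer["score"], 3))
```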
7. Evaluating the Impact of Text Data Augmentation on Text Classification Tasks using DistilBERT.
- Author
-
Nair, Aarathi Rajagopalan, Singh, Rimjhim Padam, Gupta, Deepa, and Kumar, Priyanka
- Subjects
DATA augmentation ,LANGUAGE models ,TEXT summarization ,TRANSLATING & interpreting ,SENTIMENT analysis ,NATURAL language processing ,MACHINE translating - Abstract
Data augmentation entails artificially expanding a dataset's size by applying various transformations to the existing raw data. Enhancing the quality and quantity of datasets of varying sizes by employing varied data augmentation techniques is immensely important in the field of Natural Language Processing. Several notable applications, for instance text classification, sentiment analysis, and text summarization, have proven to benefit immensely from text augmentation techniques. Hence, the paper focuses on efficient text classification using datasets of different sizes: small (500 instances), medium (5564 instances), and large (43934 instances). The work considers the standard DistilBERT model, a popular transformer-based language model, and presents the impact on the model's performance after employing different text augmentation techniques. The study specifically focuses on three augmentation methods: (a) synonym augmentation, which involves replacing words with their synonyms to enhance vocabulary diversity and generalization; (b) contextual word embeddings, which enrich semantic understanding by leveraging pre-trained language models; and (c) back translation, which entails translating the text into another language and then translating it back, introducing variations in the data and capturing different linguistic patterns. Additionally, the work discusses the combined effect of employing all three augmentation techniques simultaneously. Moreover, the study compares the relationship between dataset sizes and the performance of the augmentation techniques. The study considers three standard datasets for the analysis and presents a comprehensive analysis using accuracy and F1 score as evaluation metrics. The results highlight the efficacy of each technique across small, medium, and large datasets, enabling a nuanced understanding of their benefits in different data scenarios. The findings indicate the varying degrees of improvement achieved through each augmentation technique. The enhancement achieved by applying text augmentation varied from around 2% on large datasets to 20% on smaller datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
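Of the three augmentation methods in result 7, synonym augmentation is the simplest to sketch. Below is a minimal WordNet-based version, assuming NLTK as the synonym source; the paper does not specify its implementation.

```python
import random
import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)

def synonym_augment(sentence, p=0.3):
    """Replace each word with a WordNet synonym with probability p."""
    out = []
    for word in sentence.split():
        synsets = wordnet.synsets(word)
        lemmas = {l.name().replace("_", " ") for s in synsets for l in s.lemmas()}
        lemmas.discard(word)
        if lemmas and random.random() < p:
            out.append(random.choice(sorted(lemmas)))
        else:
            out.append(word)
    return " ".join(out)

print(synonym_augment("The quick delivery made the customer happy"))
```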
8. Summary-Sentence Level Hierarchical Supervision for Re-Ranking Model of Two-Stage Abstractive Summarization Framework.
- Author
-
Yoo, Eunseok, Kim, Gyunyeop, and Kang, Sangwoo
- Subjects
- *
TEXT summarization , *LANGUAGE models , *AUTOMATIC summarization , *NATURAL language processing , *STOCHASTIC programming , *PREDICATE calculus - Abstract
Fine-tuning a pre-trained sequence-to-sequence-based language model has significantly advanced the field of abstractive summarization. However, the early models of abstractive summarization were limited by the gap between training and inference, and they did not fully utilize the potential of the language model. Recent studies have introduced a two-stage framework that allows the second-stage model to re-rank the candidate summary generated by the first-stage model, to resolve these limitations. In this study, we point out that the supervision method performed in the existing re-ranking model of the two-stage abstractive summarization framework cannot learn detailed and complex information of the data. In addition, we present the problem of positional bias in the existing encoder–decoder-based re-ranking model. To address these two limitations, this study proposes a hierarchical supervision method that jointly performs summary and sentence-level supervision. For sentence-level supervision, we designed two sentence-level loss functions: intra- and inter-intra-sentence ranking losses. Compared to the existing abstractive summarization model, the proposed method exhibited a performance improvement for both the CNN/DM and XSum datasets. The proposed model outperformed the baseline model under a few-shot setting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
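The summary-level half of the hierarchical supervision in result 8 belongs to the family of candidate-ranking margin losses used by two-stage re-rankers. A hedged PyTorch sketch of that family follows; it is not the paper's exact intra- or inter-sentence loss.

```python
import torch

def ranking_loss(scores, margin=0.01):
    """scores: (num_candidates,) model scores, ordered best-to-worst by ROUGE."""
    loss = scores.new_zeros(())
    n = scores.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            # Candidate i is better than j: enforce scores[i] >= scores[j] + gap.
            gap = margin * (j - i)
            loss = loss + torch.clamp(gap - (scores[i] - scores[j]), min=0)
    return loss

# Mis-ordered scores (worst candidate scored highest) incur a positive loss.
print(ranking_loss(torch.tensor([0.2, 0.5, 0.9])))
```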
9. A semantically enhanced text retrieval framework with abstractive summarization.
- Author
-
Pan, Min, Li, Teng, Liu, Yu, Pei, Quanli, Huang, Ellen Anne, and Huang, Jimmy X.
- Subjects
- *
LANGUAGE models , *TEXT summarization , *PROBABILISTIC generative models - Abstract
Recently, large pretrained language models (PLMs) have led a revolution in the information retrieval community. In most PLM-based retrieval frameworks, the ranking performance broadly depends on the model structure and the semantic complexity of the input text. Sequence-to-sequence generative models for question answering or text generation have proven to be competitive, so we wonder whether these models can improve ranking effectiveness by enhancing input semantics. This article introduces SE-BERT, a semantically enhanced bidirectional encoder representations from transformers (BERT)-based ranking framework that captures more semantic information by modifying the input text. SE-BERT utilizes a pretrained generative language model to summarize both sides of the candidate passage and concatenate them into a new input sequence, allowing BERT to acquire more semantic information within the constraints of the input sequence's length. Experimental results on two Text Retrieval Conference datasets demonstrate that our approach's effectiveness increases as the length of the input text increases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Dilated convolution for enhanced extractive summarization: A GAN-based approach with BERT word embedding.
- Author
-
Wu, Huimin
- Subjects
- *
AUTOMATIC summarization , *LANGUAGE models , *NATURAL language processing , *TEXT summarization , *GENERATIVE adversarial networks - Abstract
Text summarization (TS) plays a crucial role in natural language processing (NLP) by automatically condensing and capturing key information from text documents. Its significance extends to diverse fields, including engineering, healthcare, and others, where it offers substantial time and resource savings. However, manual summarization is a laborious task, prompting the need for automated text summarization systems. In this paper, we propose a novel strategy for extractive summarization that leverages a generative adversarial network (GAN)-based method and Bidirectional Encoder Representations from Transformers (BERT) word embedding. BERT, a transformer-based architecture, processes sentences bidirectionally, considering both preceding and following words. This contextual understanding empowers BERT to generate word representations that carry a deeper meaning and accurately reflect their usage within specific contexts. Our method adopts a generator and discriminator within the GAN framework. The generator assesses the likelihood of each sentence in the summary, while the discriminator evaluates the generated summary. To extract meaningful features in parallel, we introduce three dilated convolution layers in the generator and discriminator. Dilated convolution allows for capturing a larger context and incorporating long-range dependencies. By introducing gaps between filter weights, dilated convolution expands the receptive field, enabling the model to consider a broader context of words. To encourage the generator to explore diverse sentence combinations that lead to high-quality summaries, we introduce various forms of noise to each document within our proposed GAN. This approach allows the generator to learn from a range of sentence permutations and select the most suitable ones. We evaluate the performance of our proposed model using the CNN/Daily Mail dataset. The results, measured using the ROUGE metric, demonstrate the superiority of our approach compared to other tested methods. This confirms the effectiveness of our GAN-based strategy, which integrates dilated convolution layers, BERT word embedding, and a generator-discriminator framework in achieving enhanced extractive summarization performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
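The dilated convolutions in result 10 are straightforward to reproduce: stacking dilations 1, 2, and 4 widens the receptive field over BERT token embeddings while keeping the sequence length fixed. A sketch with illustrative dimensions:

```python
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    def __init__(self, dim=768):  # 768 = BERT-base hidden size
        super().__init__()
        # padding == dilation keeps the output length equal to the input length.
        self.convs = nn.ModuleList([
            nn.Conv1d(dim, dim, kernel_size=3, padding=d, dilation=d)
            for d in (1, 2, 4)
        ])

    def forward(self, x):
        # x: (batch, seq_len, dim), e.g. BERT token embeddings
        x = x.transpose(1, 2)              # Conv1d expects (batch, dim, seq)
        for conv in self.convs:
            x = torch.relu(conv(x))
        return x.transpose(1, 2)

out = DilatedBlock()(torch.randn(2, 128, 768))
print(out.shape)  # torch.Size([2, 128, 768])
```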
11. Survey of transformers and towards ensemble learning using transformers for natural language processing.
- Author
-
Zhang, Hongzhi and Shafiq, M. Omair
- Subjects
NATURAL language processing ,LANGUAGE models ,TEXT summarization ,TRANSFORMER models ,QUESTION answering systems ,DEEP learning ,SENTIMENT analysis - Abstract
The transformer model is a famous natural language processing model proposed by Google in 2017. Now, with the extensive development of deep learning, many natural language processing tasks can be solved by deep learning methods. After the BERT model was proposed, many pre-trained models such as the XLNet model, the RoBERTa model, and the ALBERT model were also proposed in the research community. These models perform very well on various natural language processing tasks. In this paper, we describe and compare these well-known models. In addition, we also apply several existing and well-known models, namely the BERT model, the XLNet model, the RoBERTa model, the GPT2 model, and the ALBERT model, to different existing and well-known natural language processing tasks, and analyze each model based on its performance. There are few papers that comprehensively compare various transformer models. In our paper, we use six types of well-known tasks, namely sentiment analysis, question answering, text generation, text summarization, named entity recognition, and topic modeling, to compare the performance of various transformer models. In addition, using the existing models, we also propose ensemble learning models for the different natural language processing tasks. The results show that our ensemble learning models perform better than a single classifier on specific tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
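The ensemble over fine-tuned transformers in result 11 can be as simple as majority voting over per-model label predictions. The sketch below assumes hard voting, which is one of several schemes the paper could be using.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """predictions_per_model[m][i] = label from model m for example i."""
    ensembled = []
    for labels in zip(*predictions_per_model):  # group labels per example
        ensembled.append(Counter(labels).most_common(1)[0][0])
    return ensembled

bert   = ["pos", "neg", "pos"]
xlnet  = ["pos", "pos", "pos"]
albert = ["neg", "neg", "pos"]
print(majority_vote([bert, xlnet, albert]))  # ['pos', 'pos', 'pos'] per majority
```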
12. State-of-the-Art Future Internet Technology in Italy 2022–2023.
- Author
-
Cafaro, Massimo, Epicoco, Italo, and Pulimeno, Marco
- Subjects
INFORMATION technology ,INTERNET ,LANGUAGE models ,TEXT summarization ,ARTIFICIAL intelligence ,NATURAL language processing - Abstract
This document is a summary of a special issue of the journal Future Internet, focusing on the state of Future Internet Technology in Italy. The issue covers a range of topics, including anonymous service delivery, virtual environments for furnishings sales, Italian-language sequence-to-sequence models for text summarization, knowledge scaffolding in Wikipedia, the integration of Operation Technology and Information Technology networking in Industry 4.0, a digital platform for healthcare operators, self-admitted technical debt in blockchain initiatives, unified analytics for smart cities, the integration of natural language processing and artificial intelligence in e-learning, and a system for attack detection in network architectures. The authors provide detailed explanations and evaluations of each topic, highlighting their potential benefits and applications. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
13. Timely need for navigating the potential and downsides of LLMs in healthcare and biomedicine.
- Author
-
Ray, Partha Pratim
- Subjects
- *
LANGUAGE models , *MEDICAL personnel , *COACHING psychology , *MEDICAL terminology , *HEALTH information technology , *MEDICAL care , *TEXT summarization - Abstract
The article titled "Timely need for navigating the potential and downsides of LLMs in healthcare and biomedicine" provides a comprehensive exploration of the opportunities and challenges associated with large language models (LLMs) in the field of biomedicine and healthcare. The authors discuss the transformative potential of LLMs in areas such as biomedical information retrieval, question answering, medical text summarization, information extraction, and medical education. They also highlight the need for caution regarding data privacy, ethical considerations, and the mitigation of biases. The article presents a list of popular LLMs in the health domain and identifies limitations and new challenges, along with mitigation strategies. Additionally, it suggests various applications of LLMs in healthcare, including personalized treatment recommendations, predictive health analytics, and automated medical literature review. The authors emphasize the importance of proceeding with ethical principles, inclusivity, and collaboration in integrating LLMs into healthcare. While the article provides valuable insights, it could have further explored futuristic challenges and applications of LLMs in the biomedical domain. Overall, this work contributes to the ongoing discourse on the role of AI in healthcare and sets the stage for future research and development in this field. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
14. Exploring the Cognitive Neural Basis of Factuality in Abstractive Text Summarization Models: Interpretable Insights from EEG Signals.
- Author
-
Zhang, Zhejun, Zhu, Yingqi, Zheng, Yubo, Luo, Yingying, Shao, Hengyi, Guo, Shaoting, Dong, Liang, Zhang, Lin, and Li, Lei
- Subjects
TEXT summarization ,AUTOMATIC summarization ,ELECTROENCEPHALOGRAPHY ,LANGUAGE models ,NATURAL language processing ,COGNITIVE neuroscience ,INFORMATION overload ,NEUROSCIENCES - Abstract
(1) Background: Information overload challenges decision-making in the Industry 4.0 era. While Natural Language Processing (NLP), especially Automatic Text Summarization (ATS), offers solutions, issues with factual accuracy persist. This research bridges cognitive neuroscience and NLP, aiming to improve model interpretability. (2) Methods: This research examined four fact extraction techniques: dependency relation, named entity recognition, part-of-speech tagging, and TF-IDF, in order to explore their correlation with human EEG signals. Representational Similarity Analysis (RSA) was applied to gauge the relationship between language models and brain activity. (3) Results: Named entity recognition showed the highest sensitivity to EEG signals, marking the most significant differentiation between factual and non-factual words with a score of −0.99. The dependency relation followed with −0.90, while part-of-speech tagging and TF-IDF resulted in 0.07 and −0.52, respectively. Deep language models such as GloVe, BERT, and GPT-2 exhibited noticeable influences on RSA scores, highlighting the nuanced interplay between brain activity and these models. (4) Conclusions: Our findings emphasize the crucial role of named entity recognition and dependency relations in fact extraction and demonstrate the independent effects of different models and TOIs on RSA scores. These insights aim to refine algorithms to reflect human text processing better, thereby enhancing ATS models' factual integrity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
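Representational Similarity Analysis, the core method in result 14, reduces to correlating the pairwise-dissimilarity structure of model embeddings with that of EEG responses. A minimal sketch using Spearman correlation over condensed distance matrices; the distance metric and array shapes are illustrative choices, not the paper's exact pipeline.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_score(model_embeddings, eeg_patterns):
    """Both inputs: (n_items, n_features); feature dimensions may differ."""
    rdm_model = pdist(model_embeddings, metric="correlation")  # condensed RDM
    rdm_eeg = pdist(eeg_patterns, metric="correlation")
    rho, _ = spearmanr(rdm_model, rdm_eeg)
    return rho

rng = np.random.default_rng(0)
print(rsa_score(rng.normal(size=(20, 300)), rng.normal(size=(20, 64))))
```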
15. Review on Query-focused Multi-document Summarization (QMDS) with Comparative Analysis.
- Author
-
ROY, PRASENJEET and KUNDU, SUMAN
- Subjects
- *
TEXT summarization , *AUTOMATIC speech recognition , *LATENT semantic analysis , *DIFFERENTIAL evolution , *GRAPH algorithms , *LANGUAGE models , *DEEP learning , *MACHINE learning - Published
- 2024
- Full Text
- View/download PDF
16. Pre‐trained language models: What do they know?
- Author
-
Guimarães, Nuno, Campos, Ricardo, and Jorge, Alípio
- Subjects
- *
LANGUAGE models , *TEXT summarization , *MACHINE translating , *ARTIFICIAL intelligence , *NATURAL language processing , *VERBAL behavior - Abstract
Large language models (LLMs) have substantially pushed artificial intelligence (AI) research and applications in the last few years. They are currently able to achieve high effectiveness in different natural language processing (NLP) tasks, such as machine translation, named entity recognition, text classification, question answering, or text summarization. Recently, significant attention has been drawn to OpenAI's GPT models' capabilities and extremely accessible interface. LLMs are nowadays routinely used and studied for downstream tasks and specific applications with great success, pushing forward the state of the art in almost all of them. However, they also exhibit impressive inference capabilities when used off the shelf without further training. In this paper, we aim to study the behavior of pre-trained language models (PLMs) in some inference tasks they were not initially trained for. Therefore, we focus our attention on very recent research works related to the inference capabilities of PLMs in some selected tasks such as factual probing and common-sense reasoning. We highlight relevant achievements made by these models, as well as some of their current limitations that open opportunities for further research. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Technologies > Artificial Intelligence. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
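Factual probing of the kind surveyed in result 16 is typically run as cloze-style mask filling (LAMA-style): the model fills a blank in a factual statement and its top predictions are inspected. A minimal sketch with a stock BERT checkpoint:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# The model's top completions reveal what "knowledge" it encodes.
for pred in fill("The capital of Portugal is [MASK].", top_k=3):
    print(pred["token_str"], round(pred["score"], 3))
```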
17. Opportunities and challenges for ChatGPT and large language models in biomedicine and health.
- Author
-
Tian, Shubo, Jin, Qiao, Yeganova, Lana, Lai, Po-Ting, Zhu, Qingqing, Chen, Xiuying, Yang, Yifan, Chen, Qingyu, Kim, Won, Comeau, Donald C, Islamaj, Rezarta, Kapoor, Aadit, Gao, Xin, and Lu, Zhiyong
- Subjects
- *
LANGUAGE models , *CHATGPT , *TEXT summarization , *GENERATIVE artificial intelligence , *DATA privacy , *DATA mining - Abstract
ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically, we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction and medical education and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of biomedical domain presents unique challenges. Following an extensive literature survey, we find that significant advances have been made in the field of text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in the fields of biomedicine and health entails various risks and challenges, including fabricated information in its generated responses, as well as legal and privacy concerns associated with sensitive patient data. We believe this survey can provide a comprehensive and timely overview to biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs for transforming biomedicine and health. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Representation transfer and data cleaning in multi-views for text simplification.
- Author
-
He, Wei, Farrahi, Katayoun, Chen, Bin, Peng, Bohua, and Villavicencio, Aline
- Subjects
- *
DATA scrubbing , *TEXT summarization , *NATURAL language processing , *LANGUAGE models , *TEXT messages - Abstract
Representation transfer is a widely used technique in natural language processing. We propose methods of cleaning WikiLarge, the dominant dataset for text simplification (TS), from multiple views to remove errors that impact model training and fine-tuning. The results show that our method can effectively refine the dataset. We propose to take the pre-trained text representations from a similar task (e.g., text summarization) to text simplification and conduct a continued fine-tuning strategy to improve the performance of pre-trained models on TS. This approach speeds up training and makes model convergence easier. In addition, we propose a new decoding strategy for simple text generation. It is able to generate simpler and more comprehensible text with controllable lexical simplicity. The experimental results show that our method can achieve good performance on many evaluation metrics. • Using sentence representations to refine the dataset. • Fine-tuning by taking text representations from similar tasks. • A new decoding strategy for simple text generation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. NLP TRANSFORMERS: ANALYSIS OF LLMS AND TRADITIONAL APPROACHES FOR ENHANCED TEXT SUMMARIZATION.
- Author
-
ISIKDEMIR, Yunus Emre
- Subjects
TEXT summarization ,DEEP learning ,INFORMATION retrieval ,NATURAL language processing ,LANGUAGE models - Abstract
Copyright of Journal of Engineering & Architectural Faculty of Eskisehir Osmangazi University / Eskişehir Osmangazi Üniversitesi Mühendislik ve Mimarlık Fakültesi Dergisi is the property of Eskisehir Osmangazi University and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
20. Grosse Sprachmodelle [Large Language Models].
- Author
-
Handschuh, Siegfried
- Subjects
LANGUAGE models ,GENERATIVE artificial intelligence ,TEXT summarization ,ACCOUNTING ethics ,GENERATIVE pre-trained transformers - Abstract
Copyright of Informationswissenschaft: Theorie, Methode & Praxis / Sciences de l’information: Théorie, Méthode & Pratique is the property of University Library of Bern and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
21. Two-Stream Network for Korean Natural Language Understanding.
- Author
-
Hwang Kim, Jihyeon Lee, and Ho-Young Kwak
- Subjects
KOREAN language ,NATURAL languages ,TEXT summarization ,LANGUAGE models ,SENTIMENT analysis - Abstract
This study pioneers a dual-stream network architecture tailored for Korean Natural Language Understanding (NLU), focusing on enhancing comprehension by distinct processing of syntactic and semantic aspects. The hypothesis is that this bifurcation can lead to a more nuanced and accurate understanding of the Korean language, which often presents unique syntactic and semantic challenges not fully addressed by generalized models. The validation of this novel architecture employs the Korean Natural Language Inference (koNLI) and Korean Semantic Textual Similarity (koSTS) datasets. By evaluating the model's performance on these datasets, the study aims to determine its efficacy in accurately parsing and interpreting Korean text's syntactic structure and semantic meaning. Preliminary results from this research are promising. They indicate that the dual-stream approach significantly enhances the model's capability to understand and interpret complex Korean sentences. This improvement is crucial in NLU, especially for language-specific applications. The implications of this study are far-reaching. The methodology and findings could pave the way for more sophisticated NLU applications tailored to the Korean language, such as advanced sentiment analysis, nuanced text summarization, and more effective conversational AI systems. Moreover, this research contributes significantly to the broader field of NLU by underscoring the importance and efficacy of developing language-specific models, moving beyond the one-size-fits-all approach of general language models. Thus, this study is a testament to the potential of specialized approaches in language understanding technologies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. Comprehensive Analysis of Text Summarization Techniques for Legal Documents.
- Author
-
Deepika, Ryakala, Das, Surajit, Dey, Niladri Sekhar, Panuganti, Jeshwanth, and Hussain, Mohammed Raashed
- Subjects
TEXT summarization ,LANGUAGE models ,LEGAL documents ,AUTOMATIC summarization ,LEGAL professions - Abstract
The surge of digital legal documents has significantly expanded their usage. The sheer number of papers handled by various members of the judiciary, as well as by advocates and judicial officers, can be incredibly challenging to keep up with. Over four crore cases are still pending in Indian courts, and manually reviewing them can be a tedious and time-consuming process. As machine learning has advanced, various text summarization models have been created to help legal professionals manage their documents. Due to the lack of publicly accessible datasets, it is difficult to fine-tune domain-independent models for Indian legal systems. The methodology proposed in this paper seeks to improve the overall performance of these models, and it also explores summarization techniques for Indian legal documents. In addition, this research provides an in-depth study of several summarization methods that have been applied to Indian legal documents, including PEGASUS, Bidirectional and Auto-Regressive Transformers (BART), TextRank, and Bidirectional Encoder Representations from Transformers (BERT). Through extractive and abstractive summarization, BART and PEGASUS are able to gain a deeper understanding of the text normalization process. The outcomes of the text normalization process are evaluated by experts using ROUGE metrics and multiple parameters. It shows that the proposed approach can work well on legal texts within domain-independent frameworks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
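The ROUGE evaluation step in result 22 can be reproduced with the open-source `rouge-score` package; the legal-text snippets below are invented placeholders, not data from the paper.

```python
from rouge_score import rouge_scorer

reference = "The court dismissed the appeal and upheld the original verdict."
candidate = "The appeal was dismissed and the verdict upheld."

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
for name, score in scorer.score(reference, candidate).items():
    print(name, f"F1={score.fmeasure:.3f}")
```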
23. Oversea Cross-Lingual Summarization Service in Multilanguage Pre-Trained Model through Knowledge Distillation.
- Author
-
Yang, Xiwei, Yun, Jing, Zheng, Bofei, Liu, Limin, and Ban, Qi
- Subjects
LANGUAGE models ,TEXT summarization ,VECTOR spaces ,WORD order (Grammar) - Abstract
Cross-lingual text summarization is a highly desired service for overseas report-editing tasks and is formulated as a distributed application to facilitate the cooperation of editors. A multilanguage pre-trained language model (MPLM) can generate high-quality cross-lingual text summaries with simple fine-tuning. However, the MPLM does not adapt to complex variations across languages, such as word order and tense. When the model is applied to languages with distinct syntactic structures and vocabulary morphologies, the quality of the cross-lingual summary suffers. The matter worsens when the cross-lingual summarization datasets are low-resource. To address these issues, we use a knowledge distillation framework for the cross-lingual summarization task. By learning from the monolingual teacher model, the cross-lingual student model can effectively capture the differences between languages. Since the teacher and student models generate summaries in two languages, their representations lie in different vector spaces. To construct representation relationships across languages, we further propose a similarity metric, based on bidirectional semantic alignment, that maps different language representations into the same space. To further improve the quality of cross-lingual summaries, we use contrastive learning to make the student model focus on the differences among languages. Contrastive learning enhances the ability of the similarity metric to perform bidirectional semantic alignment. Our experiments show that our approach is competitive in low-resource scenarios on cross-lingual summarization datasets for pairs of distant languages. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
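The teacher-student setup in result 23 rests on a standard distillation objective: pulling the student's output distribution toward the teacher's with temperature-scaled KL divergence. A hedged sketch of that generic loss, not the paper's exact formulation (which adds bidirectional semantic alignment and contrastive terms):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Temperature-scaled KL between student and teacher token distributions."""
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    # T*T rescales gradients to match the hard-label loss magnitude.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

s = torch.randn(4, 32000)  # (batch, vocab) student logits
t = torch.randn(4, 32000)  # teacher logits for the same positions
print(distillation_loss(s, t))
```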
24. Survey of Hallucination in Natural Language Generation.
- Author
-
ZIWEI JI, NAYEON LEE, FRIESKE, RITA, TIEZHENG YU, DAN SU, YAN XU, ETSUKO ISHII, YE JIN BANG, MADOTTO, ANDREA, and FUNG, PASCALE
- Subjects
- *
DEEP learning , *TEXT summarization , *LANGUAGE models , *NATURAL languages , *HALLUCINATIONS , *MACHINE translating - Abstract
Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies such as Transformer-based language models. This advancement has led to more fluent and coherent NLG, leading to improved development in downstream tasks such as abstractive summarization, dialogue generation, and data-to-text generation. However, it is also apparent that deep learning based generation is prone to hallucinate unintended text, which degrades the system performance and fails to meet user expectations in many real-world scenarios. To address this issue, many studies have been presented in measuring and mitigating hallucinated texts, but these have never been reviewed in a comprehensive manner before. In this survey, we thus provide a broad overview of the research progress and challenges in the hallucination problem in NLG. The survey is organized into two parts: (1) a general overview of metrics, mitigation methods, and future directions, and (2) an overview of task-specific research progress on hallucinations in the following downstream tasks, namely abstractive summarization, dialogue generation, generative question answering, data-to-text generation, and machine translation. This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
25. Multi-granularity adaptive extractive document summarization with heterogeneous graph neural networks.
- Author
-
Wu Su, Jin Jiang, and Kaihui Huang
- Subjects
AUTOMATIC summarization ,TEXT summarization ,LANGUAGE models ,DISTRIBUTION (Probability theory) ,THESIS statements (Rhetoric) - Abstract
The crucial aspect of extractive document summarization lies in understanding the interrelations between sentences. Documents inherently comprise a multitude of sentences, and sentence-level models frequently fail to consider the relationships between distantly-placed sentences, resulting in the omission of significant information in the summary. Moreover, information within documents tends to be distributed sparsely, challenging the efficacy of sentence-level models. In the realm of heterogeneous graph neural networks, it has been observed that semantic nodes with varying levels of granularity encapsulate distinct semantic connections. Initially, the incorporation of edge features into the computation of dynamic graph attention networks is performed to account for node relationships. Subsequently, given the multiplicity of topics in a document or a set of documents, a topic model is employed to extract topic-specific features and the probability distribution linking these topics with sentence nodes. Last but not least, the model defines nodes with different levels of granularity--ranging from documents and topics to sentences--and these various nodes necessitate different propagation widths and depths for capturing intricate relationships in the information being disseminated. Adaptive measures are taken to learn the importance and correlation between nodes of different granularities in terms of both width and depth. Experimental evidence from two benchmark datasets highlights the superior performance of the proposed model, as assessed by ROUGE metrics, in comparison to existing approaches, even in the absence of pre-trained language models. Additionally, an ablation study confirms the positive impact of each individual module on the model's ROUGE scores. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
26. A parallel optimization and transfer learning approach for summarization in electrical power systems.
- Author
-
Priya, V., Praveena, V., and Sujithra, L. R.
- Subjects
ELECTRIC power ,LANGUAGE models ,TEXT summarization ,PARALLEL processing ,ELECTRONIC data processing ,NATURAL language processing - Abstract
Transfer learning approaches in natural language processing have been explored and evolved as a potential solution for many problems in recent years. Current research on aspect-based summarization shows unsatisfactory accuracy and low-quality generated summaries. Additionally, the potential advantages of combining language models with parallel processing have not been explored in the existing literature. This paper aims to address the problem of aspect-based extractive text summarization using a transfer learning approach and an optimization method based on MapReduce. The proposed approach utilizes transfer learning with language models to extract significant aspects from the text. Subsequently, an optimization process using MapReduce is employed. This optimization framework includes an in-node mapper and reducer algorithm to generate summaries for important aspects identified by the language model. This enhances the quality of the summary, leading to improved accuracy, particularly when applied to electrical power system documents. By leveraging the strengths of natural language models and parallel data processing techniques, this model presents an opportunity to achieve better text summary generation. The performance metric used is accuracy, measured with the ROUGE tool, incorporating precision, recall, and F-measure. The proposed model demonstrates a 6% improvement in scores compared to state-of-the-art techniques. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
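The in-node mapper/reducer in result 26 is more elaborate than can be shown here, but the MapReduce flavor of aspect-oriented sentence scoring can be sketched with toy functions. All names and the keyword-overlap score below are illustrative assumptions.

```python
from heapq import nlargest

def mapper(shard, keywords):
    """Emit (score, sentence) pairs: keyword overlap as a toy aspect score."""
    for sentence in shard:
        tokens = [w.lower().strip(".,") for w in sentence.split()]
        yield sum(t in keywords for t in tokens), sentence

def reducer(scored, k=2):
    """Keep the k highest-scoring sentences as the aspect summary."""
    return [sentence for _, sentence in nlargest(k, scored)]

shard = ["Voltage regulation keeps the power grid stable.",
         "Lunch was served in the control room at noon.",
         "Transformer maintenance reduces grid outage risk."]
aspect_keywords = {"voltage", "power", "grid", "transformer", "outage"}
print(reducer(list(mapper(shard, aspect_keywords))))
```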
27. A System to Support Readers in Automatically Acquiring Complete Summarized Information on an Event from Different Sources.
- Author
-
Dell'Oglio, Pietro, Bondielli, Alessandro, and Marcelloni, Francesco
- Subjects
- *
TEXT summarization , *TRANSFORMER models , *NATURAL language processing , *LANGUAGE models , *ELECTRONIC newspapers , *CLASSIFICATION algorithms , *SOCIAL media - Abstract
Today, most newspapers utilize social media to disseminate news. On the one hand, this results in an overload of related articles for social media users. On the other hand, since social media tends to form echo chambers around their users, different opinions and information may be hidden. Enabling users to access different information (possibly outside of their echo chambers, without the burden of reading entire articles, often containing redundant information) may be a step forward in allowing them to form their own opinions. To address this challenge, we propose a system that integrates Transformer neural models and text summarization models along with decision rules. Given a reference article already read by the user, our system first collects articles related to the same topic from a configurable number of different sources. Then, it identifies and summarizes the information that differs from the reference article and outputs the summary to the user. The core of the system is the sentence classification algorithm, which classifies sentences in the collected articles into three classes based on similarity with the reference article: sentences classified as dissimilar are summarized by using a pre-trained abstractive summarization model. We evaluated the proposed system in two steps. First, we assessed its effectiveness in identifying content differences between the reference article and the related articles by using human judgments obtained through crowdsourcing as ground truth. We obtained an average F1 score of 0.772 against average F1 scores of 0.797 and 0.676 achieved by two state-of-the-art approaches based, respectively, on model tuning and prompt tuning, which require an appropriate tuning phase and, therefore, greater computational effort. Second, we asked a sample of people to evaluate how well the summary generated by the system represents the information that is not present in the article read by the user. The results are extremely encouraging. Finally, we present a use case. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
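The sentence classifier at the core of result 27 assigns each sentence of a related article to a similarity class against the reference article. A hedged sketch using sentence embeddings and cosine similarity; the thresholds, labels, and embedding model are illustrative assumptions, not the paper's tuned values.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative encoder choice

def classify(sentence, reference_sents):
    """Label a sentence by its best cosine match against the reference article."""
    sims = util.cos_sim(model.encode([sentence], convert_to_tensor=True),
                        model.encode(reference_sents, convert_to_tensor=True))
    best = sims.max().item()
    if best > 0.8:
        return "similar"            # information already in the reference
    if best > 0.5:
        return "partially similar"
    return "dissimilar"             # candidate for the difference summary

print(classify("The mayor resigned on Friday.",
               ["The mayor announced his resignation.",
                "Rain is expected on Sunday."]))
```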
28. Leveraging BERT for extractive text summarization on federal police documents.
- Author
-
Barros, Thierry S., Pires, Carlos Eduardo S., and Nascimento, Dimas Cassimiro
- Subjects
LANGUAGE models ,TEXT summarization ,ARTIFICIAL neural networks ,NATURAL language processing ,CRIMINAL investigation - Abstract
A document known as notitia criminis (NC) is used in the Brazilian Federal Police as the starting point of a criminal investigation. An NC aims to report a summary of investigative activities. Thus, it contains all relevant information about a supposed crime that occurred. To manage an inquiry and correlate similar investigations, the Federal Police usually needs to extract essential information from an NC document. Manual extraction (reading and understanding the entire content) can be mentally exhausting, due to the size and complexity of the documents. In this light, natural language processing (NLP) techniques are commonly used for automatic information extraction from textual documents. Deep neural networks have been successfully applied to many different NLP tasks. A neural network model that improved results across a wide range of NLP tasks was the BERT model, an acronym for Bidirectional Encoder Representations from Transformers. In this article, we propose approaches based on the BERT model to extract relevant information from textual documents using automatic text summarization techniques. In other words, we aim to analyze the feasibility of using the BERT model to extract and synthesize the most essential information of an NC document. We evaluate the performance of the proposed approaches using two real-world datasets: the Federal Police dataset (a private-domain dataset) and the Brazilian WikiHow dataset (a public-domain dataset). Experimental results using different variants of the ROUGE metric show that our approaches can significantly increase extractive text summarization effectiveness without sacrificing efficiency. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. From task to evaluation: an automatic text summarization review.
- Author
-
Lu, Lingfeng, Liu, Yang, Xu, Weiqiang, Li, Huakang, and Sun, Guozi
- Subjects
TEXT summarization ,AUTOMATIC summarization ,TASK analysis ,LANGUAGE models - Abstract
Automatic summarization is attracting increasing attention as one of the most promising research areas. The technology has been tried in various real-world applications in recent years and has been well received. However, conventional evaluation metrics cannot keep up with rapidly evolving summarization task formats and the indicators they require. After recent years of research, automatic summarization tasks require not only readability and fluency, but also informativeness and consistency. Diversified application scenarios also bring new challenges both for generative language models and for evaluation metrics. In this review, we analyze and specifically focus on the differences between task formats and evaluation metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Enhanced TextRank using weighted word embedding for text summarization.
- Author
-
Yulianti, Evi, Pangestu, Nicholas, and Jiwanggi, Meganingrum Arista
- Subjects
TEXT summarization ,LANGUAGE models - Abstract
The length of a news article may influence people's interest in reading it. In this case, text summarization can help create a shorter, representative version of an article to reduce reading time. This paper proposes using weighted word embeddings based on Word2Vec, FastText, and bidirectional encoder representations from transformers (BERT) models to enhance the TextRank summarization algorithm. The use of weighted word embeddings is aimed at creating better sentence representations, in order to produce more accurate summaries. The results show that using (unweighted) word embeddings significantly improves the performance of the TextRank algorithm, with the best performance gained by the summarization system using BERT word embeddings. When each word embedding is weighted using term frequency-inverse document frequency (TF-IDF), the performance of all systems using unweighted word embeddings further improves significantly, with the biggest improvement achieved by the systems using Word2Vec (a 6.80% to 12.92% increase) and FastText (a 7.04% to 12.78% increase). Overall, our systems using weighted word embeddings can outperform the TextRank method by up to 17.33% in ROUGE-1 and 30.01% in ROUGE-2. This demonstrates the effectiveness of weighted word embeddings in the TextRank algorithm for text summarization. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
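The weighted-embedding TextRank of result 30 can be sketched end to end: sentence vectors are TF-IDF-weighted averages of word vectors, and PageRank over the cosine-similarity graph ranks sentences. The 50-dimensional random vectors below stand in for the Word2Vec/FastText/BERT embeddings the paper actually uses.

```python
import numpy as np
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer

def rank_sentences(sentences, word_vecs, dim=50):
    """Rank sentence indices by PageRank over TF-IDF-weighted embeddings."""
    tfidf = TfidfVectorizer()
    weights = tfidf.fit_transform(sentences).toarray()
    vocab = tfidf.get_feature_names_out()
    sent_vecs = []
    for row in weights:
        vec = sum(w * word_vecs.get(t, np.zeros(dim)) for w, t in zip(row, vocab))
        sent_vecs.append(vec / (np.linalg.norm(vec) + 1e-8))
    sim = np.clip(np.array(sent_vecs) @ np.array(sent_vecs).T, 0.0, None)
    scores = nx.pagerank(nx.from_numpy_array(sim))
    return sorted(scores, key=scores.get, reverse=True)

sentences = ["The match ended in a dramatic draw.",
             "Fans celebrated the draw outside the stadium.",
             "Ticket prices will rise next season."]
rng = np.random.default_rng(0)
word_vecs = {}
for s in sentences:
    for w in s.lower().strip(".").split():
        word_vecs.setdefault(w, rng.normal(size=50))
print(rank_sentences(sentences, word_vecs))  # sentence indices, best first
```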
31. Turkish abstractive text summarization using pretrained sequence-to-sequence models.
- Author
-
Baykara, Batuhan and Güngör, Tunga
- Subjects
TEXT summarization ,LANGUAGE models ,TURKISH language ,ENGLISH language - Abstract
The tremendous increase in the number of documents available on the Web has turned finding the relevant piece of information into a challenging, tedious, and time-consuming activity. Accordingly, automatic text summarization has become an important field of study, gaining significant attention from researchers. Lately, with the advances in deep learning, neural abstractive text summarization with sequence-to-sequence (Seq2Seq) models has gained popularity. There have been many improvements in these models, such as the use of pretrained language models (e.g., GPT, BERT, and XLM) and pretrained Seq2Seq models (e.g., BART and T5). These improvements have addressed certain shortcomings in neural summarization and challenges such as saliency, fluency, and semantics, enabling the generation of higher-quality summaries. Unfortunately, these research attempts were mostly limited to the English language. Monolingual BERT models and multilingual pretrained Seq2Seq models have been released recently, providing the opportunity to utilize such state-of-the-art models in low-resource languages such as Turkish. In this study, we make use of pretrained Seq2Seq models and obtain state-of-the-art results on the two large-scale Turkish datasets, TR-News and MLSum, for the text summarization task. Then, we utilize the title information in the datasets and establish hard baselines for the title generation task on both datasets. We show that the input to the models has a substantial amount of importance for the success of such tasks. Additionally, we provide extensive analysis of the models, including cross-dataset evaluations, various text generation options, and the effect of preprocessing in ROUGE evaluations for Turkish. It is shown that the monolingual BERT models outperform the multilingual BERT models on all tasks across all the datasets. Lastly, qualitative evaluations of the generated summaries and titles of the models are provided. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
32. A Comprehensive Study of ChatGPT: Advancements, Limitations, and Ethical Considerations in Natural Language Processing and Cybersecurity.
- Author
-
Alawida, Moatsum, Mejri, Sami, Mehmood, Abid, Chikhaoui, Belkacem, and Isaac Abiodun, Oludare
- Subjects
- *
CHATGPT , *NATURAL language processing , *LANGUAGE models , *TEXT summarization , *INTERNET security , *HAZARD mitigation - Abstract
This paper presents an in-depth study of ChatGPT, a state-of-the-art language model that is revolutionizing generative text. We provide a comprehensive analysis of its architecture, training data, and evaluation metrics and explore its advancements and enhancements over time. Additionally, we examine the capabilities and limitations of ChatGPT in natural language processing (NLP) tasks, including language translation, text summarization, and dialogue generation. Furthermore, we compare ChatGPT to other language generation models and discuss its applicability in various tasks. Our study also addresses the ethical and privacy considerations associated with ChatGPT and provides insights into mitigation strategies. Moreover, we investigate the role of ChatGPT in cyberattacks, highlighting potential security risks. Lastly, we showcase the diverse applications of ChatGPT in different industries and evaluate its performance across languages and domains. This paper offers a comprehensive exploration of ChatGPT's impact on the NLP field. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. Enhancing Abstractive Summarization with Extracted Knowledge Graphs and Multi-Source Transformers.
- Author
-
Chen, Tong, Wang, Xuewei, Yue, Tianwei, Bai, Xiaoyu, Le, Cindy X., and Wang, Wenping
- Subjects
TEXT summarization ,KNOWLEDGE graphs ,LANGUAGE models ,CHATGPT - Abstract
As the popularity of large language models (LLMs) has risen over the course of the last year, led by GPT-3/4 and especially its productization as ChatGPT, we have witnessed the extensive application of LLMs to text summarization. However, LLMs do not intrinsically have the power to verify the correctness of the information they supply and generate. This research introduces a novel approach to abstractive summarization, aiming to address this limitation of LLMs, namely their difficulty in discerning factual truth. The proposed method leverages extracted knowledge graph information and structured semantics as a guide for summarization. Building upon BART, one of the state-of-the-art sequence-to-sequence pre-trained LLMs, multi-source transformer modules are developed as an encoder, which are capable of processing textual and graphical inputs. Decoding is performed based on this enriched encoding to enhance the summary quality. The Wiki-Sum dataset, derived from Wikipedia text dumps, is introduced for evaluation purposes. Comparative experiments with baseline models demonstrate the strengths of the proposed approach in generating informative and relevant summaries. We conclude by presenting our insights into utilizing LLMs with graph external information, which will become a powerful aid towards the goal of factually correct and verified LLMs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
34. Abstractive vs. Extractive Summarization: An Experimental Review.
- Author
-
Giarelis, Nikolaos, Mastrokostas, Charalampos, and Karacapilidis, Nikos
- Subjects
LANGUAGE models ,TEXT summarization ,COMPUTATIONAL linguistics ,COMPARATIVE method ,NATURAL language processing ,LITERATURE reviews ,DEEP learning - Abstract
Text summarization is a subtask of natural language processing referring to the automatic creation of a concise and fluent summary that captures the main ideas and topics from one or multiple documents. Earlier literature surveys focus on extractive approaches, which rank the top-n most important sentences in the input document and then combine them to form a summary. As argued in the literature, the summaries of these approaches do not have the same lexical flow or coherence as summaries produced manually by humans. Newer surveys elaborate on abstractive approaches, which generate a summary with potentially new phrases and sentences compared to the input document. Generally speaking, contrary to the extractive approaches, the abstractive ones create summaries that are more similar to those produced by humans. However, these approaches still lack the contextual representation needed to form fluent summaries. Recent advancements in deep learning and pretrained language models led to the improvement of many natural language processing tasks, including abstractive summarization. Overall, these surveys do not present a comprehensive evaluation framework that assesses the aforementioned approaches. Taking the above into account, the contribution of this survey is fourfold: (i) we provide a comprehensive survey of the state-of-the-art approaches in text summarization; (ii) we conduct a comparative evaluation of these approaches, using well-known datasets from the related literature, as well as popular evaluation scores such as ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-LSUM, BLEU-1, BLEU-2 and SACREBLEU; (iii) we report on insights gained on various aspects of the text summarization process, including existing approaches, datasets and evaluation methods, and we outline a set of open issues and future research directions; (iv) we upload the datasets and the code used in our experiments to a public repository, aiming to increase the reproducibility of this work and facilitate future research in the field. [ABSTRACT FROM AUTHOR] A minimal ROUGE scoring sketch follows this record.
- Published
- 2023
- Full Text
- View/download PDF
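Since the survey's comparative evaluation leans on ROUGE and BLEU scores, a minimal example of computing ROUGE with the rouge-score package may make the metrics concrete. The package choice is an assumption (the authors' exact tooling is not stated here), and the texts are toy data.

```python
# A minimal sketch of automatic summary evaluation with ROUGE,
# using the rouge-score package (pip install rouge-score).
from rouge_score import rouge_scorer

reference = "the cat sat on the mat"
candidate = "a cat was sitting on the mat"

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
scores = scorer.score(reference, candidate)  # (target, prediction) order
for name, s in scores.items():
    print(f"{name}: precision={s.precision:.3f} "
          f"recall={s.recall:.3f} f1={s.fmeasure:.3f}")
```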
35. What is ChatGPT and how will it impact project management?
- Author
-
Reid, Lawrence
- Subjects
PROJECT managers ,CHATGPT ,PROJECT management ,LANGUAGE models ,TEXT summarization - Published
- 2023
36. Multilingual Text Summarization for German Texts Using Transformer Models.
- Author
-
Alcantara, Tomas Humberto Montiel, Krütli, David, Ravada, Revathi, and Hanne, Thomas
- Subjects
- *
TEXT summarization , *NATURAL language processing , *LANGUAGE models , *GERMAN language , *ENGLISH language - Abstract
The tremendous increase in documents available on the Web has turned finding relevant pieces of information into a challenging, tedious, and time-consuming activity. Automatic text summarization is an important natural language processing (NLP) task that reduces reading effort by creating a shorter, coherent version of a text document that maintains its most relevant information. In recent years, automatic text summarization has received significant attention, as it can be applied to a wide range of applications such as the extraction of highlights from scientific papers or the generation of summaries of news articles. This research project focuses mainly on abstractive text summarization, which extracts the most important content from a text in rephrased form. Its main purpose is to summarize texts in German. Unfortunately, most pretrained models are available only for English. We therefore focused on the multilingual German BERT model and the monolingual English BART model, with consideration of translation possibilities. For the experimental setup, we took a German Wikipedia article dataset and compared how well the multilingual model performed on German text summarization against machine-translated summaries from monolingual English language models. We used the ROUGE-1 metric to assess summarization quality. [ABSTRACT FROM AUTHOR] A minimal sketch of the translate-then-summarize route follows this record.
- Published
- 2023
- Full Text
- View/download PDF
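The translate-then-summarize route this study compares against a multilingual model can be sketched with public checkpoints. The pipeline below assumes Hugging Face transformers with Helsinki-NLP translation models and a BART summarizer; these model names are illustrative stand-ins, not necessarily the ones the authors used.

```python
# A minimal sketch of translate -> summarize -> back-translate for German
# input; model names are common public checkpoints, assumed for illustration.
from transformers import pipeline

de_to_en = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")
en_to_de = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
summarize = pipeline("summarization", model="facebook/bart-large-cnn")

german_text = "..."  # German Wikipedia article text

english = de_to_en(german_text, max_length=512)[0]["translation_text"]
summary_en = summarize(english, max_length=128,
                       min_length=30)[0]["summary_text"]
german_summary = en_to_de(summary_en, max_length=256)[0]["translation_text"]
print(german_summary)  # compare against the reference summary with ROUGE-1
```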
37. A Mathematical Interpretation of Autoregressive Generative Pre-Trained Transformer and Self-Supervised Learning.
- Author
-
Lee, Minhyeok
- Subjects
- *
LANGUAGE models , *TEXT summarization , *QUESTION answering systems , *NATURAL language processing , *INVERSE functions , *AUTOREGRESSIVE models , *NATURAL languages - Abstract
In this paper, we present a rigorous mathematical examination of generative pre-trained transformer (GPT) models and their autoregressive self-supervised learning mechanisms. We begin by defining natural language space and knowledge space, two key concepts for understanding the dimensionality reduction process in GPT-based large language models (LLMs). By exploring projection functions and their inverses, we establish a framework for analyzing the language generation capabilities of these models. We then investigate the GPT representation space, examining its implications for the models' approximation properties. Finally, we discuss the limitations and challenges of GPT models and their learning mechanisms, considering trade-offs between complexity and generalization, as well as the implications of incomplete inverse projection functions. Our findings demonstrate that GPT models possess the capability to encode knowledge into low-dimensional vectors through their autoregressive self-supervised learning mechanism. This comprehensive analysis provides a solid mathematical foundation for future advancements in GPT-based LLMs and promises improvements in natural language processing tasks such as language translation, text summarization, and question answering through better understanding and optimization of model training and performance. [ABSTRACT FROM AUTHOR] A brief notational sketch follows this record.
- Published
- 2023
- Full Text
- View/download PDF
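The projection framework described above can be condensed into a few lines of notation. The symbols below are our own shorthand for the paper's constructs, not necessarily the author's exact notation.

```latex
% Natural language space \mathcal{L} and knowledge space \mathcal{K},
% with \dim\mathcal{K} \ll \dim\mathcal{L} (dimensionality reduction):
\pi : \mathcal{L} \to \mathcal{K},
\qquad
\pi^{-1} : \mathcal{K} \to \mathcal{L} \quad \text{(generally incomplete)}
% Autoregressive self-supervised training over token sequences x_{1:T}:
\max_{\theta} \; \sum_{t=1}^{T} \log p_{\theta}\!\left( x_t \mid x_{<t} \right)
```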
38. A Multitask Cross-Lingual Summary Method Based on ABO Mechanism.
- Author
-
Li, Qing, Wan, Weibing, and Zhao, Yuming
- Subjects
LANGUAGE models ,TEXT summarization - Abstract
Recent cross-lingual summarization research has pursued unified end-to-end models, which have demonstrated a certain level of improvement in performance and effectiveness, but this approach stitches together multiple tasks and makes the computation more complex. Less work has focused on alignment relationships across languages, which has led to persistent problems of summary misordering and loss of key information. For this reason, we first simplify the multitasking by converting the translation task into an equal proportion of cross-lingual summary tasks, so that the model performs only cross-lingual summary tasks when generating cross-lingual summaries. In addition, we splice monolingual and cross-lingual summary sequences into a single input so that the model can fully learn the core content of the corpus. We then propose a reinforced regularization method to improve the model's robustness, and build a targeted ABO mechanism to enhance semantic-relationship alignment and key-information retention in the cross-lingual summaries. Ablation experiments on three datasets of different orders of magnitude demonstrate how the optimization approach enhances the model; it outperforms mainstream approaches on both the cross-lingual and monolingual summarization tasks on the full dataset. Finally, we validate the model's capabilities on a cross-lingual summary dataset of professional domains, and the results demonstrate its superior performance and ability to improve cross-lingual sequencing. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. A Multi-Granularity Heterogeneous Graph for Extractive Text Summarization.
- Author
-
Zhao, Henghui, Zhang, Wensheng, Huang, Mengxing, Feng, Siling, and Wu, Yuanyuan
- Subjects
TEXT summarization ,LANGUAGE models - Abstract
Extractive text summarization selects the most important sentences from a document, preserves their original meaning, and produces an objective and fact-based summary. It is faster and less computationally intensive than abstractive summarization techniques. Learning cross-sentence relationships is crucial for extractive text summarization. However, most of the language models currently in use process text data sequentially, which makes it difficult to capture such inter-sentence relations, especially in long documents. This paper proposes an extractive summarization model based on a graph neural network (GNN) to address this problem. The model effectively represents cross-sentence relationships using a graph-structured document representation. In addition to sentence nodes, we introduce two node types of different granularity into the graph structure, words and topics, which bring different levels of semantic information. The node representations are updated by a graph attention network (GAT). The final summary is obtained by binary classification of the sentence nodes. Our text summarization method was demonstrated to be highly effective, as supported by the results of our experiments on the CNN/DM and NYT datasets. Specifically, our approach outperformed baseline models of the same type in terms of ROUGE scores on both datasets, indicating the potential of our proposed model for enhancing text summarization tasks. [ABSTRACT FROM AUTHOR] A simplified GAT scoring sketch follows this record.
- Published
- 2023
- Full Text
- View/download PDF
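The paper's graph is heterogeneous, with word and topic nodes alongside sentences; as a simplified homogeneous stand-in, the sketch below scores sentence nodes with two GAT layers and a binary head. It assumes PyTorch Geometric, and the dimensions and toy graph are illustrative.

```python
# A simplified, homogeneous sketch of GAT-based extractive scoring
# (not the paper's multi-granularity heterogeneous graph).
import torch
from torch_geometric.nn import GATConv

class SentenceScorer(torch.nn.Module):
    def __init__(self, in_dim=768, hid_dim=128, heads=4):
        super().__init__()
        self.gat1 = GATConv(in_dim, hid_dim, heads=heads)
        self.gat2 = GATConv(hid_dim * heads, hid_dim, heads=1)
        self.classify = torch.nn.Linear(hid_dim, 1)  # keep/drop per sentence

    def forward(self, x, edge_index):
        h = torch.relu(self.gat1(x, edge_index))
        h = torch.relu(self.gat2(h, edge_index))
        return torch.sigmoid(self.classify(h)).squeeze(-1)

# Toy graph: 5 sentence nodes, edges linking related sentences.
x = torch.randn(5, 768)                       # e.g., sentence embeddings
edge_index = torch.tensor([[0, 1, 1, 2, 3],   # source nodes
                           [1, 0, 2, 1, 4]])  # target nodes
scores = SentenceScorer()(x, edge_index)      # select top-k as the summary
```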
40. Adapting Static and Contextual Representations for Policy Gradient-Based Summarization.
- Author
-
Lin, Ching-Sheng, Jwo, Jung-Sing, and Lee, Cheng-Hsiung
- Subjects
- *
LANGUAGE models , *TEXT summarization , *EVIDENCE gaps , *REINFORCEMENT learning , *ELECTRONIC records , *GENERATIVE pre-trained transformers - Abstract
Considering the ever-growing volume of electronic documents made available in our daily lives, the need for an efficient tool to capture their gist increases as well. Automatic text summarization, the process of shortening long text and extracting valuable information, has been of great interest for decades. Due to the difficulties of semantic understanding and the requirement of large training data, the development of this research field is still challenging and worth investigating. In this paper, we propose an automated text summarization approach that adapts static and contextual representations within an extractive framework to address these research gaps. To better capture the semantics of the given text, we explore the combination of static embeddings from GloVe (Global Vectors) and contextual embeddings from BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer) based models. To reduce human annotation costs, we employ policy-gradient reinforcement learning to perform unsupervised training. We conduct empirical studies on the public dataset Gigaword. The experimental results show that our approach achieves promising performance and is competitive with various state-of-the-art approaches. [ABSTRACT FROM AUTHOR] A minimal embedding-fusion sketch follows this record.
- Published
- 2023
- Full Text
- View/download PDF
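The two ingredients above, embedding fusion and policy-gradient training, can be sketched in a few lines. The code below concatenates BERT contextual states with a stand-in static table (pretrained GloVe vectors would replace it) and shows the shape of a REINFORCE-style loss; the scalar policy output and reward are placeholders, not the authors' extractor.

```python
# A minimal sketch of fusing static (GloVe-style) and contextual (BERT)
# token representations, with a REINFORCE-style update; stand-in values only.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

sentence = "automatic text summarization shortens long documents"
enc = tokenizer(sentence, return_tensors="pt")
contextual = bert(**enc).last_hidden_state        # [1, seq_len, 768]

# Stand-in static table: in practice, load pretrained GloVe vectors here.
glove = torch.nn.Embedding(tokenizer.vocab_size, 300)
static = glove(enc["input_ids"])                  # [1, seq_len, 300]

fused = torch.cat([contextual, static], dim=-1)   # [1, seq_len, 1068]

# REINFORCE-style objective: scale the log-probability of a sampled
# extraction action by a summary-quality reward (e.g., a ROUGE score).
log_prob = torch.log(torch.sigmoid(fused.mean()))  # placeholder policy output
reward = 0.42                                      # placeholder reward
loss = -log_prob * reward
loss.backward()
```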
41. Guest Editorial: AI and the news.
- Author
-
Opdahl, Andreas L., Helberger, Natali, and Diakopoulos, Nicholas
- Subjects
ARTIFICIAL intelligence ,GENERATIVE artificial intelligence ,LANGUAGE models ,TEXT summarization ,SOCIAL media - Abstract
This article discusses the role of Artificial Intelligence (AI) in supporting the production and distribution of high-quality news in the face of challenges such as misinformation and declining trust in institutions. The article explores how AI, including Machine Learning and generative AI, can be used in various stages of news production and dissemination. It also highlights the importance of considering factors beyond technology, such as organizational and economic contexts, in order to successfully harness the potential of AI in the media sector. The article presents several papers that examine different aspects of AI's impact on news, including its influence on audience behavior, financial reporting, local journalism, news recommendation systems, and misinformation correction. The authors emphasize the need for ethical considerations and collaboration among stakeholders to ensure the benefits of AI in supporting quality media production. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
42. Evaluation of Automatic Legal Text Summarization Techniques for Greek Case Law.
- Author
-
Koniaris, Marios, Galanis, Dimitris, Giannini, Eugenia, and Tsanakas, Panayiotis
- Subjects
- *
TEXT summarization , *AUTOMATIC summarization , *JUDGE-made law , *LANGUAGE models , *LEGAL documents , *METADATA , *TAGS (Metadata) - Abstract
The increasing amount of legal information available online is overwhelming for both citizens and legal professionals, making it difficult and time-consuming to find relevant information and keep up with the latest legal developments. Automatic text summarization techniques can be highly beneficial, as they save time, reduce costs, and lessen the cognitive load of legal professionals. However, applying these techniques to legal documents poses several challenges due to the complexity of legal documents and the lack of needed resources, especially in linguistically under-resourced languages such as Greek. In this paper, we address automatic summarization of Greek legal documents. A major challenge in this area is the lack of suitable datasets in the Greek language. In response, we developed a new metadata-rich dataset consisting of selected judgments from the Supreme Civil and Criminal Court of Greece, alongside their reference summaries and category tags, tailored for the purpose of automated legal document summarization. We also adopted several state-of-the-art methods for abstractive and extractive summarization and conducted a comprehensive evaluation of the methods using both human and automatic metrics. Our results (i) revealed that, while extractive methods exhibit average performance, abstractive methods generate moderately fluent and coherent text but tend to receive low scores on relevance and consistency metrics; (ii) indicated the need for metrics that better capture a legal document summary's coherence, relevance, and consistency; and (iii) demonstrated that fine-tuning BERT models on a specific upstream task can significantly improve the model's performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
43. Attribute-Sentiment-Guided Summarization of User Opinions From Online Reviews.
- Author
-
Han, Yi, Nanda, Gaurav, and Moghaddam, Mohsen
- Subjects
- *
TEXT summarization , *LANGUAGE models , *CONSUMERS' reviews , *NATURAL language processing , *PRODUCT attributes - Abstract
Eliciting informative user opinions from online reviews is a key success factor for innovative product design and development. The unstructured, noisy, and verbose nature of user reviews, however, often complicates large-scale need finding in a format useful for designers without losing important information. Recent advances in abstractive text summarization have created the opportunity to systematically generate opinion summaries from online reviews to inform the early stages of product design and development. However, two knowledge gaps hinder the applicability of opinion summarization methods in practice. First, there is a lack of formal mechanisms to guide the generative process with respect to different categories of product attributes and user sentiments. Second, the annotated training datasets needed for supervised training of abstractive summarization models are often difficult and costly to create. This article addresses these gaps by (1) devising an efficient computational framework for abstractive opinion summarization guided by specific product attributes and sentiment polarities, and (2) automatically generating a synthetic training dataset that captures various degrees of granularity and polarity. A hierarchical multi-instance attribute-sentiment inference model is developed for assembling a high-quality synthetic dataset, which is used to fine-tune a pretrained language model for abstractive summary generation. Numerical experiments conducted on a large dataset scraped from three major e-commerce retail stores for apparel and footwear products indicate the performance, feasibility, and potential of the developed framework. Several directions are provided for future exploration in the area of automated opinion summarization for user-centered design. [ABSTRACT FROM AUTHOR] A minimal control-prefix sketch follows this record.
- Published
- 2023
- Full Text
- View/download PDF
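One way to realize attribute- and sentiment-guided generation is a control prefix prepended to the input, which a model fine-tuned on prefixed synthetic examples would learn to follow. The sketch below assumes that setup with an off-the-shelf BART checkpoint; the prefix format is an illustrative assumption, not the authors' exact mechanism, and an unadapted model will not obey it.

```python
# A minimal sketch of attribute/sentiment guidance via a control prefix;
# prefix format and model are illustrative assumptions.
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

reviews = "The sole wore out in a month, but the fit was perfect..."
prefix = "attribute: durability | sentiment: negative | "  # guidance signal

inputs = tokenizer(prefix + reviews, return_tensors="pt", truncation=True)
ids = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```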
44. Efficient Memory-Enhanced Transformer for Long-Document Summarization in Low-Resource Regimes.
- Author
-
Moro, Gianluca, Ragazzi, Luca, Valgimigli, Lorenzo, Frisoni, Giacomo, Sartori, Claudio, and Marfia, Gustavo
- Subjects
- *
MNEMONICS , *LANGUAGE models , *TEXT summarization - Abstract
Long-document summarization poses obstacles to current generative transformer-based models because of the broad context they must process and understand. Indeed, detecting long-range dependencies is still challenging for today's state-of-the-art solutions, which usually require expanding the model at the cost of an unsustainable demand for computing and memory capacity. This paper introduces Emma, a novel efficient memory-enhanced transformer-based architecture. By segmenting a lengthy input into multiple text fragments, our model stores and compares the current chunk with previous ones, gaining the capability to read and comprehend the entire context over the whole document with a fixed amount of GPU memory. This method enables the model to deal with theoretically infinitely long documents, using less than 18 GB of memory for training and 13 GB for inference. We conducted extensive performance analyses and demonstrate that Emma achieves competitive results on two datasets from different domains while consuming significantly less GPU memory than competitors, even in low-resource settings. [ABSTRACT FROM AUTHOR] A minimal chunk-and-memory sketch follows this record.
- Published
- 2023
- Full Text
- View/download PDF
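The chunk-and-remember strategy can be sketched as follows: encode fixed-size fragments one at a time and keep a bounded list of pooled fragment representations, so memory use stays constant regardless of document length. All details below (the BERT encoder, mean pooling, the slot count) are illustrative assumptions, not Emma's actual architecture.

```python
# A minimal sketch of chunked encoding with a bounded memory of pooled
# fragment representations; details are assumptions, not Emma itself.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def compress_context(document: str, chunk_tokens=256, memory_slots=8):
    ids = tokenizer(document, return_tensors="pt")["input_ids"][0]
    memory = []  # pooled representation of each fragment seen so far
    for start in range(0, len(ids), chunk_tokens):
        chunk = ids[start:start + chunk_tokens].unsqueeze(0)
        with torch.no_grad():
            hidden = encoder(chunk).last_hidden_state  # [1, len, 768]
        memory.append(hidden.mean(dim=1))              # mean-pool the fragment
        memory = memory[-memory_slots:]                # keep memory bounded
    # Compact, fixed-size context a decoder could attend to:
    return torch.cat(memory, dim=0)
```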
45. A Multi-Attention-Mechanism Model for Evaluating the Factual Consistency of Text Summaries.
- Author
-
魏楚元, 张鑫贤, 王致远, 李金哲, and 刘杰
- Subjects
LANGUAGE models ,TEXT summarization ,NATURAL languages ,INFORMATION resources ,ALGORITHMS - Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
46. ARNLI: ARABIC NATURAL LANGUAGE INFERENCE ENTAILMENT AND CONTRADICTION DETECTION.
- Author
-
AL JALLAD, KHLOUD and GHNEIM, NADA
- Subjects
NATURAL languages ,LANGUAGE models ,ARABIC language ,NATURAL language processing ,TEXT summarization ,SENTENCES (Grammar) - Abstract
Natural language inference (NLI) is a hot research topic in natural language processing; contradiction detection between sentences is a special case of NLI. It is considered a difficult NLP task that has significant influence when added as a component to many NLP applications, such as question-answering systems and text summarization. Arabic is one of the most challenging low-resource languages for detecting contradictions due to its rich lexical and semantic ambiguity. We have created a dataset of more than 12k sentences, named ArNLI, which will be publicly available. Moreover, we have applied a new model inspired by Stanford's proposed contradiction-detection solutions for the English language. We propose an approach for detecting contradictions between pairs of sentences in Arabic using a contradiction vector combined with a language-model vector as input to a machine-learning model. We analyzed the results of different traditional machine-learning classifiers and compared their results on our created dataset (ArNLI) and on automatic translations of both the PHEME and SICK English datasets. The best results were achieved by the random forest classifier, with accuracies of 0.99, 0.60, and 0.75 on PHEME, SICK, and ArNLI respectively. [ABSTRACT FROM AUTHOR] A minimal classifier sketch follows this record.
- Published
- 2023
- Full Text
- View/download PDF
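The final classification stage maps cleanly onto scikit-learn. The sketch below concatenates a contradiction-feature vector with a sentence-pair embedding and trains the random forest the study found strongest; the feature arrays and dimensions are random stand-ins for illustration.

```python
# A minimal sketch of the classification stage: hand-crafted contradiction
# features + language-model sentence-pair vectors -> random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

n_pairs = 1000
contradiction_feats = np.random.rand(n_pairs, 10)  # e.g., negation, antonymy
lm_embeddings = np.random.rand(n_pairs, 300)       # sentence-pair vectors
X = np.hstack([contradiction_feats, lm_embeddings])
y = np.random.randint(0, 2, n_pairs)               # 1 = contradiction

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```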
47. Prompt-Based Word-Level Information Injection BERT for Chinese Named Entity Recognition.
- Author
-
He, Qiang, Chen, Guowei, Song, Wenchao, and Zhang, Pengzhou
- Subjects
CHINESE language ,LANGUAGE models ,TEXT summarization ,NATURAL language processing ,DATA mining ,ENGINEERS - Abstract
Named entity recognition (NER) is a subfield of natural language processing (NLP) that identifies and classifies entities in plain text, such as people, organizations, and locations. NER is a fundamental task in information extraction, information retrieval, and text summarization, as it helps to organize relevant information in a structured way. Current approaches to Chinese named entity recognition do not consider the category information of matched Chinese words, which limits their ability to capture correlations between words. This makes Chinese NER more challenging than English NER, which benefits from well-defined word boundaries. To improve Chinese NER, it is necessary to develop new approaches that take into account the category features of matched Chinese words; this category information helps to effectively capture the relationships between words. This paper proposes a Prompt-based Word-level Information Injection BERT (PWII-BERT) to integrate prompt-guided lexicon information into a pre-trained language model. Specifically, we engineer a Word-level Information Injection Adapter (WIIA) through the original Transformer encoder and prompt-guided Transformer layers. A key advantage of PWII-BERT is thus its ability to explicitly obtain fine-grained character-to-word information according to the category prompt. In experiments on four benchmark datasets, PWII-BERT outperforms the baselines, demonstrating the significance of fusing category information and lexicon features to implement Chinese NER. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Attention-Based Fine-Tuning of the BART Model for Text Summarization.
- Author
-
안영필 and 박현준
- Subjects
TEXT summarization ,LANGUAGE models - Abstract
Automatically summarizing long texts is an important technique, and BART is one of the most widely used models for the summarization task. In general, to build a summarization model for a specific domain, a language model trained on a large dataset is fine-tuned to fit that domain, usually by changing the number of nodes in the last fully connected layer. In this paper, we instead propose fine-tuning by adding attention layers, which have recently been applied to various models with good performance. To evaluate the proposed method, we conducted various experiments, such as stacking the added layers deeper and fine-tuning without skip connections. As a result, the BART model with two added attention layers and a skip connection achieves the best score. [ABSTRACT FROM AUTHOR] A minimal sketch of this layer-plus-skip setup follows this record.
- Published
- 2022
- Full Text
- View/download PDF
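The proposed layer-plus-skip fine-tuning can be sketched by stacking standard multi-head attention layers with residual connections on top of BART's outputs. The task head and dimensions below are assumptions for illustration, not the paper's exact configuration.

```python
# A minimal sketch: extra self-attention layers with skip (residual)
# connections inserted between BART and the task head during fine-tuning.
import torch
from transformers import BartModel

class AttentionTunedBart(torch.nn.Module):
    def __init__(self, num_extra_layers=2, embed_dim=768, heads=8):
        super().__init__()
        self.bart = BartModel.from_pretrained("facebook/bart-base")
        self.extra = torch.nn.ModuleList(
            torch.nn.MultiheadAttention(embed_dim, heads, batch_first=True)
            for _ in range(num_extra_layers))
        self.head = torch.nn.Linear(embed_dim, self.bart.config.vocab_size)

    def forward(self, input_ids, attention_mask=None):
        x = self.bart(input_ids,
                      attention_mask=attention_mask).last_hidden_state
        for attn in self.extra:
            out, _ = attn(x, x, x)  # self-attention over the hidden states
            x = x + out             # skip connection (best setup in the paper)
        return self.head(x)
```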
49. Deep Transformer Language Models for Arabic Text Summarization: A Comparison Study.
- Author
-
Chouikhi, Hasna and Alsuhaibani, Mohammed
- Subjects
DEEP learning ,ARABIC language ,TEXT summarization ,MACHINE learning ,NATURAL languages ,LANGUAGE models - Abstract
Large text documents are sometimes challenging to understand and time-consuming to extract vital information from. These issues are addressed by automatic text summarization techniques, which condense lengthy texts while preserving their key information. The development of automatic summarization systems capable of fulfilling the ever-increasing demands of textual data thus becomes of utmost importance, and even more so for complex natural languages. This study explores five state-of-the-art (SOTA) Arabic deep Transformer-based Language Models (TLMs) for the task of text summarization by adapting various text summarization datasets dedicated to Arabic. A comparison against deep learning and machine learning-based baseline models has also been conducted. Experimental results reveal the superiority of TLMs, specifically the PEGASUS family, over the baseline approaches, with an average F1-score of 90% on several benchmark datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
50. N-GPETS: Neural Attention Graph-Based Pretrained Statistical Model for Extractive Text Summarization.
- Author
-
Umair, Muhammad, Alam, Iftikhar, Khan, Atif, Khan, Inayat, Ullah, Niamat, and Momand, Mohammad Yusuf
- Subjects
- *
STATISTICAL models , *LINGUISTIC models , *TEXT summarization , *LANGUAGE models - Abstract
The extractive summarization approach involves selecting the source document's salient sentences to build a summary. One of the most important aspects of extractive summarization is learning and modelling cross-sentence associations. Inspired by the popularity of the pretrained Bidirectional Encoder Representations from Transformers (BERT) language model and the graph attention network (GAT), whose sophisticated structure captures inter-sentence associations, this research work proposes a novel neural model, N-GPETS, combining a heterogeneous graph attention network with the BERT model and a statistical approach using TF-IDF values for the extractive summarization task. Apart from sentence nodes, N-GPETS also works with semantic word nodes of varying granularity levels that serve as links between sentences, improving inter-sentence interaction. Furthermore, N-GPETS is made more feature-rich by integrating the graph layer with the BERT encoder at the graph-initialization step rather than employing other neural encoders such as CNNs or LSTMs. To the best of our knowledge, this work is the first attempt to combine a BERT encoder and document-level TF-IDF values with a heterogeneous attention graph structure for the extractive summarization task. Empirical outcomes on the benchmark news dataset CNN/DM show that the proposed N-GPETS achieves favorable results compared with other heterogeneous graph structures employing the BERT model and with graph structures without it. [ABSTRACT FROM AUTHOR] A minimal TF-IDF feature sketch follows this record.
- Published
- 2022
- Full Text
- View/download PDF
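The statistical side of N-GPETS, document-level TF-IDF values feeding graph node features, can be sketched with scikit-learn. In the real model these values would initialize or augment node features alongside BERT embeddings; ranking sentences by TF-IDF mass below is only an illustration of the signal itself.

```python
# A minimal sketch of sentence-level TF-IDF features of the kind N-GPETS
# combines with BERT embeddings; texts are toy data.
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "The committee approved the new budget on Monday.",
    "The budget increases funding for public schools.",
    "Weather this week will be cold and windy.",
]
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(sentences)  # [n_sentences, vocab] sparse

# Score each sentence node by the sum of its TF-IDF weights.
scores = tfidf.sum(axis=1).A1
ranked = sorted(zip(scores, sentences), reverse=True)
print(ranked[0][1])  # the most TF-IDF-salient sentence
```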