836 results for "Language Modeling"
Search Results
2. Linguistic Secret Sharing via Ambiguous Token Selection for IoT Security.
- Author
-
Gao, Kai, Horng, Ji-Hwei, Chang, Ching-Chun, and Chang, Chin-Chen
- Subjects
FINITE fields ,PRIVATE networks ,LINGUISTIC models ,DATA protection ,INTERNET of things - Abstract
The proliferation of Internet of Things (IoT) devices has introduced significant security challenges, including weak authentication, insufficient data protection, and firmware vulnerabilities. To address these issues, we propose a linguistic secret sharing scheme tailored for IoT applications. This scheme leverages neural networks to embed private data within texts transmitted by IoT devices, using an ambiguous token selection algorithm that maintains the textual integrity of the cover messages. Our approach eliminates the need to share additional information for accurate data extraction while also enhancing security through a secret sharing mechanism. Experimental results demonstrate that two steganalysis networks detect the generated steganographic text with only approximately 50% accuracy, i.e., near chance level. Additionally, the generated steganographic text preserves the semantic information of the cover text, evidenced by a BERT score of 0.948. This indicates that the proposed scheme performs well in terms of security. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
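The ambiguous-token-selection idea in the entry above can be illustrated with a toy sketch: when a language model's top candidate tokens are nearly equally probable, the choice between them can encode secret bits without noticeably degrading the cover text. The sketch below is a minimal illustration under invented toy probabilities; it is not the authors' actual neural scheme or secret sharing mechanism.

```python
# Toy sketch of ambiguity-based token selection for text steganography.
# The probability tables below are made-up stand-ins for a language model's
# next-token distribution; the paper's actual neural model is not reproduced.

def embed_bits(prefix_choices, secret_bits, ambiguity_threshold=0.1):
    """At each step, if the top-2 candidates are close in probability
    ('ambiguous'), pick between them according to the next secret bit;
    otherwise emit the most likely token."""
    bits = iter(secret_bits)
    output = []
    for candidates in prefix_choices:           # list of (token, prob), sorted desc.
        (t0, p0), (t1, p1) = candidates[0], candidates[1]
        if p0 - p1 < ambiguity_threshold:       # ambiguous position: hide one bit
            bit = next(bits, None)
            output.append(t1 if bit == 1 else t0)
        else:                                   # unambiguous: keep cover text natural
            output.append(t0)
    return " ".join(output)

steps = [
    [("device", 0.41), ("sensor", 0.39), ("node", 0.10)],   # ambiguous -> carries a bit
    [("sends", 0.70), ("posts", 0.15), ("emits", 0.05)],    # unambiguous
    [("data", 0.36), ("readings", 0.34), ("logs", 0.12)],   # ambiguous -> carries a bit
]
print(embed_bits(steps, secret_bits=[1, 0]))    # -> "sensor sends data"
```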
3. Word Embeddings as Statistical Estimators.
- Author
-
Dey, Neil, Singer, Matthew, Williams, Jonathan P., and Sengupta, Srijan
- Abstract
Word embeddings are a fundamental tool in natural language processing. Currently, word embedding methods are evaluated on the basis of empirical performance on benchmark data sets, and there is a lack of rigorous understanding of their theoretical properties. This paper studies word embeddings from a statistical theoretical perspective, which is essential for formal inference and uncertainty quantification. We propose a copula-based statistical model for text data and show that under this model, the now-classical Word2Vec method can be interpreted as a statistical estimation method for estimating the theoretical pointwise mutual information (PMI). We further illustrate the utility of this statistical model by using it to develop a missing value-based estimator as a statistically tractable and interpretable alternative to the Word2Vec approach. The estimation error of this estimator is comparable to Word2Vec and improves upon the truncation-based method proposed by Levy and Goldberg (Adv. Neural Inf. Process. Syst., 27, 2177–2185, 2014). The resulting estimator is also comparable to Word2Vec in a benchmark sentiment analysis task on the IMDb Movie Reviews data set and a part-of-speech tagging task on the OntoNotes data set. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
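The PMI interpretation highlighted in the entry above can be made concrete with a small sketch that estimates pointwise mutual information directly from co-occurrence counts; this illustrates the quantity Word2Vec implicitly estimates, not the copula-based estimator proposed in the paper. The tiny corpus below is invented for demonstration.

```python
import math
from collections import Counter

# Invented toy corpus; in practice this would be a large text collection.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat and a dog played".split(),
]

window = 2
word_counts, pair_counts, total_pairs = Counter(), Counter(), 0
for sent in corpus:
    word_counts.update(sent)
    for i, w in enumerate(sent):
        for j in range(i + 1, min(i + 1 + window, len(sent))):
            pair_counts[tuple(sorted((w, sent[j])))] += 1
            total_pairs += 1

total_words = sum(word_counts.values())

def pmi(w1, w2):
    """PMI(w1, w2) = log[ P(w1, w2) / (P(w1) P(w2)) ] from corpus counts."""
    p_pair = pair_counts[tuple(sorted((w1, w2)))] / total_pairs
    p_w1 = word_counts[w1] / total_words
    p_w2 = word_counts[w2] / total_words
    return math.log(p_pair / (p_w1 * p_w2)) if p_pair > 0 else float("-inf")

print(round(pmi("cat", "sat"), 3))   # frequently co-occurring pair
print(round(pmi("dog", "mat"), 3))   # never co-occurs within the window -> -inf
```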
4. PhosBERT: A self-supervised learning model for identifying phosphorylation sites in SARS-CoV-2-infected human cells.
- Author
-
Li, Yong, Gao, Ru, Liu, Shan, Zhang, Hongqi, Lv, Hao, and Lai, Hongyan
- Subjects
- *
SARS-CoV-2 , *LANGUAGE models , *COVID-19 , *ARTIFICIAL neural networks , *POST-translational modification , *SUPERVISED learning - Abstract
• The accurate identification of phosphorylation sites in SARS-CoV-2-infected host cells would contribute to the investigation of SARS-CoV-2 pathogenic mechanism and the mining of candidate therapy targets. • This work constructed a phosphorylation site computational approach (named PhosBERT) based on a pre-trained protein language model through self-supervised learning from the BERT architecture. • Prediction accuracy and AUC for SARS-CoV-2-infected serine and threonine phosphorylation sites are 81.9% and 0.896. • Prediction accuracy and AUC for SARS-CoV-2-infected tyrosine phosphorylation sites are 87.1% and 0.902. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a single-stranded RNA virus, which mainly causes respiratory and enteric diseases and is responsible for the outbreak of coronavirus disease 19 (COVID-19). Numerous studies have demonstrated that SARS-CoV-2 infection will lead to a significant dysregulation of protein post-translational modification profile in human cells. The accurate recognition of phosphorylation sites in host cells will contribute to a deep understanding of the pathogenic mechanisms of SARS-CoV-2 and also help to screen drugs and compounds with antiviral potential. Therefore, there is a need to develop cost-effective and high-precision computational strategies for specifically identifying SARS-CoV-2-infected phosphorylation sites. In this work, we first implemented a custom neural network model (named PhosBERT) on the basis of a pre-trained protein language model of ProtBert, which was a self-supervised learning approach developed on the Bidirectional Encoder Representation from Transformers (BERT) architecture. PhosBERT was then trained and validated on serine (S) and threonine (T) phosphorylation dataset and tyrosine (Y) phosphorylation dataset with 5-fold cross-validation, respectively. Independent validation results showed that PhosBERT could identify S/T phosphorylation sites with high accuracy and AUC (area under the receiver operating characteristic) value of 81.9% and 0.896. The prediction accuracy and AUC value of Y phosphorylation sites reached up to 87.1% and 0.902. It indicated that the proposed model was of good prediction ability and stability and would provide a new approach for studying SARS-CoV-2 phosphorylation sites. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
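A minimal sketch of the kind of setup the entry above describes: loading a pre-trained protein BERT model and scoring a candidate phosphorylation-site window. It assumes the publicly available Rostlab/prot_bert checkpoint and an untrained classification head, so it only illustrates the plumbing, not PhosBERT itself or its reported accuracy.

```python
# Sketch only: ProtBert-style sequence classification for a residue window.
# "Rostlab/prot_bert" is a public checkpoint assumed here; PhosBERT's own
# fine-tuned weights and training data are not reproduced.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = AutoModelForSequenceClassification.from_pretrained(
    "Rostlab/prot_bert", num_labels=2  # phosphorylated vs. not (head is untrained)
)

# ProtBert expects residues separated by spaces; this invented 21-residue window
# is centered on a serine (S).
window = "A K V L T T P S G R S Q A S G E K L Q D F"
inputs = tokenizer(window, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)
print("P(phosphosite) =", probs[0, 1].item())  # meaningless until fine-tuned
```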
5. Flight Arrival Scheduling via Large Language Model.
- Author
-
Zhou, Wentao, Wang, Jinlin, Zhu, Longtao, Wang, Yi, and Ji, Yulong
- Subjects
LANGUAGE models ,AIR traffic control ,FLIGHT training ,AIR travel ,AIRPORTS - Abstract
The flight arrival scheduling problem is one of the critical tasks in air traffic operations, aiming to ensure that flights arrive safely and in the correct sequence. Existing methods primarily focus on the terminal area and often overlook the presence of training flights at the airport. Due to the limited generalization of traditional methods and varying control practices at different airports, training flights at airports still rely on manual control for arrival sorting. To effectively address these issues, we propose a novel method for slot allocation that leverages the strong reasoning capabilities and generalization potential of large language models (LLMs). Our method conceptualizes the dynamic scheduling problem for training flights as a language modeling problem, a perspective not previously explored. Specifically, we represent the allocator's inputs and outputs as language tokens, utilizing LLMs to generate conflict-free results based on a language description of requested landing information and assigned training flight information. Additionally, we employ a reset strategy to create a small dataset of scenario-specific samples, enabling LLMs to quickly learn allocation schemes from the dataset. We demonstrated the capability of LLMs in addressing time conflicts by evaluating metrics such as answer accuracy, conflict rate, and total delay time (excluding wrong answers). These findings underscore the feasibility of employing LLMs in the field of air traffic control. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
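The core move in the entry above, casting slot allocation as a language modeling problem, amounts to serializing the scheduling state and the desired assignment into plain text for supervised fine-tuning. The sketch below shows one hypothetical way such input/output pairs could be formatted; the field names and example flights are invented, not the paper's actual prompt template.

```python
# Hypothetical prompt/target serialization for fine-tuning an LLM on arrival
# slot allocation; field names and data are invented for illustration.
def build_example(requests, assigned_training_flights, solution):
    prompt_lines = ["Requested landings:"]
    prompt_lines += [
        f"- {r['callsign']}: requests {r['requested_time']}, type {r['type']}"
        for r in requests
    ]
    prompt_lines.append("Already assigned training flights:")
    prompt_lines += [
        f"- {t['callsign']}: slot {t['slot']}" for t in assigned_training_flights
    ]
    prompt_lines.append("Assign conflict-free landing slots:")
    target = "; ".join(f"{c} -> {slot}" for c, slot in solution.items())
    return {"prompt": "\n".join(prompt_lines), "target": target}

example = build_example(
    requests=[
        {"callsign": "TRN101", "requested_time": "09:02", "type": "training"},
        {"callsign": "CA4521", "requested_time": "09:03", "type": "commercial"},
    ],
    assigned_training_flights=[{"callsign": "TRN099", "slot": "09:00"}],
    solution={"CA4521": "09:03", "TRN101": "09:06"},
)
print(example["prompt"])
print("TARGET:", example["target"])
```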
6. Cephalo: Multi‐Modal Vision‐Language Models for Bio‐Inspired Materials Analysis and Design.
- Author
-
Buehler, Markus J.
- Subjects
- *
LANGUAGE models , *NATURAL language processing , *GENERATIVE artificial intelligence , *COMPUTER vision , *BIOMATERIALS - Abstract
Cephalo is presented as a series of multimodal vision large language models (V‐LLMs) designed for materials science applications, integrating visual and linguistic data for enhanced understanding. A key innovation of Cephalo is its advanced dataset generation method. Cephalo is trained on integrated image and text data from thousands of scientific papers and science‐focused Wikipedia data, and demonstrates that it can interpret complex visual scenes, generate precise language descriptions, and answer queries about images effectively. The combination of a vision encoder with an autoregressive transformer supports multimodal natural language understanding, which can be coupled with other generative methods to create an image‐to‐text‐to‐3D pipeline. To develop more capable models from smaller ones, both mixture‐of‐expert methods and model merging are reported. The models are examined in diverse use cases that incorporate biological materials, fracture and engineering analysis, protein biophysics, and bio‐inspired design based on insect behavior. Generative applications include bio‐inspired designs, such as pollen‐inspired architected materials, as well as the synthesis of bio‐inspired material microstructures from a photograph of a solar eclipse. Additional model fine‐tuning with a series of molecular dynamics results demonstrates Cephalo's enhanced capabilities to accurately predict statistical features of stress and atomic energy distributions, as well as crack dynamics and damage in materials. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Evaluating Quantized Llama 2 Models for IoT Privacy Policy Language Generation.
- Author
-
Malisetty, Bhavani and Perez, Alfredo J.
- Subjects
LANGUAGE models ,INTERNET of things ,SMART homes ,PERSONAL assistants ,LANGUAGE policy - Abstract
Quantized large language models are large language models (LLMs) optimized for model size while preserving their efficacy. They can be executed on consumer-grade computers without the powerful features of dedicated servers needed to execute regular (non-quantized) LLMs. Because of their ability to summarize, answer questions, and provide insights, LLMs are being used to analyze large texts/documents. One of these types of large texts/documents is Internet of Things (IoT) privacy policies, which are documents specifying how smart home gadgets, health-monitoring wearables, and personal voice assistants (among others) collect and manage consumer/user data on behalf of Internet companies providing services. Even though privacy policies are important, they are difficult to comprehend due to their length and how they are written, which makes them attractive for analysis using LLMs. This study evaluates how quantized LLMs are modeling the language of privacy policies to be potentially used to transform IoT privacy policies into simpler, more usable formats, thus aiding comprehension. While the long-term goal is to achieve this usable transformation, our work focuses on evaluating quantized LLM models used for IoT privacy policy language. Particularly, we study 4-bit, 5-bit, and 8-bit quantized versions of the large language model Meta AI version 2 (Llama 2) and the base Llama 2 model (zero-shot, without fine-tuning) under different metrics and prompts to determine how well these quantized versions model the language of IoT privacy policy documents by completing and generating privacy policy text. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
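One common way to run 4-, 5-, and 8-bit quantized Llama 2 variants like those discussed above on a consumer-grade machine is through llama.cpp and its Python bindings. The snippet below is a hedged sketch of that setup; the GGUF file name and the prompt are placeholders, and the study's own prompts and metrics are not reproduced.

```python
# Sketch: completing privacy-policy text with a locally quantized Llama 2 model
# via llama-cpp-python. The model path is a placeholder for a GGUF file you have
# downloaded (e.g., a 4-bit quantization of Llama-2-7B); not the paper's exact setup.
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-2-7b.Q4_K_M.gguf",  # placeholder quantized checkpoint
    n_ctx=2048,                              # context window for policy excerpts
)

prompt = (
    "The following is an excerpt from a smart-home device privacy policy.\n"
    "Continue the policy text:\n"
    "We collect usage data from your device in order to"
)
out = llm(prompt, max_tokens=128, temperature=0.7)
print(out["choices"][0]["text"])
```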
8. Transformer Model Applications: A Comprehensive Survey and Analysis
- Author
-
Mittal, Dolly, Pant, Ashish, Jajoo, Palika, Yadav, Veena, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Goar, Vishal, editor, Sharma, Aditi, editor, Shin, Jungpil, editor, and Mridha, M. Firoz, editor
- Published
- 2024
- Full Text
- View/download PDF
9. Enhancing LM’s Task Adaptability: Powerful Post-training Framework with Reinforcement Learning from Model Feedback
- Author
-
Rong, Fuju, Gao, Weihao, Deng, Zhuo, Gong, Zheng, Chen, Chucheng, Zhang, Wenze, Niu, Zhiyuan, Li, Fang, Ma, Lan, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Wand, Michael, editor, Malinovská, Kristína, editor, Schmidhuber, Jürgen, editor, and Tetko, Igor V., editor
- Published
- 2024
- Full Text
- View/download PDF
10. Pointer-Guided Pre-training: Infusing Large Language Models with Paragraph-Level Contextual Awareness
- Author
-
Hillebrand, Lars, Pradhan, Prabhupad, Bauckhage, Christian, Sifa, Rafet, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Bifet, Albert, editor, Davis, Jesse, editor, Krilavičius, Tomas, editor, Kull, Meelis, editor, Ntoutsi, Eirini, editor, and Žliobaitė, Indrė, editor
- Published
- 2024
- Full Text
- View/download PDF
11. On the Way to Controllable Text Summarization in Russian
- Author
-
Dremina, Alena, Tikhonova, Maria, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Ignatov, Dmitry I., editor, Khachay, Michael, editor, Kutuzov, Andrey, editor, Madoyan, Habet, editor, Makarov, Ilya, editor, Nikishina, Irina, editor, Panchenko, Alexander, editor, Panov, Maxim, editor, M. Pardalos, Panos, editor, Savchenko, Andrey V., editor, Tsymbalov, Evgenii, editor, Tutubalina, Elena, editor, and Zagoruyko, Sergey, editor
- Published
- 2024
- Full Text
- View/download PDF
12. Static, Dynamic, or Contextualized: What is the Best Approach for Discovering Semantic Shifts in Russian Media?
- Author
-
Nikonova, Veronika, Tikhonova, Maria, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Ignatov, Dmitry I., editor, Khachay, Michael, editor, Kutuzov, Andrey, editor, Madoyan, Habet, editor, Makarov, Ilya, editor, Nikishina, Irina, editor, Panchenko, Alexander, editor, Panov, Maxim, editor, Pardalos, Panos M., editor, Savchenko, Andrey V., editor, Tsymbalov, Evgenii, editor, Tutubalina, Elena, editor, and Zagoruyko, Sergey, editor
- Published
- 2024
- Full Text
- View/download PDF
13. Empathy-Driven Chatbots for the Arabic Language: A Transformer Based Approach
- Author
-
Rabii, Ismail, Boussakssou, Mohamed, Erritali, Mohammed, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Santosh, KC, editor, Makkar, Aaisha, editor, Conway, Myra, editor, Singh, Ashutosh K., editor, Vacavant, Antoine, editor, Abou el Kalam, Anas, editor, Bouguelia, Mohamed-Rafik, editor, and Hegadi, Ravindra, editor
- Published
- 2024
- Full Text
- View/download PDF
14. Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling
- Author
-
Kesgin, Himmet Toprak, Amasyali, Mehmet Fatih, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Ortis, Alessandro, editor, Hameed, Alaa Ali, editor, and Jamil, Akhtar, editor
- Published
- 2024
- Full Text
- View/download PDF
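Entry 14 carries no abstract here, but its title describes a recognizable technique: repeatedly masking words in a sentence and letting a masked language model propose replacements to create augmented variants. The sketch below is a generic illustration of that idea with an off-the-shelf BERT fill-mask pipeline, not the authors' specific algorithm or hyperparameters.

```python
# Generic iterative mask-filling augmentation sketch using a standard
# Hugging Face fill-mask pipeline; not the paper's exact method.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
mask = fill_mask.tokenizer.mask_token  # "[MASK]" for BERT

def augment(sentence, rounds=1):
    words = sentence.split()
    for _ in range(rounds):
        # One pass: mask each position in turn and keep the top replacement.
        # (A fuller implementation would handle subword pieces and sampling.)
        for i in range(len(words)):
            masked = " ".join(words[:i] + [mask] + words[i + 1:])
            best = fill_mask(masked)[0]          # highest-scoring candidate
            words[i] = best["token_str"].strip()
    return " ".join(words)

print(augment("the quick brown fox jumps over the lazy dog"))
```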
15. Pathological Liars: Algorithmic Knowing in the Rhetorical Ecosystem of Wallstreetbets.
- Author
-
Yang, Misti H. and Majdik, Zoltan P.
- Abstract
This essay demonstrates the value of using artificial intelligence (AI) technologies to address specific kinds of research questions in rhetoric. The essay builds on a study of a novel rhetorical object first observed by Yang on the Reddit subreddit r/wallstreetbets. We demonstrate how the rhetorical structure of "pathologics" (1) generated a kind of rhetorical authority that can be measured by higher-than-average user engagement on Reddit and (2) circulated from Reddit into more traditional legacy media. Through our research on the rhetorical circulation of pathologics, we argue that researching rhetoric with AI can center new ways of knowing about concepts relevant in rhetoric, like circulation and rhetorical ecosystems. Further, we argue that researching rhetoric with AI always also entails considering a "rhetoric of AI," requiring critical attention to the platforms, infrastructures, and data sources connected to AI systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Promptable Game Models: Text-guided Game Simulation via Masked Diffusion Models.
- Author
-
Menapace, Willi, Siarohin, Aliaksandr, Lathuilière, Stéphane, Achlioptas, Panos, Golyanik, Vladislav, Tulyakov, Sergey, and Ricci, Elisa
- Published
- 2024
- Full Text
- View/download PDF
17. Flight Arrival Scheduling via Large Language Model
- Author
-
Wentao Zhou, Jinlin Wang, Longtao Zhu, Yi Wang, and Yulong Ji
- Subjects
LLMs ,supervised fine-tuning ,arrival scheduling ,language modeling ,Motor vehicles. Aeronautics. Astronautics ,TL1-4050 - Abstract
The flight arrival scheduling problem is one of the critical tasks in air traffic operations, aiming to ensure that flights arrive safely and in the correct sequence. Existing methods primarily focus on the terminal area and often overlook the presence of training flights at the airport. Due to the limited generalization of traditional methods and varying control practices at different airports, training flights at airports still rely on manual control for arrival sorting. To effectively address these issues, we propose a novel method for slot allocation that leverages the strong reasoning capabilities and generalization potential of large language models (LLMs). Our method conceptualizes the dynamic scheduling problem for training flights as a language modeling problem, a perspective not previously explored. Specifically, we represent the allocator’s inputs and outputs as language tokens, utilizing LLMs to generate conflict-free results based on a language description of requested landing information and assigned training flight information. Additionally, we employ a reset strategy to create a small dataset of scenario-specific samples, enabling LLMs to quickly learn allocation schemes from the dataset. We demonstrated the capability of LLMs in addressing time conflicts by evaluating metrics such as answer accuracy, conflict rate, and total delay time (excluding wrong answers). These findings underscore the feasibility of employing LLMs in the field of air traffic control.
- Published
- 2024
- Full Text
- View/download PDF
18. Spoken Keyword Detection in Speech Processing using Error Rate Estimations.
- Author
-
Naga Sai Manish, Katakam Venkata, Prabhu, K., Arjun, Koppineni Harish, Kanth, Surya, and Aravinth, S. S.
- Subjects
NATURAL language processing ,SPEECH ,MACHINE learning ,SPEECH perception ,DEEP learning ,AUTOMATIC speech recognition ,ERROR rates - Abstract
Spoken keyword detection is a technique used to identify specific keywords or phrases in spoken language. It is often used in the field of speech recognition to trigger specific actions or responses. For example, a spoken keyword detection system might be used to activate a voice-controlled assistant or to initiate a search query. Keyword detection systems can be trained to recognize a wide range of keywords, depending on the specific needs of the application. They may use machine learning techniques to analyze the audio signal and identify patterns that correspond to specific keywords. [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. Improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling
- Author
-
Kavya Manohar, Jayan A R, and Rajeev Rajan
- Subjects
Subword tokens ,Language modeling ,Open vocabulary ,Speech recognition ,Morphological complexity ,Malayalam language ,Acoustics. Sound ,QC221-246 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
This article presents the research work on improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling. The speech recognition system is built using a deep neural network–hidden Markov model (DNN-HMM)-based automatic speech recognition (ASR). We propose a novel method, syllable-byte pair encoding (S-BPE), that combines linguistically informed syllable tokenization with the data-driven tokenization method of byte pair encoding (BPE). The proposed method ensures words are always segmented at valid pronunciation boundaries. On a text corpus that has been divided into tokens using the proposed method, we construct statistical n-gram language models and assess the modeling effectiveness in terms of both information-theoretic and corpus linguistic metrics. A comparative study of the proposed method with other data-driven (BPE, Morfessor, and Unigram), linguistic (Syllable), and baseline (Word) tokenization algorithms is also presented. Pronunciation lexicons of subword tokenized units are built with pronunciation described as graphemes. We develop ASR systems employing the subword tokenized language models and pronunciation lexicons. The resulting ASR models are comprehensively evaluated to answer the research questions regarding the impact of subword tokenization algorithms on language modeling complexity and on ASR performance. Our study highlights the strong performance of the hybrid S-BPE tokens, achieving a notable 10.6% word error rate (WER), which represents a substantial 16.8% improvement over the baseline word-level ASR system. The ablation study has revealed that the performance of S-BPE segmentation, which initially underperformed compared to syllable tokens with lower amounts of textual data for language modeling, exhibited steady improvement with the increase in LM training data. The extensive ablation study indicates that there is a limited advantage in raising the n-gram order of the language model beyond n = 3. Such an increase results in considerable model size growth without significant improvements in WER. The implementation of the algorithm and all associated experiments are available under an open license, allowing for reproduction, adaptation, and reuse.
- Published
- 2023
- Full Text
- View/download PDF
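The pipeline described in the entry above (subword tokenization followed by statistical n-gram language modeling) can be illustrated with a small count-based bigram model over pre-tokenized subword units. This is a generic sketch with an invented toy corpus and plain add-one smoothing; it does not implement the paper's S-BPE segmentation or its DNN-HMM ASR system.

```python
import math
from collections import Counter, defaultdict

# Toy "subword-tokenized" sentences (invented); a real system would produce
# these with S-BPE or another subword tokenizer, not by hand.
train = [
    ["<s>", "ka_", "vya", "_pa", "di", "</s>"],
    ["<s>", "ka_", "ru", "_pa", "di", "</s>"],
    ["<s>", "ma_", "ru", "_va", "di", "</s>"],
]

unigrams, bigrams = Counter(), defaultdict(Counter)
for sent in train:
    unigrams.update(sent)
    for a, b in zip(sent, sent[1:]):
        bigrams[a][b] += 1
vocab = len(unigrams)

def bigram_logprob(a, b):
    # Add-one (Laplace) smoothed conditional probability P(b | a).
    return math.log((bigrams[a][b] + 1) / (unigrams[a] + vocab))

def perplexity(sent):
    logp = sum(bigram_logprob(a, b) for a, b in zip(sent, sent[1:]))
    return math.exp(-logp / (len(sent) - 1))

test = ["<s>", "ka_", "ru", "_va", "di", "</s>"]
print("bigram perplexity:", round(perplexity(test), 2))
```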
20. Improvements in Language Modeling, Voice Activity Detection, and Lexicon in OpenASR21 Low Resource Languages
- Author
-
Gupta, Vishwa, Boulianne, Gilles, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Karpov, Alexey, editor, Samudravijaya, K., editor, Deepak, K. T., editor, Hegde, Rajesh M., editor, Agrawal, Shyam S., editor, and Prasanna, S. R. Mahadeva, editor
- Published
- 2023
- Full Text
- View/download PDF
21. Automated Text Generation and Summarization for Academic Writing
- Author
-
Benites, Fernando, Delorme Benites, Alice, Anson, Chris M., Kruse, Otto, editor, Rapp, Christian, editor, Anson, Chris M., editor, Benetos, Kalliopi, editor, Cotos, Elena, editor, Devitt, Ann, editor, and Shibani, Antonette, editor
- Published
- 2023
- Full Text
- View/download PDF
22. Adapting Code-Switching Language Models with Statistical-Based Text Augmentation
- Author
-
Prachaseree, Chaiyasait, Gupta, Kshitij, Ho, Thi Nga, Peng, Yizhou, Zin Tun, Kyaw, Chng, Eng Siong, Chalapthi, G. S. S., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Nguyen, Ngoc Thanh, editor, Boonsang, Siridech, editor, Fujita, Hamido, editor, Hnatkowska, Bogumiła, editor, Hong, Tzung-Pei, editor, Pasupa, Kitsuchart, editor, and Selamat, Ali, editor
- Published
- 2023
- Full Text
- View/download PDF
23. Recurrent Neural Networks
- Author
-
Jo, Taeho and Jo, Taeho
- Published
- 2023
- Full Text
- View/download PDF
24. A Statistical Approach for Extractive Hindi Text Summarization Using Machine Translation
- Author
-
Gupta, Pooja, Nigam, Swati, Singh, Rajiv, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Reddy, K. Ashoka, editor, Devi, B. Rama, editor, George, Boby, editor, Raju, K. Srujan, editor, and Sellathurai, Mathini, editor
- Published
- 2023
- Full Text
- View/download PDF
25. Initial Explorations on Chaotic Behaviors of Recurrent Neural Networks
- Author
-
Myrzakhmetov, Bagdat, Takhanov, Rustem, Assylbekov, Zhenisbek, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, and Gelbukh, Alexander, editor
- Published
- 2023
- Full Text
- View/download PDF
26. Multiplicative Models for Recurrent Language Modeling
- Author
-
Maupomé, Diego, Meurs, Marie-Jean, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, and Gelbukh, Alexander, editor
- Published
- 2023
- Full Text
- View/download PDF
27. Research on the Reform of the Teaching Mode of Rural English Education Assistance Based on the Technical Support of Network Technology
- Author
-
Su Zinan
- Subjects
speech recognition technology ,language modeling ,data embedding ,deep learning networks ,english education assistance ,97b60 ,Mathematics ,QA1-939 - Abstract
Against the background of developments in network technology, this paper aims to promote rural English teaching and constructs an English teaching model that combines English speech recognition technology with rural teaching. The main stages of speech recognition are examined by analyzing different speech recognition technologies, and an English speech recognition model is established using a deep learning network. Combined with the English acoustic features in the network data, the fluency of English speech is evaluated. Data embedding is performed on the English sequences in the network and combined with the sequence probabilities in the English data to determine whether the English speech is correct. According to the results, the Eval value for the deep learning-based English recognition model is 5.49%, while the test value is 5.89%. As the English dataset grows, the accuracy of the proposed recognition technique improves as well, remaining above 0.6; when the dataset size is 500, the speech recognition accuracy reaches 0.8. The teaching model that combines speech recognition techniques with English teaching improves students’ English to a certain extent.
- Published
- 2024
- Full Text
- View/download PDF
28. GAMBARAN PENGGUNAAN FLASHCARD DAN PEMODELAN BAHASA DALAM PENINGKATAN KEMAMPUAN BAHASA SISWA USIA DINI 5 - 6 TAHUN DENGAN SPECTRUM AUTISM DISORDER [An Overview of the Use of Flashcards and Language Modeling in Improving the Language Skills of Early Childhood Students Aged 5-6 with Autism Spectrum Disorder].
- Author
-
I Kbarek, Christina Makmeser and Yunitasari, Septiyani Endang
- Abstract
The aim of this study was to measure improvements in language skills in children aged five to six with autism spectrum disorder in an early childhood education (ECCE) school. Flash cards are used to teach children language. In addition, this study investigated language modeling in an effort to improve children's language skills. This study used observation and interviews as qualitative measures. The results showed that the use of flashcards and language modeling was very effective in improving the language skills of early childhood students with autism spectrum disorders. Flash cards serve as visual tools that help students remember words and understand their meaning. After students used flashcards frequently, their comprehension and use of new words improved significantly. Language modeling by proficient teachers and peers also helps students become better at speaking. This study also used the Zone of Proximal Development (ZPD) method to assess the use of flashcards and language modeling in students. These results have important consequences for the development of linguistic ability intervention techniques for students with autism spectrum disorders that allow them to play in ECCE environments to aid their language development. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. Contemporary Approaches in Evolving Language Models.
- Author
-
Oralbekova, Dina, Mamyrbayev, Orken, Othman, Mohamed, Kassymova, Dinara, and Mukhsina, Kuralai
- Subjects
LANGUAGE models ,NATURAL language processing ,HIDDEN Markov models ,TRANSFORMER models - Abstract
This article provides a comprehensive survey of contemporary language modeling approaches within the realm of natural language processing (NLP) tasks. This paper conducts an analytical exploration of diverse methodologies employed in the creation of language models. This exploration encompasses the architecture, training processes, and optimization strategies inherent in these models. The detailed discussion covers various models ranging from traditional n-gram and hidden Markov models to state-of-the-art neural network approaches such as BERT, GPT, LLAMA, and Bard. This article delves into different modifications and enhancements applied to both standard and neural network architectures for constructing language models. Special attention is given to addressing challenges specific to agglutinative languages within the context of developing language models for various NLP tasks, particularly for Arabic and Turkish. The research highlights that contemporary transformer-based methods demonstrate results comparable to those achieved by traditional methods employing Hidden Markov Models. These transformer-based approaches boast simpler configurations and exhibit faster performance during both training and analysis. An integral component of the article is the examination of popular and actively evolving libraries and tools essential for constructing language models. Notable tools such as NLTK, TensorFlow, PyTorch, and Gensim are reviewed, with a comparative analysis considering their simplicity and accessibility for implementing diverse language models. The aim is to provide readers with insights into the landscape of contemporary language modeling methodologies and the tools available for their implementation. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Transformer-Based Composite Language Models for Text Evaluation and Classification.
- Author
-
Škorić, Mihailo, Utvić, Miloš, and Stanković, Ranka
- Subjects
- *
LANGUAGE models , *MACHINE translating , *NATURAL language processing , *GENERATIVE pre-trained transformers , *TRANSFORMER models , *SERBIAN language , *ATTRIBUTION of authorship - Abstract
Parallel natural language processing systems were previously successfully tested on the tasks of part-of-speech tagging and authorship attribution through mini-language modeling, for which they achieved significantly better results than independent methods in the cases of seven European languages. The aim of this paper is to present the advantages of using composite language models in the processing and evaluation of texts written in arbitrary highly inflective and morphology-rich natural language, particularly Serbian. A perplexity-based dataset, the main asset for the methodology assessment, was created using a series of generative pre-trained transformers trained on different representations of the Serbian language corpus and a set of sentences classified into three groups (expert translations, corrupted translations, and machine translations). The paper describes a comparative analysis of calculated perplexities in order to measure the classification capability of different models on two binary classification tasks. In the course of the experiment, we tested three standalone language models (baseline) and two composite language models (which are based on perplexities outputted by all three standalone models). The presented results single out a complex stacked classifier using a multitude of features extracted from perplexity vectors as the optimal architecture of composite language models for both tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
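The perplexity features at the heart of the entry above can be computed with any causal language model by exponentiating its average token-level loss on a sentence. The sketch below does this with the public English gpt2 checkpoint purely to show the mechanics; the paper's Serbian transformer models, the three-model composite, and the stacked classifier are not reproduced here.

```python
# Sketch: sentence perplexity from a causal LM, the kind of score the paper
# feeds into its composite classifiers. Uses the public "gpt2" checkpoint as a
# stand-in for the Serbian models described in the entry.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])  # loss = mean token NLL
    return torch.exp(out.loss).item()

sentences = [
    "The translation reads naturally and preserves the original meaning.",
    "Translation the naturally meaning original reads and preserves the.",
]
# A fluent sentence should score a much lower perplexity than a corrupted one;
# vectors of such scores from several models can then feed a downstream classifier.
for s in sentences:
    print(round(perplexity(s), 1), s)
```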
31. Improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling.
- Author
-
Manohar, Kavya, A R, Jayan, and Rajan, Rajeev
- Subjects
AUTOMATIC speech recognition ,LANGUAGE models ,LINGUISTIC complexity ,COMPUTATIONAL linguistics ,SPEECH processing systems ,GRAPHEMICS ,CORPORA - Abstract
This article presents the research work on improving speech recognition systems for the morphologically complex Malayalam language using subword tokens for language modeling. The speech recognition system is built using a deep neural network–hidden Markov model (DNN-HMM)-based automatic speech recognition (ASR). We propose a novel method, syllable-byte pair encoding (S-BPE), that combines linguistically informed syllable tokenization with the data-driven tokenization method of byte pair encoding (BPE). The proposed method ensures words are always segmented at valid pronunciation boundaries. On a text corpus that has been divided into tokens using the proposed method, we construct statistical n-gram language models and assess the modeling effectiveness in terms of both information-theoretic and corpus linguistic metrics. A comparative study of the proposed method with other data-driven (BPE, Morfessor, and Unigram), linguistic (Syllable), and baseline (Word) tokenization algorithms is also presented. Pronunciation lexicons of subword tokenized units are built with pronunciation described as graphemes. We develop ASR systems employing the subword tokenized language models and pronunciation lexicons. The resulting ASR models are comprehensively evaluated to answer the research questions regarding the impact of subword tokenization algorithms on language modeling complexity and on ASR performance. Our study highlights the strong performance of the hybrid S-BPE tokens, achieving a notable 10.6% word error rate (WER), which represents a substantial 16.8% improvement over the baseline word-level ASR system. The ablation study has revealed that the performance of S-BPE segmentation, which initially underperformed compared to syllable tokens with lower amounts of textual data for language modeling, exhibited steady improvement with the increase in LM training data. The extensive ablation study indicates that there is a limited advantage in raising the n-gram order of the language model beyond n = 3 . Such an increase results in considerable model size growth without significant improvements in WER. The implementation of the algorithm and all associated experiments are available under an open license, allowing for reproduction, adaptation, and reuse. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
32. Flexible neural architectures for sequence modeling
- Author
-
Krause, Benjamin, Renals, Stephen, and Murray, Iain
- Subjects
006.3 ,language modeling ,multiplicative LSTM ,mLSTM ,dynamic evaluation ,sequence modeling - Abstract
Auto-regressive sequence models can estimate the distribution of any type of sequential data. To study sequence models, we consider the problem of language modeling, which entails predicting probability distributions over sequences of text. This thesis improves on previous language modeling approaches by giving models additional flexibility to adapt to their inputs. In particular, we focus on multiplicative LSTM (mLSTM), which has added flexibility to change its recurrent transition function depending on its input as compared with traditional LSTM, and dynamic evaluation, which helps LSTM (or other sequence models) adapt to the recent sequence history to exploit re-occurring patterns within a sequence. We find that using these adaptive approaches for language modeling improves their predictions by helping them recover from surprising tokens and sequences. mLSTM is a hybrid of a multiplicative recurrent neural network (mRNN) and an LSTM. mLSTM is characterized by its ability to have recurrent transition functions that can vary more for each possible input token, and makes better predictions as compared with LSTM after viewing unexpected inputs in our experiments. mLSTM also outperformed all previous neural architectures at character-level language modeling. Dynamic evaluation is a method for adapting sequence models to the recent sequence history at inference time using gradient descent, assigning higher probabilities to re-occurring sequential patterns. While dynamic evaluation was often previously viewed as a way of using additional training data, this thesis argues that dynamic evaluation is better thought of as a way of adapting probability distributions to their own predictions. We also explore and develop dynamic evaluation methods with the goals of achieving the best prediction performance and computational/memory efficiency, as well as understanding why these methods work. Different variants of dynamic evaluation are applied to a number of different architectures, resulting in improvements to language modeling over longer contexts, as well as polyphonic music prediction. Dynamically evaluated models are also able to generate conditional samples that repeat patterns from the conditioning text, and achieve improved generalization in modeling out-of-domain sequences. The added flexibility that dynamic evaluation gives models allows them to recover faster when predicting unexpected sequences. The proposed approaches improve on previous language models by giving them additional flexibility to adapt to their inputs. mLSTM and dynamic evaluation both contributed to improvements to the state of the art in language modeling, and have potential applications to a wider range of sequence modeling problems.
- Published
- 2020
- Full Text
- View/download PDF
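Dynamic evaluation as summarized in the entry above means continuing to update a trained model's weights with gradient descent on the test-time token stream it has just scored. The sketch below shows that loop on a deliberately tiny LSTM language model with random data, so only the adaptation mechanism is illustrated; the thesis's mLSTM architecture and its specific dynamic-evaluation variants are not reproduced.

```python
# Minimal dynamic-evaluation loop: score a segment, then take a gradient step
# on that same segment before moving on. Tiny random model/data for illustration.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, x):
        h, _ = self.lstm(self.emb(x))
        return self.out(h)

model = TinyLM()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)  # dynamic-eval step size
loss_fn = nn.CrossEntropyLoss()

stream = torch.randint(0, 100, (1, 201))  # stand-in for a held-out token stream
total_loss, steps = 0.0, 0
for start in range(0, stream.size(1) - 1, 20):        # 20-token evaluation segments
    seg = stream[:, start:start + 21]
    inp, tgt = seg[:, :-1], seg[:, 1:]
    logits = model(inp)
    loss = loss_fn(logits.reshape(-1, logits.size(-1)), tgt.reshape(-1))
    total_loss += loss.item()
    steps += 1
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # adapt to the segment just evaluated (re-occurring patterns)

print("average adapted per-token loss:", total_loss / steps)
```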
33. How do the kids speak? Improving educational use of text mining with child-directed language models
- Author
-
Organisciak, Peter, Newman, Michele, Eby, David, Acar, Selcuk, and Dumas, Denis
- Published
- 2023
- Full Text
- View/download PDF
34. Evaluating Quantized Llama 2 Models for IoT Privacy Policy Language Generation
- Author
-
Bhavani Malisetty and Alfredo J. Perez
- Subjects
large language models ,Internet of Things ,privacy policies ,language modeling ,quantized models ,usable privacy ,Information technology ,T58.5-58.64 - Abstract
Quantized large language models are large language models (LLMs) optimized for model size while preserving their efficacy. They can be executed on consumer-grade computers without the powerful features of dedicated servers needed to execute regular (non-quantized) LLMs. Because of their ability to summarize, answer questions, and provide insights, LLMs are being used to analyze large texts/documents. One of these types of large texts/documents is Internet of Things (IoT) privacy policies, which are documents specifying how smart home gadgets, health-monitoring wearables, and personal voice assistants (among others) collect and manage consumer/user data on behalf of Internet companies providing services. Even though privacy policies are important, they are difficult to comprehend due to their length and how they are written, which makes them attractive for analysis using LLMs. This study evaluates how quantized LLMs are modeling the language of privacy policies to be potentially used to transform IoT privacy policies into simpler, more usable formats, thus aiding comprehension. While the long-term goal is to achieve this usable transformation, our work focuses on evaluating quantized LLM models used for IoT privacy policy language. Particularly, we study 4-bit, 5-bit, and 8-bit quantized versions of the large language model Meta AI version 2 (Llama 2) and the base Llama 2 model (zero-shot, without fine-tuning) under different metrics and prompts to determine how well these quantized versions model the language of IoT privacy policy documents by completing and generating privacy policy text.
- Published
- 2024
- Full Text
- View/download PDF
35. BENCHMARKING DYNAMIC CONVOLUTIONAL NEURAL NETWORK WITH LANGUAGE MODELING PRE-TRAINING FOR SENTIMENT AND QUESTION CLASSIFICATION TASKS.
- Author
-
Ceylan, Ali Mert
- Subjects
CONVOLUTIONAL neural networks ,BENCHMARKING (Management) ,LANGUAGE models ,SENTIMENT analysis ,COMPUTER architecture - Abstract
One-dimensional convolutional models are used for various natural language processing tasks. This study revisits Dynamic Convolutional Neural Network (DCNN) architecture. The study investigates the effect of language modeling pre-training on Wikicorpus on published experiment results for DCNN. Therefore, the reference study integrates a top layer for the language-modeling training into DCNN. Also, benchmarks were reported for the original DCNN compared to the pre-trained language model version. The revisited model was then benchmarked for sentiment classification and question classification tasks. Benchmarks include transfer learning from pre-trained DCNN for language modeling and ground-up trained versions of DCNN on Stanford Sentiment Tree Bank and TREC Question Classification datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
36. Modeling Structure‐Building in the Brain With CCG Parsing and Large Language Models.
- Author
-
Stanojević, Miloš, Brennan, Jonathan R., Dunagan, Donald, Steedman, Mark, and Hale, John T.
- Subjects
- *
LANGUAGE models , *NEUROLINGUISTICS , *FUNCTIONAL magnetic resonance imaging , *EXPRESSIVE language , *TEMPORAL lobe - Abstract
To model behavioral and neural correlates of language comprehension in naturalistic environments, researchers have turned to broad‐coverage tools from natural‐language processing and machine learning. Where syntactic structure is explicitly modeled, prior work has relied predominantly on context‐free grammars (CFGs), yet such formalisms are not sufficiently expressive for human languages. Combinatory categorial grammars (CCGs) are sufficiently expressive directly compositional models of grammar with flexible constituency that affords incremental interpretation. In this work, we evaluate whether a more expressive CCG provides a better model than a CFG for human neural signals collected with functional magnetic resonance imaging (fMRI) while participants listen to an audiobook story. We further test between variants of CCG that differ in how they handle optional adjuncts. These evaluations are carried out against a baseline that includes estimates of next‐word predictability from a transformer neural network language model. Such a comparison reveals unique contributions of CCG structure‐building predominantly in the left posterior temporal lobe: CCG‐derived measures offer a superior fit to neural signals compared to those derived from a CFG. These effects are spatially distinct from bilateral superior temporal effects that are unique to predictability. Neural effects for structure‐building are thus separable from predictability during naturalistic listening, and those effects are best characterized by a grammar whose expressive power is motivated on independent linguistic grounds. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
37. On the Predictive Power of Neural Language Models for Human Real-Time Comprehension Behavior
- Author
-
Wilcox, Ethan G., Gauthier, Jon, Hu, Jennifer, Qian, Peng, and Levy, Roger P.
- Subjects
Language modeling ,real-time language comprehension ,Deep learning ,eye-tracking ,self-paced reading - Abstract
Human reading behavior is tuned to the statistics of natural language: the time it takes human subjects to read a word can be predicted from estimates of the word’s probability in context. However, it remains an open question what computational architecture best characterizes the expectations deployed in real time by humans that determine the behavioral signatures of reading. Here we test over two dozen models, independently manipulating computational architecture and training dataset size, on how well their next-word expectations predict human reading time behavior on naturalistic text corpora. Consistent with previous work, we find that across model architectures and training dataset sizes the relationship between word log-probability and reading time is (near-)linear. We next evaluate how features of these models determine their psychometric predictive power, or ability to predict human reading behavior. In general, the better a model’s next-word expectations (as measured by the traditional language modeling perplexity objective), the better its psychometric predictive power. However, we find nontrivial differences in psychometric predictive power across model architectures. For any given perplexity, deep Transformer models and n-gram models generally show superior psychometric predictive power over LSTM or structurally supervised neural models, especially for eye movement data. Finally, we compare models’ psychometric predictive power to the depth of their syntactic knowledge, as measured by a battery of syntactic generalization tests developed using methods from controlled psycholinguistic experiments. Once perplexity is controlled for, we find no significant relationship between syntactic knowledge and predictive power. These results suggest that, at least for the present state of natural language technology, different approaches may be required to best model human real-time language comprehension behavior in naturalistic reading versus behavior for controlled linguistic materials designed for targeted probing of syntactic knowledge.
- Published
- 2020
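The central quantity in the study above is word surprisal, the negative log probability of each word in context, which is then related (near-linearly) to reading times. The sketch below computes token-level surprisals from the public gpt2 checkpoint and fits a simple linear trend against made-up reading times; it illustrates the shape of the analysis only, not the paper's models, corpora, or regression setup.

```python
# Sketch: per-token surprisal from a causal LM, regressed against (invented)
# reading times. Illustrates the kind of analysis in the entry above only.
import numpy as np
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def surprisals(text):
    enc = tokenizer(text, return_tensors="pt")
    ids = enc["input_ids"][0]
    with torch.no_grad():
        logits = model(**enc).logits[0]
    logprobs = torch.log_softmax(logits, dim=-1)
    # Surprisal of token t is -log P(token_t | tokens_<t); the first token is skipped.
    s = [-logprobs[i - 1, ids[i]].item() for i in range(1, len(ids))]
    return tokenizer.convert_ids_to_tokens(ids.tolist())[1:], s

tokens, surp = surprisals("The cat sat quietly on the windowsill yesterday.")
reading_ms = np.array([250 + 20 * x + np.random.normal(0, 15) for x in surp])  # fake RTs
slope, intercept = np.polyfit(surp, reading_ms, 1)
print(f"fitted slope: {slope:.1f} ms per unit surprisal (nats)")
for tok, s in zip(tokens, surp):
    print(f"{tok!r:>15}  surprisal={s:.2f}")
```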
38. Boosting Item Coverage in Session-Based Recommendation
- Author
-
Anarfi, Richard, Sen, Amartya, Fletcher, Kenneth K., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Qingyang, Wang, editor, and Zhang, Liang-Jie, editor
- Published
- 2022
- Full Text
- View/download PDF
39. ELECTRA-KG: A Transformer-Knowledge Graph Recommender System
- Author
-
Kwapong, Benjamin, Sen, Amartya, Fletcher, Kenneth K., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Qingyang, Wang, editor, and Zhang, Liang-Jie, editor
- Published
- 2022
- Full Text
- View/download PDF
40. A Comparative Study of BERT-Based Attention Flows Versus Human Attentions on Fill-in-Blank Task
- Author
-
Qian, Ming, Lee, Ka Wai, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Chen, Jessie Y. C., editor, Fragomeni, Gino, editor, Degen, Helmut, editor, and Ntoa, Stavroula, editor
- Published
- 2022
- Full Text
- View/download PDF
41. Label2Label: A Language Modeling Framework for Multi-attribute Learning
- Author
-
Li, Wanhua, Cao, Zhexuan, Feng, Jianjiang, Zhou, Jie, Lu, Jiwen, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
42. Test Sample Selection for Handwriting Recognition Through Language Modeling
- Author
-
Rosello, Adrian, Ayllon, Eric, Valero-Mas, Jose J., Calvo-Zaragoza, Jorge, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Pinho, Armando J., editor, Georgieva, Petia, editor, Teixeira, Luís F., editor, and Sánchez, Joan Andreu, editor
- Published
- 2022
- Full Text
- View/download PDF
43. Natural Language Processing for Small Businesses and Future Trends in Healthcare
- Author
-
Jha, Saurav, Tiwari, Priyesh, Gupta, Shiv Narain, Gupta, Vivek, Kacprzyk, Janusz, Series Editor, Agrawal, Rajeev, editor, He, Jing, editor, Shubhakar Pilli, Emmanuel, editor, and Kumar, Sanjeev, editor
- Published
- 2022
- Full Text
- View/download PDF
44. Grapheme to Phoneme Conversion for Malayalam Speech Using Encoder-Decoder Architecture
- Author
-
Priyamvada, R., Govind, D., Menon, Vijay Krishna, Premjith, B., Soman, K. P., Howlett, Robert J., Series Editor, Jain, Lakhmi C., Series Editor, Satapathy, Suresh Chandra, editor, Peer, Peter, editor, Tang, Jinshan, editor, Bhateja, Vikrant, editor, and Ghosh, Anumoy, editor
- Published
- 2022
- Full Text
- View/download PDF
45. Adapting Automatic Speech Recognition to the Radiology Domain for a Less-Resourced Language: The Case of Latvian
- Author
-
Gruzitis, Normunds, Dargis, Roberts, Lasmanis, Viesturs Julijs, Garkaje, Ginta, Gosko, Didzis, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Nagar, Atulya K., editor, Jat, Dharm Singh, editor, Marín-Raventós, Gabriela, editor, and Mishra, Durgesh Kumar, editor
- Published
- 2022
- Full Text
- View/download PDF
46. Towards Many to Many Communication Among Blind, Deaf and Dumb Users
- Author
-
Chaithra, A. S., Athiya, Umme, Aishwarya, R., Rajesh, Aswathi, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Kumar, Amit, editor, Senatore, Sabrina, editor, and Gunjan, Vinit Kumar, editor
- Published
- 2022
- Full Text
- View/download PDF
47. Generating Correction Candidates for OCR Errors using BERT Language Model and FastText SubWord Embeddings
- Author
-
Hajiali, Mahdi, Fonseca Cacho, Jorge Ramón, Taghva, Kazem, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, and Arai, Kohei, editor
- Published
- 2022
- Full Text
- View/download PDF
48. Slovak Question Answering Dataset Based on the Machine Translation of the Squad V2.0.
- Author
-
Staš, Ján, Hládek, Daniel, and Koctúr, Tomáš
- Subjects
- *
MACHINE translating , *LANGUAGE models , *QUESTION answering systems , *NATURAL language processing - Abstract
This paper describes the process of building the first large-scale machine-translated question answering dataset SQuAD-sk for the Slovak language. The dataset was automatically translated from the original English SQuAD v2.0 using the Marian neural machine translation toolkit together with the Helsinki-NLP Opus English-Slovak model. Moreover, we proposed an effective approach for the approximate search of the translated answer in the translated paragraph based on measuring their similarity using their word vectors. In this way, we obtained more than 92% of the translated questions and answers from the original English dataset. We then used this machine-translated dataset to train the Slovak question answering system by fine-tuning monolingual and multilingual BERT-based language models. The scores achieved by the fine-tuned mBERT model, EM = 69.48% and F1 = 78.87%, are comparable to the question answering results of recently published machine-translated SQuAD datasets for other European languages. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
49. Fixed global memory for controllable long text generation.
- Author
-
Chen, Zheng and Liu, Zhejun
- Subjects
LANGUAGE models ,MEMORY ,TEXT recognition - Abstract
Long text generation is a challenging yet unsolved task. To generate long, coherent, and consistent text, existing approaches need to increase the language model length accordingly. However, the computational and memory cost grows as the square of the length. Even when trained with thousands of GPUs, the context length of language models is still limited to a few thousand tokens, which may cause the generation of longer texts to be inconsistent with the topics and ideas in their preceding texts. To address this, we propose a novel Transformer architecture called Transformer with Local and Global Memory (Transformer LGM). It is inspired by the way people write long articles: they form a key idea first and then let that idea guide the writing of the entire article. Such a "key idea" can be put into the fixed global memory of the Transformer LGM to guide the whole generation process. In contrast, the local memory, which is responsible for local coherence, can shift and drop content as the length of the generated text increases. We implement the global memory by introducing a negative positional embedding, while the traditional positive positional embedding is still used for the local memory. Experiments show that by utilizing the global memory, our model can generate long, coherent, and consistent text without enlarging the length of the language model. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
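The "negative positional embedding" trick summarized above can be sketched as giving a fixed block of global-memory slots position indices below zero while ordinary tokens keep the usual 0, 1, 2, ... indices; in an embedding table this is just an index offset. The snippet below is a schematic PyTorch rendering under those assumptions, not the authors' Transformer LGM implementation.

```python
# Schematic sketch: global-memory slots get "negative" positions, realized as an
# offset into a single positional-embedding table; local tokens use 0..L-1.
import torch
import torch.nn as nn

num_global, max_len, dim = 4, 16, 32

# Rows [0, num_global) stand for positions -num_global..-1 (the fixed global memory);
# rows [num_global, num_global + max_len) are ordinary positions 0..max_len-1.
pos_table = nn.Embedding(num_global + max_len, dim)
tok_emb = nn.Embedding(1000, dim)

global_ids = torch.randint(0, 1000, (1, num_global))   # "key idea" memory tokens
local_ids = torch.randint(0, 1000, (1, 10))             # the text being generated

global_pos = torch.arange(num_global).unsqueeze(0)                      # -4..-1, stored as 0..3
local_pos = torch.arange(local_ids.size(1)).unsqueeze(0) + num_global   # 0..9, offset by 4

x = torch.cat(
    [tok_emb(global_ids) + pos_table(global_pos),
     tok_emb(local_ids) + pos_table(local_pos)],
    dim=1,
)
print(x.shape)  # (1, num_global + 10, dim): memory slots prepended to the sequence
```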
50. Optimization integrated generative adversarial network for occluded text recognition with language modeling.
- Author
-
Selvaraj, Selvin Ebenezer and Tripuraribhatla, Raghuveera
- Subjects
GENERATIVE adversarial networks ,TEXT recognition ,PATTERN recognition systems ,MOVING average process - Abstract
Summary: Text recognition has attracted increased attention recently as a result of the complexity of natural settings and the variety of text instances. Various text or character recognition methods are introduced to distinguish the text from the natural scene, but existing methods struggle with distorted and highly curved text instances. Consequently, an effective method for occluded text or character detection from object‐background images is developed using the proposed elephant herding exponential sailfish optimizer‐based generative adversarial network (EHESFO‐based GAN). In order to build the proposed EHESFO, elephant herding optimization and the Exponential SailFish Optimizer (ESFO) are merged; ESFO itself was created by fusing the exponentially weighted moving average with the SailFish Optimizer. With the GAN, features extracted from the background and foreground of an image are efficiently used for image annotation and text recognition. The best features from the background and foreground images are extracted to create the optimal solution, which increases the efficacy and efficiency of text recognition. At an occlusion level of 0.4, the proposed EHESFO‐based GAN achieved a higher accuracy of 98.1090% and a lower error of 1.4%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF