"Language Modeling" / Publisher: wiley-blackwell - Searchworks@Jio Institute Digital Library Search Results

1. Robust language modeling for a small corpus of target tasks using class-combined word statistics and selective use of a general corpus.

Author: Wada, Yosuke, Kobayashi, Norihiko, and Kobayashi, Tetsunori
Subjects: LANGUAGE & languages, ANTHROPOLOGY, COMMUNICATION, ETHNOLOGY, COMPARATIVE grammar, INFORMATION theory
Abstract: In order to improve the accuracy of language models in speech recognition tasks for which collecting a large text corpus for language model training is difficult, we propose a class-combined bigram and selective use of general text. In the class-combined bigram, the word bigram and the class bigram are combined using weights that are expressed as the functions of the preceding word frequency and the succeeding word-type count. An experiment has shown that the accuracy of the proposed class-combined bigram is equivalent to that of the word bigram trained with a text corpus that is approximately three times larger. In the selective use of general text, the language model was corrected by automatically selecting sentences that were expected to produce better accuracy from a large volume of text collected without specifying the task and by adding these sentences to a small corpus of target tasks. An experiment has shown that the recognition error rate was reduced by up to 12% compared to a case in which text was not selected. Lastly, when we created a model that uses both the class-combined bigram and text addition, further improvements were obtained, resulting in improvements of approximately 34% in adjusted perplexity and approximately 31% in the recognition error rate compared to the word bigram created from the target task text only. © 2003 Wiley Periodicals, Inc. Syst Comp Jpn, 34(12): 92–102, 2003; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.1219 [ABSTRACT FROM AUTHOR]
Published: 2003
Full Text: View/download PDF

2. Language modeling by stochastic dependency grammar for Japanese speech recognition.

Author: Ito, Akinori, Hori, Chiori, Katoh, Masaharu, and Kohda, Masaki
Subjects: GRAMMAR, SPEECH perception, SYNTAX (Grammar), ALGORITHMS, LEARNING, JAPANESE language
Abstract: This article has been removed by the publisher. [ABSTRACT FROM AUTHOR]
Published: 2001
Full Text: View/download PDF

3. Optimization integrated generative adversarial network for occluded text recognition with language modeling.

Author: Selvaraj, Selvin Ebenezer and Tripuraribhatla, Raghuveera
Subjects: GENERATIVE adversarial networks, TEXT recognition, PATTERN recognition systems, MOVING average process
Abstract: Summary: Text recognition has attracted increased attention recently as a result of the complexity of natural settings and the variety of text instances. Various text or character recognition methods are introduced to distinguish the text from the natural scene, but existing methods struggle with the distorted and highly curved text instances. Consequently, an effective method for occluded text or character detection from object‐background images is developed using the suggested elephant herding exponential sailfish optimizer‐based generative adversarial network (EHESFO‐based GAN). In order to build the proposed EHESFO, elephant herding optimization and Exponential SailFish Optimizer (ESFO) are merged. ESFO was created by fusing exponentially weighted moving average and SailFish Optimizer. With GAN, features extracted from the background and foreground of an image are efficiently used for image annotation and text recognition. The best features from the background and foreground images are extracted to create the optimal solution, which increases the efficacy and efficiency of text recognition. While taking the occlusion as 0.4, the proposed EHESFO‐based GAN achieved higher accuracy of 98.1090% and lower error of 1.4%. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

4. Language-modeling kernel based approach for information retrieval.

Author: Ying Xie and Raghavan, Vijay V.
Subjects: INFORMATION resources management, INFORMATION science, INFORMATION services, INFORMATION retrieval, ACCESS to information, SEARCH engines, INFORMATION-seeking strategies, INFORMATION filtering, FUNCTIONAL analysis, MACHINE learning
Abstract: In this presentation, we propose a novel integrated information retrieval approach that provides a unified solution for two challenging problems in the field of information retrieval. The first problem is how to build an optimal vector space corresponding to users' different information needs when applying the vector space model. The second one is how to smoothly incorporate the advantages of machine learning techniques into the language modeling approach. To solve these problems, we designed the language-modeling kernel function, which has all the modeling powers provided by language modeling techniques. In addition, for each information need, this kernel function automatically determines an optimal vector space, for which a discriminative learning machine, such as the support vector machine, can be applied to find an optimal decision boundary between relevant and nonrelevant documents. Large-scale experiments on standard test-beds show that our approach makes significant improvements over other state-of-the-art information retrieval methods. [ABSTRACT FROM AUTHOR]
Published: 2007
Full Text: View/download PDF

5. Enhancing information retrieval through concept-based language modeling and semantic smoothing.

Author: Said Lhadj, Lynda, Boughanem, Mohand, and Amrouche, Karima
Subjects: *EXPERIMENTAL design, *INFORMATION retrieval, *PRESS, *SEMANTICS, *STATISTICAL models
Abstract: Traditionally, many information retrieval models assume that terms occur in documents independently. Although these models have already shown good performance, the word independency assumption seems to be unrealistic from a natural language point of view, which considers that terms are related to each other. Therefore, such an assumption leads to two well-known problems in information retrieval ( IR), namely, polysemy, or term mismatch, and synonymy. In language models, these issues have been addressed by considering dependencies such as bigrams, phrasal-concepts, or word relationships, but such models are estimated using simple n-grams or concept counting. In this paper, we address polysemy and synonymy mismatch with a concept-based language modeling approach that combines ontological concepts from external resources with frequently found collocations from the document collection. In addition, the concept-based model is enriched with subconcepts and semantic relationships through a semantic smoothing technique so as to perform semantic matching. Experiments carried out on TREC collections show that our model achieves significant improvements over a single word-based model and the Markov Random Field model (using a Markov classifier). [ABSTRACT FROM AUTHOR]
Published: 2016
Full Text: View/download PDF

6. EXPLOITING SYNTACTIC, SEMANTIC, AND LEXICAL REGULARITIES IN LANGUAGE MODELING VIA DIRECTED MARKOV RANDOM FIELDS.

Author: Wang, Shaojun, Wang, Shaomin, Cheng, Li, Greiner, Russell, and Schuurmans, Dale
Subjects: *MARKOV processes, *MARKOV random fields, *SEMANTICS, *SENTENCES (Grammar), *ALGORITHMS, *PROBABILITY theory
Abstract: We present a directed Markov random field (MRF) model that combines w-gram models, probabilistic context-free grammars (PCFGs), and probabilistic latent semantic analysis (PLSA) for the purpose of statistical language modeling. Even though the composite directed MRF model potentially has an exponential number of loops and becomes a context-sensitive grammar, we are nevertheless able to estimate its parameters in cubic time using an efficient modified Expectation-Maximization (EM) method, the generalized inside-outside algorithm, which extends the inside-outside algorithm to incorporate the effects of the n-gram and PLSA language models. We generalize various smoothing techniques to alleviate the sparseness of w-gram counts in cases where there are hidden variables. We also derive an analogous algorithm to find the most likely parse of a sentence and to calculate the probability of initial subsequence of a sentence, all generated by the composite language model. Our experimental results on the Wall Street Journal corpus show that we obtain significant reductions in perplexity compared to the state-of-the-art baseline trigram model with Good-Turing and Kneser-Ney smoothing techniques. [ABSTRACT FROM AUTHOR]
Published: 2013
Full Text: View/download PDF

7. Learning to rank using smoothing methods for language modeling.

Author: Lin, Yuan, Lin, Hongfei, Xu, Kan, and Sun, Xiaoling
Subjects: *EXPERIMENTAL design, *INFORMATION retrieval, *LANGUAGE & languages, *RESEARCH funding, *STATISTICAL models
Abstract: The central issue in language model estimation is smoothing, which is a technique for avoiding zero probability estimation problem and overcoming data sparsity. There are three representative smoothing methods: Jelinek- Mercer ( JM) method; Bayesian smoothing using Dirichlet priors ( Dir) method; and absolute discounting ( Dis) method, whose parameters are usually estimated empirically. Previous research in information retrieval ( IR) on smoothing parameter estimation tends to select a single value from optional values for the collection, but it may not be appropriate for all the queries. The effectiveness of all the optional values should be considered to improve the ranking performance. Recently, learning to rank has become an effective approach to optimize the ranking accuracy by merging the existing retrieval methods. In this article, the smoothing methods for language modeling in information retrieval ( LMIR) with different parameters are treated as different retrieval methods, then a learning to rank approach to learn a ranking model based on the features extracted by smoothing methods is presented. In the process of learning, the effectiveness of all the optional smoothing parameters is taken into account for all queries. The experimental results on the Learning to Rank for Information Retrieval ( LETOR) LETOR3.0 and LETOR4.0 data sets show that our approach is effective in improving the performance of LMIR. [ABSTRACT FROM AUTHOR]
Published: 2013
Full Text: View/download PDF

8. Using Morphological Data in Language Modeling for Serbian Large Vocabulary Speech Recognition.

Author: Pakoci, Edvin, Popović, Branislav, and Pekar, Darko
Subjects: AUTOMATIC speech recognition, ARTIFICIAL neural networks, SERBIAN language, VOCABULARY, SUFFIXES & prefixes (Grammar)
Abstract: Serbian is in a group of highly inflective and morphologically rich languages that use a lot of different word suffixes to express different grammatical, syntactic, or semantic features. This kind of behaviour usually produces a lot of recognition errors, especially in large vocabulary systems—even when, due to good acoustical matching, the correct lemma is predicted by the automatic speech recognition system, often a wrong word ending occurs, which is nevertheless counted as an error. This effect is larger for contexts not present in the language model training corpus. In this manuscript, an approach which takes into account different morphological categories of words for language modeling is examined, and the benefits in terms of word error rates and perplexities are presented. These categories include word type, word case, grammatical number, and gender, and they were all assigned to words in the system vocabulary, where applicable. These additional word features helped to produce significant improvements in relation to the baseline system, both for n-gram-based and neural network-based language models. The proposed system can help overcome a lot of tedious errors in a large vocabulary system, for example, for dictation, both for Serbian and for other languages with similar characteristics. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

11. Erratum: Language modeling by stochastic dependency grammer for Japanese speech recognition<FNR></FNR><FN>The following article was incorrectly published in Systems and Computers in Japan, Vol. 32, No. 12, November 15, 2001, pages 10–15 </FN>.

Author: Ito, Akinori, Hori, Chiori, Katoh, Masaharu, and Kohda, Masaki
Subjects: AUTOMATIC speech recognition
Abstract: Presents the title of the correct article that should have been published in place of "Language Modeling by Stochastic Dependency Grammar for Japanese Speech Recognition".
Published: 2002
Full Text: View/download PDF

12. Digital Simulation of Superconductive Memory System Based on Hardware Description Language Modeling.

Author: Narendran, S. and Selvakumar, J.
Subjects: DIGITAL computer simulation, COMPUTER hardware description languages, COMPUTER storage devices, METAL oxide semiconductors, STATIC random access memory, JOSEPHSON junctions
Abstract: We have modeled a memory system using Josephson Junction to attain low power consumption using low input voltage compared to conventional Complementary Metal Oxide Semiconductor-Static Random Access Memory (CMOS-SRAM). We attained the low power by connecting a shared/common bit line and using a 1-bit memory cell. Through our design we may attain 2.5–3.5 microwatts of power using lower input voltage of 0.6 millivolts. Comparative study has been made to find which memory system will attain low power consumption. Conventional SRAM techniques consume power in the range of milliwatts with the supply input in the range of 0-10 volts. Using HDL language, we made a memory logic design of RAM cells using Josephson Junction in FreeHDL software which is dedicated only for Josephson Junction based design. With use of XILINX, we have calculated the power consumption and equivalent Register Transfer Level (RTL) schematic is drawn. [ABSTRACT FROM AUTHOR]
Published: 2018
Full Text: View/download PDF

15. Statistical language modeling for information retrieval.

Author: Liu, Xiaoyong and Croft, W. Bruce
Published: 2005
Full Text: View/download PDF

16. Knowledge Graph-Based Hierarchical Text Semantic Representation.

Author: Wu, Yongliang, Pan, Xiao, Li, Jinghui, Dou, Shimao, Dong, Jiahao, and Wei, Dan
Subjects: KNOWLEDGE graphs, KNOWLEDGE representation (Information theory), SYNTAX (Grammar), NATURAL languages, TASK performance
Abstract: Document representation is the basis of language modeling. Its goal is to turn natural language text that flows into a structured form that can be stored and processed by a computer. The bag-of-words model is used by most of the text-representation methods that are currently available. And yet, they do not consider how phrases are used in the text, which hurts the performance of tasks that use natural language processing later on. Representing the meaning of text by phrases is a promising area of future research, but it is hard to do well because phrases are organized in a hierarchy and mining efficiency is low. In this paper, we put forward a method called hierarchical text semantic representation using the knowledge graph (HTSRKG), which uses syntactic structure features to find hierarchical phrases and knowledge graphs to improve how phrases are evaluated. First, we use CKY and PCFG to build the syntax tree sentence by sentence. Second, we walk through the parse tree using the hierarchical routing process to obtain the mixed phrase semantics in passages. Finally, the introduction of the knowledge graph improves the efficiency of text semantic extraction and the accuracy of text representation. This gives us a solid foundation for tasks involving natural language processing that come after. Extensive testing on actual datasets shows that HTSRKG surpasses baseline approaches with respect to text semantic representation, and the results of a recent benchmarking study support this. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

17. ChatENT: Augmented Large Language Model for Expert Knowledge Retrieval in Otolaryngology–Head and Neck Surgery.

Author: Long, Cai, Subburam, Deepak, Lowe, Kayle, dos Santos, André, Zhang, Jessica, Hwang, Sang, Saduka, Neil, Horev, Yoav, Su, Tao, Côté, David W.J., and Wright, Erin D.
Abstract: Objective: The recent surge in popularity of large language models (LLMs), such as ChatGPT, has showcased their proficiency in medical examinations and potential applications in health care. However, LLMs possess inherent limitations, including inconsistent accuracy, specific prompting requirements, and the risk of generating harmful hallucinations. A domain‐specific model might address these limitations effectively. Study Design: Developmental design. Setting: Virtual. Methods: Otolaryngology–head and neck surgery (OHNS) relevant data were systematically gathered from open‐access Internet sources and indexed into a knowledge database. We leveraged Retrieval‐Augmented Language Modeling to recall this information and utilized it for pretraining, which was then integrated into ChatGPT4.0, creating an OHNS‐specific knowledge question & answer platform known as ChatENT. The model is further tested on different types of questions. Results: ChatENT showed enhanced performance in the analysis and interpretation of OHNS information, outperforming ChatGPT4.0 in both the Canadian Royal College OHNS sample examination questions challenge and the US board practice questions challenge, with a 58.4% and 26.0% error reduction, respectively. ChatENT generated fewer hallucinations and demonstrated greater consistency. Conclusion: To the best of our knowledge, ChatENT is the first specialty‐specific knowledge retrieval artificial intelligence in the medical field that utilizes the latest LLM. It appears to have considerable promise in areas such as medical education, patient education, and clinical decision support. The model has demonstrated the capacity to overcome the limitations of existing LLMs, thereby signaling a future of more precise, safe, and user‐friendly applications in the realm of OHNS and other medical fields. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

18. Assessing the impact of minor modifications on the interior structure of GRU: GRU1 and GRU2.

Author: Yigit, Gulsum and Amasyali, Mehmet Fatih
Subjects: QUESTION answering systems, SENTIMENT analysis, LEARNING strategies, RECURRENT neural networks
Abstract: In this study, two GRU variants named GRU1 and GRU2 are proposed by employing simple changes to the internal structure of the standard GRU, which is one of the popular RNN variants. Comparative experiments are conducted on four problems: language modeling, question answering, addition task, and sentiment analysis. Moreover, in the addition task, curriculum learning and anti‐curriculum learning strategies, which extend the training data having examples from easy to hard or from hard to easy, are comparatively evaluated. Accordingly, the GRU1 and GRU2 variants outperformed the standard GRU. In addition, the curriculum learning approach, in which the training data is expanded from easy to difficult, improves the performance considerably. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

19. Never‐Ending Learning for Explainable Brain Computing.

Author: Kuai, Hongzhi, Chen, Jianhui, Tao, Xiaohui, Cai, Lingyun, Imamura, Kazuyuki, Matsumoto, Hiroki, Liang, Peipeng, and Zhong, Ning
Subjects: *COGNITIVE neuroscience, *SOCIAL interaction, *LEARNING, *FUNCTIONAL analysis, *COGNITION
Abstract: Exploring the nature of human intelligence and behavior is a longstanding pursuit in cognitive neuroscience, driven by the accumulation of knowledge, information, and data across various studies. However, achieving a unified and transparent interpretation of findings presents formidable challenges. In response, an explainable brain computing framework is proposed that employs the never‐ending learning paradigm, integrating evidence combination and fusion computing within a Knowledge‐Information‐Data (KID) architecture. The framework supports continuous brain cognition investigation, utilizing joint knowledge‐driven forward inference and data‐driven reverse inference, bolstered by the pre‐trained language modeling techniques and the human‐in‐the‐loop mechanisms. In particular, it incorporates internal evidence learning through multi‐task functional neuroimaging analyses and external evidence learning via topic modeling of published neuroimaging studies, all of which involve human interactions at different stages. Based on two case studies, the intricate uncertainty surrounding brain localization in human reasoning is revealed. The present study also highlights the potential of systematization to advance explainable brain computing, offering a finer‐grained understanding of brain activity patterns related to human intelligence. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

20. Word Forms Reflect Trade-Offs Between Speaker Effort and Robust Listener Recognition.

Author: Meylan SC and Griffiths TL
Subjects: Humans, Speech, Speech Perception, Language, Phonetics, Recognition, Psychology
Abstract: How do cognitive pressures shape the lexicons of natural languages? Here, we reframe George Kingsley Zipf's proposed "law of abbreviation" within a more general framework that relates it to cognitive pressures that affect speakers and listeners. In this new framework, speakers' drive to reduce effort (Zipf's proposal) is counteracted by the need for low-frequency words to have word forms that are sufficiently distinctive to allow for accurate recognition by listeners. To support this framework, we replicate and extend recent work using the prevalence of subword phonemic sequences (phonotactic probability) to measure speakers' production effort in place of Zipf's measure of length. Across languages and corpora, phonotactic probability is more strongly correlated with word frequency than word length. We also show this measure of ease of speech production (phonotactic probability) is strongly correlated with a measure of perceptual difficulty that indexes the degree of competition from alternative interpretations in word recognition. This is consistent with the claim that there must be trade-offs between these two factors, and is inconsistent with a recent proposal that phonotactic probability facilitates both perception and production. To our knowledge, this is the first work to offer an explanation why long, phonotactically improbable word forms remain in the lexicons of natural languages., (© 2024 The Author(s). Cognitive Science published by Wiley Periodicals LLC on behalf of Cognitive Science Society (CSS).)
Published: 2024
Full Text: View/download PDF

21. Software Systems Security Vulnerabilities Management by Exploring the Capabilities of Language Models Using NLP.

Author: Althar, Raghavendra Rao, Samanta, Debabrata, Kaur, Manjit, Alnuaim, Abeer Ali, Aljaffan, Nouf, and Aman Ullah, Mohammad
Subjects: COMPUTER software security, SECURITY management, SECURITY systems software, SOFTWARE engineering, COMPUTER software development
Abstract: Security of the software system is a prime focus area for software development teams. This paper explores some data science methods to build a knowledge management system that can assist the software development team to ensure a secure software system is being developed. Various approaches in this context are explored using data of insurance domain-based software development. These approaches will facilitate an easy understanding of the practical challenges associated with actual-world implementation. This paper also discusses the capabilities of language modeling and its role in the knowledge system. The source code is modeled to build a deep software security analysis model. The proposed model can help software engineers build secure software by assessing the software security during software development time. Extensive experiments show that the proposed models can efficiently explore the software language modeling capabilities to classify software systems' security vulnerabilities. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

22. Modeling Structure‐Building in the Brain With CCG Parsing and Large Language Models.

Author: Stanojević, Miloš, Brennan, Jonathan R., Dunagan, Donald, Steedman, Mark, and Hale, John T.
Subjects: *LANGUAGE models, *NEUROLINGUISTICS, *FUNCTIONAL magnetic resonance imaging, *EXPRESSIVE language, *TEMPORAL lobe
Abstract: To model behavioral and neural correlates of language comprehension in naturalistic environments, researchers have turned to broad‐coverage tools from natural‐language processing and machine learning. Where syntactic structure is explicitly modeled, prior work has relied predominantly on context‐free grammars (CFGs), yet such formalisms are not sufficiently expressive for human languages. Combinatory categorial grammars (CCGs) are sufficiently expressive directly compositional models of grammar with flexible constituency that affords incremental interpretation. In this work, we evaluate whether a more expressive CCG provides a better model than a CFG for human neural signals collected with functional magnetic resonance imaging (fMRI) while participants listen to an audiobook story. We further test between variants of CCG that differ in how they handle optional adjuncts. These evaluations are carried out against a baseline that includes estimates of next‐word predictability from a transformer neural network language model. Such a comparison reveals unique contributions of CCG structure‐building predominantly in the left posterior temporal lobe: CCG‐derived measures offer a superior fit to neural signals compared to those derived from a CFG. These effects are spatially distinct from bilateral superior temporal effects that are unique to predictability. Neural effects for structure‐building are thus separable from predictability during naturalistic listening, and those effects are best characterized by a grammar whose expressive power is motivated on independent linguistic grounds. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

23. Holistic Evaluation of Language Models.

Author: Bommasani, Rishi, Liang, Percy, and Lee, Tony
Subjects: LANGUAGE models, NATURAL language processing, CHATGPT
Abstract: Language models (LMs) like GPT‐3, PaLM, and ChatGPT are the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of LMs. LMs can serve many purposes and their behavior should satisfy many desiderata. To navigate the vast space of potential scenarios and metrics, we taxonomize the space and select representative subsets. We evaluate models on 16 core scenarios and 7 metrics, exposing important trade‐offs. We supplement our core evaluation with seven targeted evaluations to deeply analyze specific aspects (including world knowledge, reasoning, regurgitation of copyrighted content, and generation of disinformation). We benchmark 30 LMs, from OpenAI, Microsoft, Google, Meta, Cohere, AI21 Labs, and others. Prior to HELM, models were evaluated on just 17.9% of the core HELM scenarios, with some prominent models not sharing a single scenario in common. We improve this to 96.0%: all 30 models are now benchmarked under the same standardized conditions. Our evaluation surfaces 25 top‐level findings. For full transparency, we release all raw model prompts and completions publicly. HELM is a living benchmark for the community, continuously updated with new scenarios, metrics, and models https://crfm.stanford.edu/helm/latest/. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

24. Evaluation of Interpretable Speech Biomarkers for Monitoring Alzheimer's Disease and Mild Cognitive Impairment Progression.

Author: Favaro, Anna, Dehak, Najim, Thebaud, Thomas, Oh, Esther S, and Moro‐Velazquez, Laureano
Abstract: Background: Alzheimer's disease (AD) is a progressive neurodegenerative disease. Like for other dementias, biomarkers may help characterize distinct aspects of the underlying pathology, predict decline, and monitor disease progression. We extracted a composite array of 24 speech‐based biomarkers using machine learning and signal processing techniques. We then determined which of the given biomarkers was predictive of MCI and AD progression from baseline by conducting a correlation analysis. Method: Spoken responses to a Cookie Theft Picture description task were recorded from 2 subjects with AD, 4 subjects with Mild Cognitive Impairment (MCI) due to AD (biomarker confirmed AD), and 4 subjects with MCI. We automatically extracted acoustic, linguistic, and cognitive biomarkers using speech recognition, acoustic, and language modeling techniques. We later computed Kendall's tau‐b (τb) correlation between the biomarker values and Montreal Cognitive Assessment (MoCA) scores. Recordings and MoCA scores were collected at baseline, at 6 months after baseline, and at 12 months for a subgroup of the subjects. To perform the correlation analysis, we used scipy.stats.kendalltau library in Python. Result: With respect to the acoustic biomarkers, a strong negative correlation (τb= ‐0.50) was observed between the MoCA scores and features such as pause time, pause percentage, and pause speech ratio, while a moderate positive correlation (τb= 0.2) was found for rhythm standard deviation. With respect to the linguistic biomarkers a strong positive correlation(τb= 0.50) for the features of corrected type‐token ratio, average sentence length in words, and noun count was reported, while a moderate positive correlation (τb= 0.27) was found for the features of word type count, word token count, and moving average type‐token ratio. The other acoustic, linguistic, and cognitive biomarkers showed a very weak (τb= 0.00 ‐ 0.10) or weak correlation (τb= 0.10 ‐ 0.20). Conclusion: Altogether, these results suggest that speech‐based interpretable biomarkers may help clinicians to diagnose AD at earlier stages and monitor disease progression. Our preliminary data suggest that AD patients encounter more problems delivering longer and linguistically elaborated narratives as the disease progresses. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

25. Using Active Learning Methods to Strategically Select Essays for Automated Scoring.

Author: Firoozi, Tahereh, Mohammadi, Hamid, and Gierl, Mark J.
Subjects: *ACTIVE learning, *LANGUAGE models
Abstract: Research on Automated Essay Scoring has become increasing important because it serves as a method for evaluating students' written responses at scale. Scalable methods for scoring written responses are needed as students migrate to online learning environments resulting in the need to evaluate large numbers of written‐response assessments. The purpose of this study is to describe and evaluate three active learning methods that can be used to minimize the number of essays that must be scored by human raters while still providing the data needed to train a modern Automated Essay Scoring system. The three active learning methods are the uncertainty‐based, the topological‐based, and the hybrid method. These three methods were used to select essays included in the Automated Student Assessment Prize competition that were then classified using a scoring model that was trained with the bidirectional encoder representations from a transformer language model. All three active learning methods produced strong results, with the topological‐based method producing the most efficient classification. Growth rate accuracy was also evaluated. The active learning methods produced different levels of efficiency under different sample size allocations but, overall, all three methods were highly efficient and produced classifications that were similar to one another. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

26. DMA‐Net: A dual branch encoder and multi‐scale cross attention fusion network for skin lesion segmentation.

Author: Zhai, Guangyao, Wang, Guanglei, Shang, Qinghua, Li, Yan, and Wang, Hongrui
Abstract: Automatic segmentation of skin lesion is an important step in computer‐aided diagnosis. However, due to the significant variations in the size and shape of the lesion areas, as well as the low contrast with normal skin tissue, the boundaries are not clearly distinguishable, leading to a high possibility of incorrect segmentation. Therefore, this task is highly challenging. To overcome these difficulties, this paper proposes a medical image segmentation architecture named dual branch encoder and multi‐scale cross attention fusion network, which includes a dual‐branch encoder based on convolutional neural network and an improved channel‐enhanced Mamba to comprehensively extract local and global information from dermoscopy images. Additionally, to enhance the feature interaction and fusion of local and global information, a multi‐scale cross attention fusion module is adopted to cross‐merge features in different directions and at different scales, maximizing the advantages of the dual‐branch encoder and achieving precise segmentation of skin lesions. Extensive experiments are conducted on three public skin lesion datasets: ISIC‐2018, ISIC‐2017, and ISIC‐2016, to verify the effectiveness and superiority of the proposed method. The dice similarity coefficient scores on the three datasets reached 81.77%, 81.68% and 85.60%, respectively, surpassing most state‐of‐the‐art methods. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. A task‐centric knowledge graph construction method based on multi‐modal representation learning for industrial maintenance automation.

Author: Liu, Zengkun and Lu, Yuqian
Abstract: Maintenance manuals are crucial information sources for maintenance and repair. Prior studies explored factual knowledge extraction from textual documents. However, maintenance knowledge in manuals is more task‐centric rather than factual knowledge and often documented in an unstructured Portable Document Format (PDF), posing challenges for knowledge extraction. Addressing this, this research develops effective methods to extract task‐centric maintenance knowledge from unstructured PDF manuals. A new Task‐centric Knowledge Graph (TCKG) schema centralized on maintenance task components (MTCs) is proposed to address the need for structured knowledge representation. A method (Heterogeneous Graph‐based Method, HGM) for knowledge extraction is then proposed, which is enhanced by incorporating visual and spatial information. In the experiments, the proposed HGM exhibits robust performance in the knowledge extraction process, surpassing the baseline Graph‐based Interaction Model with a Tracker (GIT) method in MTCs extraction by 13.3%, and the baseline Translate Embedding (TransE) method in MTCs' relation extraction by 3.8%. A series of ablation studies also prove that including visual and spatial information through the proposed method can improve the relation extraction performance by over 10%. This research supplies valuable insights for future developments in information extraction from maintenance manuals. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. A Novel Transformer‐Based Anomaly Detection Model for the Reactor Coolant Pump in Nuclear Power Plants.

Author: Zhou, Guangrong, Zheng, Sheng, Yang, Senquan, Yi, Shuang, and Lomonaco, Guglielmo
Subjects: NUCLEAR reactor shutdowns, SIGNAL reconstruction, SPATIAL ability, LEARNING ability, COOLANTS, NUCLEAR power plants
Abstract: As a key equipment of the reactor coolant system, the reactor coolant pump's operational state can have a direct impact on the coolant circulation. However, the harsh working environment can accelerate the process of equipment deterioration for the reactor coolant pump while the current alarm mechanism cannot detect the deterioration of equipment performance at the early stage. This issue may lead to anomalies developing into faults and causing unplanned shutdowns and reactor trips. In this paper, we proposed a transformer‐based anomaly detection model for reactor coolant pump condition monitoring. On the basis of retaining the time‐dependent capture ability of the original transformer network for sequence data, the proposed model has obtained a stronger learning ability of the spatial correlation between variables through the application of the attention mechanism. Historical operating data under normal conditions are used for the training process and the reconstruction errors of input signals are utilized to identify anomalies. The experimental results have indicated that the proposed model possesses stronger feature learning capabilities, evidenced by improved performance in signal reconstruction and anomaly detection, which can help to detect the abnormal status at an earlier stage. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. A Robust Displacement Monitoring Model for High‐Arch Dams Integrating Signal Dimensionality Reduction and Deep Learning‐Based Residual Correction.

Author: Zhu, Yantao, Niu, Xinqiang, Yan, Tianyou, Xu, Lifu, and Zhang, Jian
Subjects: DAM safety, FAST Fourier transforms, WATER temperature, TIME series analysis, ATMOSPHERIC temperature, DEEP learning, ARCH dams
Abstract: Deformation is a critical indicator for the safety control of high‐arch dams, yet traditional statistical regression methods often exhibit poor predictive performance when applied to long‐sequence time series data. In this study, we develop a robust predictive model for deformation behavior in high‐arch dams by integrating signal dimensionality reduction with deep learning (DL)‐based residual correction techniques. First, the fast Fourier transform is employed to decompose air and water temperature sequences, enabling the extraction of temperature cycle characteristics at the dam boundary. A data‐driven statistical monitoring model for dam deformation, based on actual temperature data, is then proposed. Subsequently, an improved Bayesian Ridge regression model is used to construct the dam deformation monitoring framework. The residuals that traditional statistical methods fail to capture are input into an enhanced Long Short‐Term Memory (LSTM) network to effectively learn the temporal characteristics of the sequence. A high‐arch dam with a history of long‐term service is used as a case study. Experimental results indicate that the data dimensionality reduction method effectively extracts relevant information from observed temperature data, reducing the number of input variables. Comparative evaluation experiments show that the proposed hybrid predictive model outperforms existing state‐of‐the‐art benchmark algorithms in terms of predictive efficiency and accuracy. Additionally, this approach combines the interpretability of statistical regression methods with the powerful nonlinear modeling capabilities of DL‐based models, achieving a synergistic effect. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. Data‐Driven Materials Research and Development for Functional Coatings.

Author: Xu, Kai, Xiao, Xuelian, Wang, Linjing, Lou, Ming, Wang, Fangming, Li, Changheng, Ren, Hui, Wang, Xue, and Chang, Keke
Subjects: ARTIFICIAL intelligence, MANUFACTURING processes, ORGANIC coatings, MATERIALS science, MACHINE learning
Abstract: Functional coatings, including organic and inorganic coatings, play a vital role in various industries by providing a protective layer and introducing unique functionalities. However, its design often involves time‐consuming experimentation with multiple materials and processing parameters. To overcome these limitations, data‐driven approaches are gaining traction in materials science. In this paper, recent advances in data‐driven materials research and development (R&D) for functional coatings, highlighting the importance, data sources, working processes, and applications of this paradigm are summarized. It is begun by discussing the challenges of traditional methods, then introduce typical data‐driven processes. It is demonstrated how data‐driven approaches enable the identification of correlations between input parameters and coating performance, thus allowing for efficient prediction and design. Furthermore, carefully selected case studies are presented across diverse industries that exemplify the effectiveness of data‐driven methods in accelerating the discovery of new functional coatings with tailored properties. Finally, the emerging research directions, involving integrating advanced techniques and data from different sources, are addressed. Overall, this review provides an overview of data‐driven materials R&D for functional coatings, shedding light on its potential and future developments. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

31. Classification and Comparative Evaluation of Text and Emoji‐Based Tweets With Deep Neural Network Models.

Author: Chandra Sekhar, J. N., Kiran Mayee, M., Nadagoudar, Ranjana, Chinna Alluraiah, N., Dhanamjayulu, C., Chinthaginjala, Ravikumar, K., Ravi, M., Praveenkumar, Mohanty, Satyajit, Khan, Baseem, and Ahsan, Mominul
Subjects: ARTIFICIAL neural networks, DIGITAL technology, SENTIMENT analysis, ARTIFICIAL intelligence, SOCIAL network analysis, EMOTICONS & emojis, DEEP learning
Abstract: Emojis have become increasingly prevalent in today's digital world, allowing individuals to convey a wide range of emotions, from uncomplicated to intricate, to a greater extent than previously. Consequently, emojis are being utilized in sentiment analysis and tailored marketing strategies. The ongoing research on conducting emotion detection on both tweets and a symbolic expression dataset sourced from Kaggle. Given that tweets are largely commentaries, we utilized two end‐to‐end sentence embedding models, the DistilBERT, USE‐Large3, and RoBERTa, which generate embeddings. These embeddings are further utilized for training with dense neural networks (NNs) and LSTM techniques. Remarkably, it was perceived that the text classification accuracy for both models was consistently high, hovering around 98%. However, when the validation set is constructed with that of symbolic expression or emojis not included in the training dataset, a significant drop in accuracy for both models, plummeting to 75%, had been observed. Additionally, a distributed training methodology is utilized as a substitute for the conventional single‐threaded model to enhance scalability. This approach resulted in a roughly 17% reduction in the runtime while maintaining accuracy. Lastly, in pursuit of explainable AI, the SHAP and LIME algorithms are employed to elucidate the model's behavior and assess any potential biases in the dataset. The creative use of advanced deep NN techniques customized for the delicate complexities of hybrid‐data sentiment analysis indicates a significant leap forward. Our proposed work provides the critical gap in existing sentiment analysis methods, which primarily aimed at either text or emojis in isolation, thereby exploring more holistic understanding of sentiment in digital communications. Moreover, the application of explainable AI techniques, SHAP and LIME, to demystify model decisions emphasizes commitment to advancing transparent and trustworthy deep learning technologies in sentiment analysis. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. The role of artificial intelligence in the management of liver diseases.

Author: Lu, Ming‐Ying, Chuang, Wan‐Long, and Yu, Ming‐Lung
Subjects: HEPATITIS C virus, HEPATITIS B virus, ARTIFICIAL intelligence, MACHINE learning, DISEASE management, HEPATITIS C
Abstract: Universal neonatal hepatitis B virus (HBV) vaccination and the advent of direct‐acting antivirals (DAA) against hepatitis C virus (HCV) have reshaped the epidemiology of chronic liver diseases. However, some aspects of the management of chronic liver diseases remain unresolved. Nucleotide analogs can achieve sustained HBV DNA suppression but rarely lead to a functional cure. Despite the high efficacy of DAAs, successful antiviral therapy does not eliminate the risk of hepatocellular carcinoma (HCC), highlighted the need for cost‐effective identification of high‐risk populations for HCC surveillance and tailored HCC treatment strategies for these populations. The accessibility of high‐throughput genomic data has accelerated the development of precision medicine, and the emergence of artificial intelligence (AI) has led to a new era of precision medicine. AI can learn from complex, non‐linear data and identify hidden patterns within real‐world datasets. The combination of AI and multi‐omics approaches can facilitate disease diagnosis, biomarker discovery, and the prediction of treatment efficacy and prognosis. AI algorithms have been implemented in various aspects, including non‐invasive tests, predictive models, image diagnosis, and the interpretation of histopathology findings. AI can support clinicians in decision‐making, alleviate clinical burdens, and curtail healthcare expenses. In this review, we introduce the fundamental concepts of machine learning and review the role of AI in the management of chronic liver diseases. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. AI in microbiome‐related healthcare.

Author: Probul, Niklas, Huang, Zihua, Saak, Christina Caroline, Baumbach, Jan, and List, Markus
Subjects: VISUAL fields, ARTIFICIAL intelligence, DIAGNOSTIC imaging, PRIVACY, MEDICAL care
Abstract: Artificial intelligence (AI) has the potential to transform clinical practice and healthcare. Following impressive advancements in fields such as computer vision and medical imaging, AI is poised to drive changes in microbiome‐based healthcare while facing challenges specific to the field. This review describes the state‐of‐the‐art use of AI in microbiome‐related healthcare. It points out limitations across topics such as data handling, AI modelling and safeguarding patient privacy. Furthermore, we indicate how these current shortcomings could be overcome in the future and discuss the influence and opportunities of increasingly complex data on microbiome‐based healthcare. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

34. Transformer‐based representation learning and multiple‐instance learning for cancer diagnosis exclusively from raw sequencing fragments of bisulfite‐treated plasma cell‐free DNA.

Author: Liu, Jilei, Shen, Hongru, Yang, Yichen, Yang, Meng, Zhang, Qiang, Chen, Kexin, and Li, Xiangchun
Published: 2024
Full Text: View/download PDF

35. Assessing ChatGPT's Information Quality Through the Lens of User Information Satisfaction and Information Quality Theory in Higher Education: A Theoretical Framework.

Author: Fu, Chung-Jen, Silalahi, Andri Dayarana K., Shih, I-Tung, Phuong, Do Thi Thanh, Eunike, Ixora Javanisa, Jargalsaikhan, Shinetsetseg, and Pratiwi, Puspa Setia
Subjects: CHATGPT, STRUCTURAL equation modeling, ARTIFICIAL intelligence, INFORMATION theory, SATISFACTION
Abstract: Incorporating AI tools like ChatGPT into higher education has been beneficial, yet the extent of user satisfaction with the quality of information provided by these tools, known as user information satisfaction (UIS) and information quality (IQ) theory, remains underexplored. This study introduces a UIS model specifically designed for ChatGPT's application in the educational sector based on multidimensions of IQ theory. Drawing from established UIS and IQ theory, we crafted a model centered around seven essential factors that influence the effective use of ChatGPT, aiming to guide educators and learners in overcoming common challenges such as plagiarism and ensuring the ethical use of AI. Data was collected from Indonesian university participants (N = 508) and analyzed using structural equation modeling with Smart‐PLS 4.0. The results reveal that completeness, precision, timeliness, convenience, and information format are the most influential factors driving user satisfaction with ChatGPT. Interestingly, our research indicated that the accuracy and reliability of the information, typically deemed paramount, were not the primary concerns in the academic use of ChatGPT. Our findings recommend a cautious approach to integrating ChatGPT in higher education. We advocate for strategic use that recognizes its innovative potential while acknowledging its limitations, ensuring responsible and effective application in educational contexts. This balanced perspective is crucial for integrating AI tools into the academic fabric without compromising educational integrity or quality. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. The Road Ahead: Emerging Trends, Unresolved Issues, and Concluding Remarks in Generative AI—A Comprehensive Review.

Author: S., Balasubramaniam, Chirchi, Vanajaroselin, Kadry, Seifedine, Agoramoorthy, Moorthy, P., Gururama Senthilvel, K., Satheesh Kumar, T. A., Sivakumar, and Vocaturo, Eugenio
Subjects: GENERATIVE artificial intelligence, SPEECH synthesis, NATURAL language processing, LANGUAGE models, GENERATIVE adversarial networks, DEEP learning
Abstract: The field of generative artificial intelligence (AI) is experiencing rapid advancements, impacting a multitude of sectors, from computer vision to healthcare. This paper provides a comprehensive review of generative AI's evolution, significance, and applications, including the foundational architectures such as generative adversarial networks (GANs), variational autoencoders (VAEs), autoregressive models, flow‐based models, and diffusion models. We delve into the impact of generative algorithms on computer vision, natural language processing, artistic creation, and healthcare, demonstrating their revolutionary potential in data augmentation, text and speech synthesis, and medical image interpretation. While the transformative capabilities of generative AI are acknowledged, the paper also examines ethical concerns, most notably the advent of deepfakes, calling for the development of robust detection frameworks and responsible use guidelines. As generative AI continues to evolve, driven by advances in neural network architectures and deep learning methodologies, this paper provides a holistic overview of the current landscape and a roadmap for future research and ethical considerations in generative AI. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

37. A hybrid neural network model based on transfer learning for Arabic sentiment analysis of customer satisfaction.

Author: Bakhit, Duha Mohamed Adam, Nderu, Lawrence, and Ngunyi, Antony
Subjects: LANGUAGE models, SOCIAL media, ARTIFICIAL neural networks, RECURRENT neural networks, SENTIMENT analysis
Abstract: Sentiment analysis, a method used to classify textual content into positive, negative, or neutral sentiments, is commonly applied to data from social media platforms. Arabic, an official language of the United Nations, presents unique challenges for sentiment analysis due to its complex morphology and dialectal diversity. Compared to English, research on Arabic sentiment analysis is relatively scarce. Transfer learning, which applies the knowledge learned from one domain to another, can address the limitations of training time and computational resources. However, the development of transfer learning for Arabic sentiment analysis is still underdeveloped. In this study, we develop a new hybrid model, RNN‐BiLSTM, which merges recurrent neural networks (RNN) and bidirectional long short‐term memory (BiLSTM) networks. We used Arabic bidirectional encoder representations from transformers (AraBERT), a state‐of‐the‐art Arabic language pre‐trained transformer‐based model, to generate word‐embedding vectors. The RNN‐BiLSTM model integrates the strengths of RNN and BiLSTM, including the ability to learn sequential dependencies and bidirectional context. We trained the RNN‐BiLSTM model on the source domain, specifically the Arabic reviews dataset (ARD). The RNN‐BiLSTM model outperforms the RNN and BiLSTM models with default parameters, achieving an accuracy of 95.75%. We further applied transfer learning to the RNN‐BiLSTM model by fine‐tuning its parameters using random search. We compared the performance of the fine‐tuned RNN‐BiLSTM model with the RNN and BiLSTM models on two target domain datasets: ASTD and Aracust. The results showed that the fine‐tuned RNN‐BiLSTM model is more effective for transfer learning, achieving an accuracy of 95.44% and 96.19% on the ASTD and Aracust datasets, respectively. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. PF‐GEMV: Utilization maximizing architecture in fast matrix–vector multiplication for GPT‐2 inference.

Author: Kim, Hyeji, Lee, Yeongmin, and Lyuh, Chun‐Gi
Subjects: ARTIFICIAL neural networks, ARTIFICIAL intelligence, MULTIPLICATION
Abstract: Owing to the widespread advancement of transformer‐based artificial neural networks, artificial intelligence (AI) processors are now required to perform matrix–vector multiplication in addition to the conventional matrix–matrix multiplication. However, current AI processor architectures are optimized for general matrix–matrix multiplications (GEMMs), which causes significant throughput degradation when processing general matrix–vector multiplications (GEMVs). In this study, we proposed a port‐folding GEMV (PF‐GEMV) scheme employing multiformat and low‐precision techniques while reusing an outer product‐based processor optimized for conventional GEMM operations. This approach achieves 93.7% utilization in GEMV operations with an 8‐bit format on an 8 × 8 processor, thus resulting in a 7.5 × increase in throughput compared with that of the original scheme. Furthermore, when applied to the matrix operation of the GPT‐2 large model, an increase in speed by 7 × is achieved in single‐batch inferences. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. Incorporating Intra-Query Term Dependencies in an Aspect Query Language Model.

Author: Song, Dawei, Shi, Yanjie, Zhang, Peng, Huang, Qiang, Kruschwitz, Udo, Hou, Yuexian, and Wang, Bo
Subjects: QUERY languages (Computer science), INFORMATION retrieval software, MARKOV processes, TEXT Retrieval Conference, QUERY (Information retrieval system)
Abstract: Query language modeling based on relevance feedback has been widely applied to improve the effectiveness of information retrieval. However, intra-query term dependencies (i.e., the dependencies between different query terms and term combinations) have not yet been sufficiently addressed in the existing approaches. This article aims to investigate this issue within a comprehensive framework, namely the Aspect Query Language Model (AM). We propose to extend the AM with a hidden Markov model (HMM) structure to incorporate the intra-query term dependencies and learn the structure of a novel aspect HMM (AHMM) for query language modeling. In the proposed AHMM, the combinations of query terms are viewed as latent variables representing query aspects. They further form an ergodic HMM, where the dependencies between latent variables (nodes) are modeled as the transitional probabilities. The segmented chunks from the feedback documents are considered as observables of the HMM. Then the AHMM structure is optimized by the HMM, which can estimate the prior of the latent variables and the probability distribution of the observed chunks. Our extensive experiments on three large-scale text retrieval conference (TREC) collections have shown that our method not only significantly outperforms a number of strong baselines in terms of both effectiveness and robustness but also achieves better results than the AM and another state-of-the-art approach, namely the latent concept expansion model. © 2014 Wiley Periodicals, Inc. [ABSTRACT FROM AUTHOR]
Published: 2015
Full Text: View/download PDF

40. Exploring the Temperature Effect on Enantioselectivity of a Baeyer‐Villiger Biooxidation by the 2,5‐DKCMO Module: The SLM Approach.

Author: Röllig, Robert, Paul, Caroline E., Duquesne, Katia, Kara, Selin, and Alphand, Véronique
Published: 2022
Full Text: View/download PDF

41. Crystal Composition Transformer: Self‐Learning Neural Language Model for Generative and Tinkering Design of Materials.

Author: Wei, Lai, Li, Qinyang, Song, Yuqi, Stefanov, Stanislav, Dong, Rongzhi, Fu, Nihang, Siriwardane, Edirisuriya M. D., Chen, Fanglin, and Hu, Jianjun
Subjects: GENERATIVE artificial intelligence, NATURAL language processing, LANGUAGE models, PROBABILISTIC generative models, LOGIC design
Abstract: Self‐supervised neural language models have recently achieved unprecedented success from natural language processing to learning the languages of biological sequences and organic molecules. These models have demonstrated superior performance in the generation, structure classification, and functional predictions for proteins and molecules with learned representations. However, most of the masking‐based pre‐trained language models are not designed for generative design, and their black‐box nature makes it difficult to interpret their design logic. Here a Blank‐filling Language Model for Materials (BLMM) Crystal Transformer is proposed, a neural network‐based probabilistic generative model for generative and tinkering design of inorganic materials. The model is built on the blank‐filling language model for text generation and has demonstrated unique advantages in learning the "materials grammars" together with high‐quality generation, interpretability, and data efficiency. It can generate chemically valid materials compositions with as high as 89.7% charge neutrality and 84.8% balanced electronegativity, which are more than four and eight times higher compared to a pseudo‐random sampling baseline. The probabilistic generation process of BLMM allows it to recommend materials tinkering operations based on learned materials chemistry, which makes it useful for materials doping. The model is applied to discover a set of new materials as validated using the Density Functional Theory (DFT) calculations. This work thus brings the unsupervised transformer language models based generative artificial intelligence to inorganic materials. A user‐friendly web app for tinkering materials design has been developed and can be accessed freely at www.materialsatlas.org/blmtinker. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

42. Cephalo: Multi‐Modal Vision‐Language Models for Bio‐Inspired Materials Analysis and Design.

Author: Buehler, Markus J.
Subjects: *LANGUAGE models, *NATURAL language processing, *GENERATIVE artificial intelligence, *COMPUTER vision, *BIOMATERIALS
Abstract: Cephalo is presented as a series of multimodal vision large language models (V‐LLMs) designed for materials science applications, integrating visual and linguistic data for enhanced understanding. A key innovation of Cephalo is its advanced dataset generation method. Cephalo is trained on integrated image and text data from thousands of scientific papers and science‐focused Wikipedia data demonstrates it can interpret complex visual scenes, generate precise language descriptions, and answer queries about images effectively. The combination of a vision encoder with an autoregressive transformer supports multimodal natural language understanding, which can be coupled with other generative methods to create an image‐to‐text‐to‐3D pipeline. To develop more capable models from smaller ones, both mixture‐of‐expert methods and model merging are reported. The models are examined in diverse use cases that incorporate biological materials, fracture and engineering analysis, protein biophysics, and bio‐inspired design based on insect behavior. Generative applications include bio‐inspired designs, including pollen‐inspired architected materials, as well as the synthesis of bio‐inspired material microstructures from a photograph of a solar eclipse. Additional model fine‐tuning with a series of molecular dynamics results demonstrate Cephalo's enhanced capabilities to accurately predict statistical features of stress and atomic energy distributions, as well as crack dynamics and damage in materials. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. Mechanical Neural Networks with Explicit and Robust Neurons.

Author: Mei, Tie, Zhou, Yuan, and Chen, Chang Qing
Subjects: CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, MECHANICAL ability, ASSOCIATIVE learning, INTELLIGENCE levels, RECURRENT neural networks
Abstract: Mechanical computing provides an information processing method to realize sensing‐analyzing‐actuation integrated mechanical intelligence and, when combined with neural networks, can be more efficient for data‐rich cognitive tasks. The requirement of solving implicit and usually nonlinear equilibrium equations of motion in training mechanical neural networks makes computation challenging and costly. Here, an explicit mechanical neuron is developed of which the response can be directly determined without the need of solving equilibrium equations. A training method is proposed to ensure the robustness of the neuron, i.e., insensitivity to defects and perturbations. The explicitness and robustness of the neurons facilitate the assembly of various network structures. Two exemplified networks, a robust mechanical convolutional neural network and a mechanical recurrent neural network with long short‐term memory capabilities for associative learning, are experimentally demonstrated. The introduction of the explicit and robust mechanical neuron streamlines the design of mechanical neural networks fulfilling robotic matter with a level of intelligence. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

44. Enhancing EEG signals classification using LSTM‐CNN architecture.

Author: Omar, Swaleh M., Kimwele, Michael, Olowolayemo, Akeem, and Kaburu, Dennis M.
Subjects: SIGNAL classification, ELECTROENCEPHALOGRAPHY, CONVOLUTIONAL neural networks, DEEP learning
Abstract: Epilepsy is a disorder that interferes with regular brain activity and can occasionally cause seizures, odd sensations, and momentary unconsciousness. Epilepsy is frequently diagnosed using electroencephalograph (EEG) records, although conventional analysis is subjective and prone to error. The dynamic and non‐stationary nature of EEG structure restricted the performance of Deep Learning (DL) approaches used in earlier work to improve EEG classification. Our multi‐channel EEG classification model, dubbed LConvNet in this paper, combines Convolutional Neural Networks (CNN) for extracting spatial features and Long Short‐Term Memory (LSTM) for identifying temporal dependencies. To discriminate between epileptic and healthy EEG signals, the model is trained using open‐source secondary EEG data from Temple University Hospital (TUH). Our model outperformed other EEG classification models employed in comparable tasks, such as EEGNet, DeepConvNet, and ShallowConvNet, which had accuracy rates of 86%, 96%, and 78%, respectively. Our model attained an amazing accuracy rate of 97%. During additional testing, our model also displayed excellent performance in trainability, scalability, and parameter efficiency. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

45. Deep learning methods for protein structure prediction.

Author: Qin, Yiming, Chen, Zihan, Peng, Ye, Xiao, Ying, Zhong, Tian, and Yu, Xi
Published: 2024
Full Text: View/download PDF

46. Soft Valves: A Review of Structures, Materials, and Modeling.

Author: Han, Fenglin, Li, Qixin, Xiong, Huang, He, Chunli, Zhao, Haiming, and Chen, Zhi
Subjects: VALVES, SOFT robotics, RESEARCH personnel, APPROPRIATE technology, HIGH technology
Abstract: Soft robots have been advancing rapidly, but their control is still limited by rigid control elements. Soft valves offer a solution to this problem by enabling soft robots to no longer rely on rigid control elements. They have become an emerging research topic in soft robotics. However, with a large number of publications on soft valves, it may be challenging for researchers to quickly grasp the advanced technology related to soft valves. To address this issue, this article summarizes the current state of development in soft valves. The design principles and applications of soft valves in terms of structures and materials are discussed, along with the modeling ideas for soft valves. Finally, the current challenges faced by soft valves are outlined, and potential solutions to these problems are proposed. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

47. Gated Spatial–Temporal Merged Transformer Inspired by Multimask and Dual Branch for Traffic Forecasting.

Author: Yang, Yongpeng, Yang, Zhenzhen, Yang, Zhen, and Zheng, Qinghe
Subjects: TRAFFIC estimation, INTELLIGENT transportation systems, SPATIAL ability, TIME series analysis, INFORMATION resources management
Abstract: As an essential part of intelligent transportation system (ITS), traffic forecasting has provided crucial role for traffic management and risk assessment. However, complex spatial–temporal dependencies, heterogeneity, dynamicity, and periodicity of traffic data influence the traffic forecasting performance. Consequently, we propose a novel effective gated spatial–temporal merged transformer (GSTMT) inspired by multimask and dual branch for accurate traffic forecasting in this paper. Specifically, we first conduct a concatenation of gated spatial static mask transformer (GSSMT) and gated spatial dynamic mask transformer (GSDMT) with residual network. The GSSMT and GSDMT evolve from the traditional transformer by making preferable modifications that include gated linear unit (GLU), multimask mechanism including static mask matrix (SMM) and dynamic mask matrix (DMM), and spatial attention (SA). Among them, GLU is to promote the performance of capturing spatial dependency, dynamicity, and heterogeneity due to advanced performance for controlling information flow through layers. Additionally, by developing multimask mechanism including two novel SMM and DMM, the proposed GSTMT can precisely model the static and dynamic spatial structure for effectively highlighting static dependency and dynamicity. And SA is injected for enhancing the ability of capturing spatial dependency of GSSMT and GSDMT. Secondly, we develop a dual‐branch gated temporal transformer (DBGTT) for capturing temporal dependency, heterogeneity, dynamicity, and periodicity via incorporating the GLU and mixed time series decomposition (MTD) into traditional transformer. Similarly, we also introduce the GLU for empowering DBGTT with capability of capturing temporal dependency, dynamicity, and heterogeneity. In addition, MTD, which brings dual‐branch mechanism, can enhance the DBGTT for capturing more detailed temporal information via exploiting global and periodic profile of traffic data. At last, some experiments, which are performed on several real‐world traffic datasets, demonstrate the better results over classic traffic forecasting methods. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

48. HydrogelFinder: A Foundation Model for Efficient Self‐Assembling Peptide Discovery Guided by Non‐Peptidal Small Molecules.

Author: Ren, Xuanbai, Wei, Jiaying, Luo, Xiaoli, Liu, Yuansheng, Li, Kenli, Zhang, Qiang, Gao, Xin, Yan, Sizhe, Wu, Xia, Jiang, Xingyue, Liu, Mingquan, Cao, Dongsheng, Wei, Leyi, Zeng, Xiangxiang, and Shi, Junfeng
Subjects: SMALL molecules, PEPTIDES, PEPTIDE amphiphiles, MOLECULAR structure, MOLECULAR self-assembly, AMINO acids
Abstract: Self‐assembling peptides have numerous applications in medicine, food chemistry, and nanotechnology. However, their discovery has traditionally been serendipitous rather than driven by rational design. Here, HydrogelFinder, a foundation model is developed for the rational design of self‐assembling peptides from scratch. This model explores the self‐assembly properties by molecular structure, leveraging 1,377 self‐assembling non‐peptidal small molecules to navigate chemical space and improve structural diversity. Utilizing HydrogelFinder, 111 peptide candidates are generated and synthesized 17 peptides, subsequently experimentally validating the self‐assembly and biophysical characteristics of nine peptides ranging from 1–10 amino acids—all achieved within a 19‐day workflow. Notably, the two de novo‐designed self‐assembling peptides demonstrated low cytotoxicity and biocompatibility, as confirmed by live/dead assays. This work highlights the capacity of HydrogelFinder to diversify the design of self‐assembling peptides through non‐peptidal small molecules, offering a powerful toolkit and paradigm for future peptide discovery endeavors. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

49. LsAc ∗‐MJ: A Low‐Resource Consumption Reinforcement Learning Model for Mahjong Game.

Author: Li, Xiali, Wang, Zhaoqi, Liu, Bo, Dai, Junxue, and Tan, Yu-an
Subjects: DEEP reinforcement learning, REINFORCEMENT learning, ALGORITHMS, SCARCITY
Abstract: This article proposes a novel Mahjong game model, LsAc ∗‐MJ, designed to address challenges posed by data scarcity, difficulty in leveraging contextual information, and the computational resource‐intensive nature of self‐play zero‐shot learning. The model is applied to Japanese Mahjong for experiments. LsAc ∗‐MJ employs long short‐term memory (LSTM) neural networks, utilizing hidden nodes to store and propagate contextual historical information, thereby enhancing decision accuracy. Additionally, the paper introduces an optimized Advantage Actor‐Critic (A2C) algorithm incorporating an experience replay mechanism to enhance the model's decision‐making capabilities and mitigate convergence difficulties arising from strong data correlations. Furthermore, the paper presents a two‐stage training approach for self‐play deep reinforcement learning models guided by expert knowledge, thereby improving training efficiency. Extensive ablation experiments and performance comparisons demonstrate that, in contrast to other typical deep reinforcement learning models on the RLcard platform, the LsAc ∗‐MJ model consumes lower computational and time resources, has higher training efficiency, faster average decision time, higher win‐rate, and stronger decision‐making ability. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

50. Evaluating ChatGPT as a Patient Education Tool for COVID‐19‐Induced Olfactory Dysfunction.

Author: Sina, Elliott M., Campbell, Daniel J., Duffy, Alexander, Mandloi, Shreya, Benedict, Peter, Farquhar, Douglas, Unsal, Aykut, and Nyquist, Gurston
Published: 2024
Full Text: View/download PDF

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Category

Publication Type

Journal

Region

Database

841 results on '"Language Modeling"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources