271 results on "Language Modeling"
Search Results
2. Comparing SMILES and SELFIES tokenization for enhanced chemical language modeling
- Author
- Miguelangel Leon, Yuriy Perezhohin, Fernando Peres, Aleš Popovič, and Mauro Castelli
- Subjects
- Medicine, Science
- Abstract
Life sciences research and experimentation are resource-intensive, requiring extensive trials and considerable time. Often, experiments do not achieve their intended objectives, but progress is made through trial and error, eventually leading to breakthroughs. Machine learning is transforming this traditional approach, providing methods to expedite processes and accelerate discoveries. Deep Learning is becoming increasingly prominent in chemistry, with Convolutional Graph Networks (CGN) being a key focus, though other approaches also show significant potential. This research explores the application of Natural Language Processing (NLP) to evaluate the effectiveness of chemical language representations, specifically SMILES and SELFIES, using tokenization methods such as Byte Pair Encoding (BPE) and a novel approach developed in this study, Atom Pair Encoding (APE), in BERT-based models. The primary objective is to assess how these tokenization techniques influence the performance of chemical language models in biophysics and physiology classification tasks. The findings reveal that APE, particularly when used with SMILES representations, significantly outperforms BPE by preserving the integrity and contextual relationships among chemical elements, thereby enhancing classification accuracy. Performance was evaluated in downstream classification tasks using three distinct datasets for HIV, toxicology, and blood–brain barrier penetration, with ROC-AUC serving as the evaluation metric. This study highlights the critical role of tokenization in processing chemical language and suggests that refining these techniques could lead to significant advancements in drug discovery and material science.
- Published
- 2024
3. Early detection of pediatric health risks using maternal and child health data.
- Author
- Ilin, Cornelia
- Subjects
- EMERGENCY room visits, CHILDREN'S health, CHILD patients, HOSPITAL admission & discharge, LENGTH of stay in hospitals, HEALTH services accessibility
- Abstract
Machine learning (ML)-driven diagnosis systems are particularly relevant in pediatrics given the well-documented impact of early-life health conditions on later-life outcomes. Yet, early identification of diseases and their subsequent impact on length of hospital stay for this age group has so far remained uncharacterized, likely because access to relevant health data is severely limited. Thanks to a confidential data use agreement with the California Department of Health Care Access and Information, we introduce Ped-BERT: a state-of-the-art deep learning model that accurately predicts the likelihood of 100+ conditions and the length of stay in a pediatric patient's next medical visit. We link mother-specific pre- and postnatal period health information to pediatric patient hospital discharge and emergency room visits. Our data set comprises 513.9K mother–baby pairs and contains medical diagnosis codes, length of stay, as well as temporal and spatial pediatric patient characteristics, such as age and residency zip code at the time of visit. Following the popular bidirectional encoder representations from the transformers (BERT) approach, we pre-train Ped-BERT via the masked language modeling objective to learn embedding features for the diagnosis codes contained in our data. We then continue to fine-tune our model to accurately predict primary diagnosis outcomes and length of stay for a pediatric patient's next visit, given the history of previous visits and, optionally, the mother's pre- and postnatal health information. We find that Ped-BERT generally outperforms contemporary and state-of-the-art classifiers when trained with minimum features. We also find that incorporating mother health attributes leads to significant improvements in model performance overall and across all patient subgroups in our data. 
Our most successful Ped-BERT model configuration achieves an area under the receiver operator curve (ROC AUC) of 0.927 and an average precision score (APS) of 0.408 for the diagnosis prediction task, and a ROC AUC of 0.855 and APS of 0.815 for the length of hospital stay task. Further, we examine Ped-BERT's fairness by determining whether prediction errors are evenly distributed across various subgroups of mother–baby demographics and health characteristics, or if certain subgroups exhibit a higher susceptibility to prediction errors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. Molecule generation using transformers and policy gradient reinforcement learning.
- Author
- Mazuz, Eyal, Shtar, Guy, Shapira, Bracha, and Rokach, Lior
- Subjects
- DEEP learning, REINFORCEMENT learning, BIOMOLECULES, MOLECULES, TAIGAS, CHEMISTS
- Abstract
Generating novel valid molecules is often a difficult task, because the vast chemical space relies on the intuition of experienced chemists. In recent years, deep learning models have helped accelerate this process. These advanced models can also help identify suitable molecules for disease treatment. In this paper, we propose Taiga, a transformer-based architecture for the generation of molecules with desired properties. Using a two-stage approach, we first treat the problem as a language modeling task of predicting the next token, using SMILES strings. Then, we use reinforcement learning to optimize molecular properties such as QED. This approach allows our model to learn the underlying rules of chemistry and more easily optimize for molecules with desired properties. Our evaluation of Taiga, which was performed with multiple datasets and tasks, shows that Taiga is comparable to, or even outperforms, state-of-the-art baselines for molecule optimization, with improvements in the QED ranging from 2 to over 20 percent. The improvement was demonstrated both on datasets containing lead molecules and random molecules. We also show that with its two stages, Taiga is capable of generating molecules with higher biological property scores than the same model without reinforcement learning. [ABSTRACT FROM AUTHOR]
- Published
- 2023
5. ProteinGLUE multi-task benchmark suite for self-supervised protein modeling.
- Author
- Capel, Henriette, Weiler, Robin, Dijkstra, Maurits, Vleugels, Reinier, Bloem, Peter, and Feenstra, K. Anton
- Subjects
- PROTEIN models, AMINO acid sequence, PROTEIN structure, PROTEIN analysis, SEQUENCE analysis
- Abstract
Self-supervised language modeling is a rapidly developing approach for the analysis of protein sequence data. However, work in this area is heterogeneous and diverse, making comparison of models and methods difficult. Moreover, models are often evaluated only on one or two downstream tasks, making it unclear whether the models capture generally useful properties. We introduce the ProteinGLUE benchmark for the evaluation of protein representations: a set of seven per-amino-acid tasks for evaluating learned protein representations. We also offer reference code, and we provide two baseline models with hyperparameters specifically trained for these benchmarks. Pre-training was done on two tasks, masked symbol prediction and next sentence prediction. We show that pre-training yields higher performance on a variety of downstream tasks such as secondary structure and protein interaction interface prediction, compared to no pre-training. However, the larger base model does not outperform the smaller medium model. We expect the ProteinGLUE benchmark dataset introduced here, together with the two baseline pre-trained models and their performance evaluations, to be of great value to the field of protein sequence-based property prediction. Availability: code and datasets from https://github.com/ibivu/protein-glue. [ABSTRACT FROM AUTHOR]
- Published
- 2022
6. Table to text generation with accurate content copying.
- Author
- Yang, Yang, Cao, Juan, Wen, Yujun, and Zhang, Pengzhou
- Subjects
- PROBABILITY theory, VOCABULARY
- Abstract
Generating fluent, coherent, and informative text from structured data is called table-to-text generation. Copying words from the table is a common method to solve the "out-of-vocabulary" problem, but it's difficult to achieve accurate copying. In order to overcome this problem, we invent an auto-regressive framework based on the transformer that combines a copying mechanism and language modeling to generate target texts. Firstly, to make the model better learn the semantic relevance between table and text, we apply a word transformation method, which incorporates the field and position information into the target text to acquire the position of where to copy. Then we propose two auxiliary learning objectives, namely table-text constraint loss and copy loss. Table-text constraint loss is used to effectively model table inputs, whereas copy loss is exploited to precisely copy word fragments from a table. Furthermore, we improve the text search strategy to reduce the probability of generating incoherent and repetitive sentences. The model is verified by experiments on two datasets and better results are obtained than the baseline model. On WIKIBIO, the result is improved from 45.47 to 46.87 on BLEU and from 41.54 to 42.28 on ROUGE. On ROTOWIRE, the result is increased by 4.29% on CO metric, and 1.93 points higher on BLEU. [ABSTRACT FROM AUTHOR]
- Published
- 2021
7. De-identification is not enough: a comparison between de-identified and synthetic clinical notes.
- Author
- Sarkar, Atiquer Rahman, Chuang, Yao-Shun, Mohammed, Noman, and Jiang, Xiaoqian
- Abstract
For sharing privacy-sensitive data, de-identification is commonly regarded as adequate for safeguarding privacy. Synthetic data is also being considered as a privacy-preserving alternative. Recent successes with numerical and tabular data generative models and the breakthroughs in large generative language models raise the question of whether synthetically generated clinical notes could be a viable alternative to real notes for research purposes. In this work, we demonstrated that (i) de-identification of real clinical notes does not protect records against a membership inference attack, (ii) proposed a novel approach to generate synthetic clinical notes using the current state-of-the-art large language models, (iii) evaluated the performance of the synthetically generated notes in a clinical domain task, and (iv) proposed a way to mount a membership inference attack where the target model is trained with synthetic data. We observed that when synthetically generated notes closely match the performance of real data, they also exhibit similar privacy concerns to the real data. Whether other approaches to synthetically generated clinical notes could offer better trade-offs and become a better alternative to sensitive real notes warrants further investigation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
8. Manner implicatures in large language models.
- Author
- Cong, Yan
- Subjects
- LANGUAGE models, NATURAL languages, PRAGMATICS, INFERENCE (Logic), SEMANTICS, GLOVES
- Abstract
In human speakers' daily conversations, what we do not say matters. We not only compute the literal semantics but also go beyond and draw inferences from what we could have said but chose not to. How well is this pragmatic reasoning process represented in pre-trained large language models (LLM)? In this study, we attempt to address this question through the lens of manner implicature, a pragmatic inference triggered by a violation of the Grice manner maxim. Manner implicature is a central member of the class of context-sensitive phenomena. The current work investigates to what extent pre-trained LLMs are able to identify and tease apart different shades of meaning in manner implicature. We constructed three metrics to explain LLMs' behavior, including LLMs-surprisals, embedding vectors' similarities, and natural language prompting. Results showed no striking evidence that LLMs have explainable representations of meaning. First, the LLMs-surprisal findings suggest that some LLMs showed above chance accuracy in capturing different dimensions of meaning, and they were able to differentiate neutral relations from entailment or implications, but they did not show consistent and robust sensitivities to more nuanced comparisons, such as entailment versus implications and equivalence versus entailment. Second, the similarity findings suggest that the perceived advantage of contextual over static embeddings was minimal, and contextual LLMs did not notably outperform static GloVe embeddings. LLMs and GloVe showed no significant difference, though distinctions between entailment and implication were slightly more observable in LLMs. Third, the prompting findings suggest no further supportive evidence indicating LLM's competence in fully representing different shades of meaning. Overall, our study suggests that current dominant pre-training paradigms do not seem to lead to significant competence in manner implicature within our models. 
Our investigation sheds light on the design of datasets and benchmark metrics driven by formal and distributional linguistic theories. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. Medical language model specialized in extracting cardiac knowledge.
- Author
- Gwon, Hansle, Seo, Jiahn, Park, Seohyun, Kim, Young-Hak, and Jun, Tae Joon
- Subjects
- NATURAL language processing, LANGUAGE models, MEDICAL language, TRANSFORMER models, LANGUAGE acquisition, DEEP learning
- Abstract
The advent of the Transformer has significantly altered the course of research in Natural Language Processing (NLP) within the domain of deep learning, making Transformer-based studies the mainstream in subsequent NLP research. There has also been considerable advancement in domain-specific NLP research, including the development of specialized language models for medical. These medical-specific language models were trained on medical data and demonstrated high performance. While these studies have treated the medical field as a single domain, in reality, medical is divided into multiple departments, each requiring a high level of expertise and treated as a unique domain. Recognizing this, our research focuses on constructing a model specialized for cardiology within the medical sector. Our study encompasses the creation of open-source datasets, training, and model evaluation in this nuanced domain. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Rumor detection model with weighted GraphSAGE focusing on node location.
- Author
- Ma, Manfu, Zhang, Cong, Li, Yong, Chen, Jiahao, and Wang, Xuegang
- Subjects
- SOCIAL media, DEEP learning, INFORMATION dissemination, RUMOR, MICROBLOGS
- Abstract
While social media platforms promote people's information exchange and dissemination, they also make rumors spread rapidly on online platforms. Therefore, how to detect rumors quickly, timely and accurately has become a hot topic for scholars in related fields. Traditional deep learning models ignore the relationship and topology between nodes in the rumor detection task and use fixed weights or mean aggregation strategies in the feature aggregation process, which fail to capture the complex interactions between nodes and the dynamics of information propagation, limiting the accuracy and robustness of the rumor detection model. To address these problems, we propose a location-aware weighted GraphSAGE rumor detection model GSMA. We first introduce an attention mechanism that dynamically assigns different attention weights to different neighboring nodes for different degrees of aggregation, improving GraphSAGE's strategy of using mean-value aggregation for all neighboring nodes during the aggregation process; second, we introduce a modulated position encoding into the model and encode the position information of nodes into the features to improve the model's ability to perceive the relative position and order of nodes; finally, the post text sentiment is incorporated into the features to provide additional semantic information for the model as a way to achieve rumor detection in microblogging platforms. Experiments show that the accuracy of the GSMA model on Ma-Weibo and Weibo23 reaches 97.43% and 97.55%, which is an improvement of 1.11% and 0.77% compared to the benchmark GraphSAGE, and all the evaluation metrics are also improved compared to other optimal rumor detection models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Chaotic vibration control of an axially moving string of multidimensional nonlinear dynamic system with an improved FSMC.
- Author
- Liu, Ming, Lv, Jiaole, Wu, Liping, and Li, Yining
- Subjects
- ARTIFICIAL neural networks, NONLINEAR dynamical systems, VON Karman equations, SLIDING mode control, RECURRENT neural networks
- Abstract
A new control approach based on fuzzy sliding mode control (FSMC) is proposed to regulate the chaotic vibration of an axial string. Hamilton's principle is used to formulate the nonlinear equation of motion of the axial translation string, and the von Kármán equations are used to analyse the geometric nonlinearity. The governing equations are nondimensionalized as partial differential equations and transformed into a nonlinear 3-dimensional system via the third-order Galerkin approach. An active control technique based on the FSMC approach is suggested for the derived dynamic system. By using a recurrent neural network model, we can accurately predict and effectively apply a control strategy to suppress chaotic movements. The necessity of the suggested active control method in the regulation of the nonlinear axial translation string system is proven using different chaotic vibrations. The results show that the study of the chaotic vibrations of axially translating strings requires nonlinear multidimensional dynamic systems of axially moving strings; the validity of the proposed control strategy in controlling the chaotic vibration of axially moving strings in a multidimensional form is demonstrated. [ABSTRACT FROM AUTHOR]
- Published
- 2024
12. Enhancing domain-specific text generation for power grid maintenance with P2FT.
- Author
- Yang, Yi, Li, Chenhao, Zhu, Binghang, Zheng, Wenjie, Zhang, Fengda, and Li, Zhuangzhuang
- Subjects
- LANGUAGE models, NATURAL language processing, ELECTRIC power distribution grids, PROCESS capability, COMPUTER performance
- Abstract
The digitization of operation and maintenance in the intelligent power grid equipment relies on a diverse array of information for smart decision-making. In the domain of intelligent decision generation, proficiency is contingent upon extensive learning from copious amounts of text. This necessitates not only robust processing capabilities but also a high level of specialization. In addressing situations where authorization is lacking, pre-trained language models (PLMs) have already provided ideas when confronted with specialized domains or tasks. In consideration of the complexity of textual content in the field of the power grid, which encompasses a multitude of specialized knowledge and involves an abundance of proprietary terminology, we have undertaken an exploration of pre-trained model specialization using the power grid domain as an example, specifically for the task of generating maintenance strategies. A two-stage fine-tuning approach (P2FT) is employed, utilizing a large-scale pre-training model specifically designed for natural language processing. The efficacy and practical value of this method were evaluated through multiple metrics, juxtaposed with other advanced approaches involving low-parameter or parameter-free fine-tuning methods. Through a meticulous analysis and validation of experimental outcomes, we have corroborated the feasibility and practical application value of employing this approach for pre-trained model specialization. Additionally, it has furnished valuable guidance for text generation within both the Chinese language domain and the power grid domain. [ABSTRACT FROM AUTHOR]
- Published
- 2024
13. Admissions in the age of AI: detecting AI-generated application materials in higher education.
- Author
- Zhao, Yijun, Borelli, Alexander, Martinez, Fernando, Xue, Haoran, and Weiss, Gary M.
- Subjects
- LANGUAGE models, ARTIFICIAL intelligence, GRADUATE education, CHATGPT, COMPUTER science
- Abstract
Recent advances in Artificial Intelligence (AI), such as the development of large language models like ChatGPT, have blurred the boundaries between human and AI-generated text. This has led to a pressing need for tools that can determine whether text has been created or revised using AI. A general and universally effective detection model would be extremely useful, but appears to be beyond the reach of current technology and detection methods. The research described in this study adopts a domain and task specific approach and shows that specialized detection models can attain high accuracy. The study focuses on the higher education graduate admissions process, with the specific goal of identifying AI-generated and AI-revised Letters of Recommendation (LORs) and Statements of Intent (SOIs). Detecting such application materials is essential to ensure that applicants are evaluated on their true merits and abilities, and to foster an equitable and trustworthy admissions process. Our research is based on 3755 LORs and 1973 SOIs extracted from the application records of Fordham University's Master's programs in Computer Science and Data Science. To facilitate the construction and evaluation of detection models, we generated AI counterparts for each LOR and SOI using the GPT-3.5 Turbo API. The prompts for AI-generation text were derived from the admission data of the respective applicants, and the AI-revised LORs and SOIs were generated directly from the human-authored versions. We also utilize an open-access GPT-wiki-intro dataset to further validate our hypothesis regarding the feasibility of constructing domain-specific AI content detectors. Our experiments yield promising results in developing classifiers tailored to a specific domain when provided with sufficient training samples. 
Additionally, we present a comparative analysis of the word frequency and statistical characteristics of the text, which provides convincing evidence that ChatGPT employs distinctive vocabulary and paragraph structure compared to human-authored text. The code for this study is available on GitHub, and the models can be executed on user-provided data via an interactive web interface. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. FlexSleepTransformer: a transformer-based sleep staging model with flexible input channel configurations.
- Author
- Guo, Yanchen, Nowakowski, Maciej, and Dai, Weiying
- Subjects
- ARTIFICIAL neural networks, SLEEP stages, DEEP learning, TRANSFORMER models, SLEEP
- Abstract
Clinical sleep diagnosis traditionally relies on polysomnography (PSG) and expert manual classification of sleep stages. Recent advancements in deep learning have shown promise in automating sleep stage classification using a single PSG channel. However, variations in PSG acquisition devices and environments mean that the number of PSG channels can differ across sleep centers. To integrate a sleep staging method into clinical practice effectively, it must accommodate a flexible number of PSG channels. In this paper, we proposed FlexSleepTransformer, a transformer-based model designed to handle varying number of input channels, making it adaptable to diverse sleep staging datasets. We evaluated FlexSleepTransformer using two distinct datasets: the public SleepEDF-78 dataset and the local SleepUHS dataset. Notably, FlexSleepTransformer is the first model capable of simultaneously training on datasets with differing number of PSG channels. Our experiments showed that FlexSleepTransformer trained on both datasets together achieved 98% of the accuracy compared to models trained on each dataset individually. Furthermore, it outperformed models trained exclusively on one dataset when tested on the other dataset. Additionally, FlexSleepTransformer surpassed state-of-the-art CNN and RNN-based models on both datasets. Due to its adaptability with varying channels numbers, FlexSleepTransformer holds significant potential for clinical adoption, especially when trained with data from a wide range of sleep centers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. Accurate and 30-plus days reliable cuffless blood pressure measurements with 9-minutes personal photoplethysmograph data and mixed deduction learning.
- Author
- Mekonne, Bitewulign Kassa, Lu, Wei‑Ru, Hsieh, Tung‑Han, Chu, Justin, and Yang, Fu-Liang
- Subjects
- CONVOLUTIONAL neural networks, BLOOD pressure measurement, INTENSIVE care units, PERSONALLY identifiable information, DATA reduction, BLOOD pressure testing machines
- Abstract
Cuffless blood pressure (BP) measurements have long been anticipated, and the PPG (Photoplethysmography)-only method is the most promising one since already embedded in many wearable devices. To further meet the clinical accuracy requirements, PPG-only BP predictions with personalized modeling for overcoming personal deviations have been widely studied, but all required tens to hundreds of minutes of personal PPG measurements for training. Moreover, their accurate test periods without calibration practice were not reported. In this work, we collected records of PPG data from our recruited subjects in real-life scenarios instead of relying on the openly available MIMIC dataset obtained from intensive care unit (ICU) patients. Since our objective is commercial application and a substantial reduction in training data, we tailored our model training to closely mimic real-world usage. To achieve this, we developed a training approach that only requires 9-minutes of personal PPG signal recordings and mixed with other PPG data from our recruited 364 subjects. The modeling is conducted with two-channel paired inputs to the convolutional neural network (CNN)-based model, which we called Mixed Deduction Learning (MDL). The test results of 88 samples from 15 subjects, under testing period up to 30-plus days without extra calibration, revealed that MDL meets most of the standards of AAMI, BHS, and IEEE 1708–2014 (for static test only) for BP measurement devices, which indicates MDL's long-term stability and consistency. Furthermore, we found that the model with two-channel inputs presents a trend of improving performance as the pool of mixed training data increased, while the conventional one-channel input revealed degraded performance. The outperformance of MDL is attributed to many significant features remained in the first CNN layer even when mixing personal 9-minutes data with the other 364 subjects. 
Consequently, PPG-only with MDL introduces a new avenue for overcoming challenges in training due to personal physiological variations. Given our consideration of real-life usage, this technology can be seamlessly translated to commercial applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Soft sensor modeling method and application based on TSECIT2FNN-LSTM.
- Author
- Dai, Huangtao, Zhao, Taoyan, Cao, Jiangtao, and Li, Ping
- Subjects
- MACHINE learning, FUZZY logic, GAS turbines, FUZZY systems, DETECTORS
- Abstract
To address the issue of low accuracy in soft sensor modeling of key variables caused by multi-variable coupling and parameter sensitivity in complex processes, this paper introduces a TSK-type-based self-evolving compensatory interval type-2 fuzzy Long short-term memory (LSTM) neural network (TSECIT2FNN-LSTM) soft sensor model. The proposed TSECIT2FNN-LSTM integrates the LSTM neural network with the interval type-2 fuzzy inference system to address long-term dependencies in sequence data by utilizing the gate mechanism of the LSTM neural network. The TSECIT2FNN-LSTM structure learning algorithm uses the firing strength of the network rule antecedent to decide whether to generate new rules to improve the rationality of the network structure. TSECIT2FNN-LSTM parameter learning utilizes the gradient descent method to optimize network parameters. However, unlike other interval type-2 fuzzy neural network gradient calculation processes, the error term in the LSTM node parameter gradient of TSECIT2FNN-LSTM is propagated backwards in the time dimension. Additionally, the error term is simultaneously transferred to the upper layer network to enhance network prediction accuracy and memory capabilities. The TSECIT2FNN-LSTM soft sensor model is utilized to predict the alcohol concentration in wine and the nitrogen oxide emission in gas turbines. Experimental results demonstrate that the proposed TSECIT2FNN-LSTM soft sensing model achieves higher prediction accuracy compared to other models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Enhanced knowledge graph recommendation algorithm based on multi-level contrastive learning.
- Author
- Rong, Zhang, Yuan, Liu, and Yang, Li
- Subjects
- GRAPH neural networks, KNOWLEDGE graphs, RECOMMENDER systems, LEARNING strategies, ALGORITHMS
- Abstract
Integrating the Knowledge Graphs (KGs) into recommendation systems enhances personalization and accuracy. However, the long-tail distribution of knowledge graphs often leads to data sparsity, which limits the effectiveness in practical applications. To address this challenge, this study proposes a knowledge-aware recommendation algorithm framework that incorporates multi-level contrastive learning. This framework enhances the Collaborative Knowledge Graph (CKG) through a random edge dropout method, which constructs feature representations at three levels: user-user interactions, item-item interactions and user-item interactions. A dynamic attention mechanism is employed in the Graph Attention Networks (GAT) for modeling the KG. Combined with the nonlinear transformation and Momentum Contrast (Moco) strategy for contrastive learning, it can effectively extract high-quality feature information. Additionally, multi-level contrastive learning, as an auxiliary self-supervised task, is jointly trained with the primary supervised task, which further enhances recommendation performance. Experimental results on the MovieLens and Amazon-books datasets demonstrate that this framework effectively improves the performance of knowledge graph-based recommendations, addresses the issue of data sparsity, and outperforms other baseline models across multiple evaluation metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. Meta-learning for real-world class incremental learning: a transformer-based approach.
- Author
- Kumar, Sandeep, Sharma, Amit, Shokeen, Vikrant, Azar, Ahmad Taher, Amin, Syed Umar, and Khan, Zafar Iqbal
- Subjects
- NATURAL language processing, MACHINE learning, DEEP learning, CLASS size, MODERN languages
- Abstract
Modern natural language processing (NLP) state-of-the-art (SoTA) deep learning (DL) models have hundreds of millions of parameters, making them extremely complex. Large datasets are required for training these models, and while pretraining has reduced this requirement, human-labelled datasets are still necessary for fine-tuning. Few-shot learning (FSL) techniques, such as meta-learning, try to train models from smaller datasets to mitigate this cost. However, the tasks used to evaluate these meta-learners frequently diverge from the problems in the real world that they are meant to resolve. This work aims to apply meta-learning to a problem that is more pertinent to the real world: class incremental learning (IL). In this scenario, after completing its training, the model learns to classify newly introduced classes. One unique quality of meta-learners is that they can generalise from a small sample size to classes that have never been seen before, which makes them especially useful for class incremental learning (IL). The method describes how to emulate class IL using proxy new classes. This method allows a meta-learner to complete the task without the need for retraining. To generate predictions, the transformer-based aggregation function in a meta-learner that modifies data from examples across all classes has been proposed. The principal contributions of the model include concurrently considering the entire support and query sets, and prioritising attention to crucial samples, such as the question, to increase the significance of its impact during inference. The outcomes demonstrate that the model surpasses prevailing benchmarks in the industry. Notably, most meta-learners demonstrate significant generalisation in the context of class IL even without specific training for this task. 
This paper establishes a high-performing baseline for subsequent transformer-based aggregation techniques, thereby emphasising the practical significance of meta-learners in class IL. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Early detection of pediatric health risks using maternal and child health data
- Author
-
Cornelia Ilin
- Subjects
Medicine ,Science - Abstract
Abstract Machine learning (ML)-driven diagnosis systems are particularly relevant in pediatrics given the well-documented impact of early-life health conditions on later-life outcomes. Yet, early identification of diseases and their subsequent impact on length of hospital stay for this age group has so far remained uncharacterized, likely because access to relevant health data is severely limited. Thanks to a confidential data use agreement with the California Department of Health Care Access and Information, we introduce Ped-BERT: a state-of-the-art deep learning model that accurately predicts the likelihood of 100+ conditions and the length of stay in a pediatric patient’s next medical visit. We link mother-specific pre- and postnatal period health information to pediatric patient hospital discharge and emergency room visits. Our data set comprises 513.9K mother–baby pairs and contains medical diagnosis codes, length of stay, as well as temporal and spatial pediatric patient characteristics, such as age and residency zip code at the time of visit. Following the popular bidirectional encoder representations from the transformers (BERT) approach, we pre-train Ped-BERT via the masked language modeling objective to learn embedding features for the diagnosis codes contained in our data. We then continue to fine-tune our model to accurately predict primary diagnosis outcomes and length of stay for a pediatric patient’s next visit, given the history of previous visits and, optionally, the mother’s pre- and postnatal health information. We find that Ped-BERT generally outperforms contemporary and state-of-the-art classifiers when trained with minimum features. We also find that incorporating mother health attributes leads to significant improvements in model performance overall and across all patient subgroups in our data. 
Our most successful Ped-BERT model configuration achieves an area under the receiver operator curve (ROC AUC) of 0.927 and an average precision score (APS) of 0.408 for the diagnosis prediction task, and a ROC AUC of 0.855 and APS of 0.815 for the length of hospital stay task. Further, we examine Ped-BERT’s fairness by determining whether prediction errors are evenly distributed across various subgroups of mother–baby demographics and health characteristics, or if certain subgroups exhibit a higher susceptibility to prediction errors.
- Published
- 2024
- Full Text
- View/download PDF
20. Analyzing differences between discursive communities using dialectograms.
- Author
-
Enggaard, Thyge, Lohse, August, Axel Pedersen, Morten, and Lehmann, Sune
- Subjects
POLARIZATION (Social sciences) ,INTERNET forums ,POLITICAL participation ,VOCABULARY - Abstract
Word embeddings provide an unsupervised way to understand differences in word usage between discursive communities. A number of papers have focused on identifying words that are used differently by two or more communities. But word embeddings are complex, high-dimensional spaces and a focus on identifying differences only captures a fraction of their richness. Here, we take a step towards leveraging the richness of the full embedding space, by using word embeddings to map out how words are used differently. Specifically, we describe the construction of dialectograms, an unsupervised way to visually explore the characteristic ways in which each community uses a focal word. Based on these dialectograms, we provide a new measure of the degree to which words are used differently that overcomes the tendency for existing measures to pick out low-frequency or polysemous words. We apply our methods to explore the discourses of two US political subreddits and show how our methods identify stark affective polarisation of politicians and political entities, differences in the assessment of proper political action as well as disagreement about whether certain issues require political intervention at all. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Learning long sequences in spiking neural networks.
- Author
-
Stan, Matei-Ioan and Rhodes, Oliver
- Abstract
Spiking neural networks (SNNs) take inspiration from the brain to enable energy-efficient computations. Since the advent of Transformers, SNNs have struggled to compete with artificial networks on modern sequential tasks, as they inherit limitations from recurrent neural networks (RNNs), with the added challenge of training with non-differentiable binary spiking activations. However, a recent renewed interest in efficient alternatives to Transformers has given rise to state-of-the-art recurrent architectures named state space models (SSMs). This work systematically investigates, for the first time, the intersection of state-of-the-art SSMs with SNNs for long-range sequence modelling. Results suggest that SSM-based SNNs can outperform the Transformer on all tasks of a well-established long-range sequence modelling benchmark. It is also shown that SSM-based SNNs can outperform current state-of-the-art SNNs with fewer parameters on sequential image classification. Finally, a novel feature mixing layer is introduced, improving SNN accuracy while challenging assumptions about the role of binary activations in SNNs. This work paves the way for deploying powerful SSM-based architectures, such as large language models, to neuromorphic hardware for energy-efficient long-range sequence modelling. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Enhancing deep learning-based slope stability classification using a novel metaheuristic optimization algorithm for feature selection.
- Author
-
Zerouali, Bilel, Bailek, Nadjem, Tariq, Aqil, Kuriqi, Alban, Guermoui, Mawloud, Alharbi, Amal H., Khafaga, Doaa Sami, and El-kenawy, El-Sayed M.
- Abstract
The evaluation of slope stability is of crucial importance in geotechnical engineering and has significant implications for infrastructure safety, natural hazard mitigation, and environmental protection. This study aimed to identify the most influential factors affecting slope stability and evaluate the performance of various machine learning models for classifying slope stability. Through correlation analysis and feature importance evaluation using a random forest regressor, cohesion, unit weight, slope height, and friction angle were identified as the most critical parameters influencing slope stability. This research assessed the effectiveness of machine learning techniques combined with modern feature selection algorithms and conventional feature analysis methods. The performance of deep learning models, including recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and generative adversarial networks (GANs), in slope stability classification was evaluated. The GAN model demonstrated superior performance, achieving the highest overall accuracy of 0.913 and the highest area under the ROC curve (AUC) of 0.9285. Integration of the binary bGGO technique for feature selection with the GAN model led to significant improvements in classification performance, with the bGGO-GAN model showing enhanced sensitivity, positive predictive value, negative predictive value, and F1 score compared to the classical GAN model. The bGGO-GAN model achieved 95% accuracy on a substantial dataset of 627 samples, demonstrating competitive performance against other models in the literature while offering strong generalizability. This study highlights the potential of advanced machine learning techniques and feature selection methods for improving slope stability classification and provides valuable insights for geotechnical engineering applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Text classification models for assessing the completeness of randomized controlled trial publications based on CONSORT reporting guidelines.
- Author
-
Jiang, Lan, Lan, Mengfei, Menke, Joe D., Vorland, Colby J., and Kilicoglu, Halil
- Abstract
Complete and transparent reporting of randomized controlled trial publications (RCTs) is essential for assessing their credibility. We aimed to develop text classification models for determining whether RCT publications report CONSORT checklist items. Using a corpus annotated with 37 fine-grained CONSORT items, we trained sentence classification models (PubMedBERT fine-tuning, BioGPT fine-tuning, and in-context learning with GPT-4) and compared their performance. We assessed the impact of data augmentation methods (Easy Data Augmentation (EDA), UMLS-EDA, text generation and rephrasing with GPT-4) on model performance. We also fine-tuned section-specific PubMedBERT models (e.g., Methods) to evaluate whether they could improve performance compared to the single full model. We performed 5-fold cross-validation and report precision, recall, F1 score, and area under curve (AUC). Fine-tuned PubMedBERT model that uses the sentence along with the surrounding sentences and section headers yielded the best overall performance (sentence level: 0.71 micro-F1, 0.67 macro-F1; article-level: 0.90 micro-F1, 0.84 macro-F1). Data augmentation had limited positive effect. BioGPT fine-tuning and GPT-4 in-context learning exhibited suboptimal results. Methods-specific model improved recognition of methodology items, other section-specific models did not have significant impact. Most CONSORT checklist items can be recognized reasonably well with the fine-tuned PubMedBERT model but there is room for improvement. Improved models can underpin the journal editorial workflows and CONSORT adherence checks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. The Two Word Test as a semantic benchmark for large language models.
- Author
-
Riccardi, Nicholas, Yang, Xuan, and Desai, Rutvik H.
- Subjects
ARTIFICIAL intelligence ,LANGUAGE models ,GENERATIVE pre-trained transformers ,BINARY number system ,TERMS & phrases - Abstract
Large language models (LLMs) have shown remarkable abilities recently, including passing advanced professional exams and demanding benchmark tests. This performance has led many to suggest that they are close to achieving humanlike or "true" understanding of language, and even artificial general intelligence (AGI). Here, we provide a new open-source benchmark, the Two Word Test (TWT), that can assess semantic abilities of LLMs using two-word phrases in a task that can be performed relatively easily by humans without advanced training. Combining multiple words into a single concept is a fundamental linguistic and conceptual operation routinely performed by people. The test requires meaningfulness judgments of 1768 noun-noun combinations that have been rated as meaningful (e.g., baby boy) or as having low meaningfulness (e.g., goat sky) by human raters. This novel test differs from existing benchmarks that rely on logical reasoning, inference, puzzle-solving, or domain expertise. We provide versions of the task that probe meaningfulness ratings on a 0–4 scale as well as binary judgments. With both versions, we conducted a series of experiments using the TWT on GPT-4, GPT-3.5, Claude-3-Opus, and Gemini-1.0-Pro-001. Results demonstrated that, compared to humans, all models performed relatively poorly at rating meaningfulness of these phrases. GPT-3.5-turbo, Gemini-1.0-Pro-001 and GPT-4-turbo were also unable to make binary discriminations between sensible and nonsense phrases, with these models consistently judging nonsensical phrases as making sense. Claude-3-Opus made a substantial improvement in binary discrimination of combinatorial phrases but was still significantly worse than human performance. The TWT can be used to understand and assess the limitations of current LLMs, and potentially improve them. The test also reminds us that caution is warranted in attributing "true" or human-level understanding to LLMs based only on tests that are challenging for humans.
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Efficient diagnostic classification of diverse pathologies through contextual eye movement data analysis with a novel hybrid architecture.
- Author
-
El Hmimdi, Alae Eddine, Palpanas, Themis, and Kapoula, Zoi
- Abstract
The analysis of eye movements has proven valuable for understanding brain function and the neuropathology of various disorders. This research aims to utilize eye movement data analysis as a screening tool for differentiation between eight different groups of pathologies, including scholar, neurologic, and postural disorders. Leveraging a dataset from 20 clinical centers, all employing AIDEAL and REMOBI eye movement technologies, this study extends prior research by considering a multi-annotation setting, incorporating information from recordings from saccade and vergence eye movement tests, and using contextual information (e.g. target signals and latency of the eye movement relative to the target and confidence level of the quality of eye movement recording) to improve accuracy while reducing noise interference. Additionally, we introduce a novel hybrid architecture that combines the weight-sharing feature of convolution layers with the long-range capabilities of the transformer architecture to improve model efficiency and reduce the computation cost by a factor of 3.36, while still being competitive in terms of macro F1 score. Evaluated on two diverse datasets, our method demonstrates promising results, the most powerful discrimination being Attention & Neurologic disorder, with a macro F1 score of up to 78.8%. The results indicate the effectiveness of our approach in classifying eye movement data from different pathologies and different clinical centers accurately, thus enabling the creation of an assistant tool in the future. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. LLM-Twin: mini-giant model-driven beyond 5G digital twin networking framework with semantic secure communication and computation.
- Author
-
Hong, Yang, Wu, Jun, and Morello, Rosario
- Subjects
DIGITAL twins ,DIGITAL communications ,LANGUAGE models ,DIGITAL technology ,5G networks ,ELECTRONIC data processing ,CHAOS synchronization - Abstract
Beyond 5G networks provide solutions for next-generation communications; in particular, digital twin networks (DTNs) have gained increasing popularity for bridging physical and digital space. However, current DTNs pose some challenges, especially when applied to scenarios that require efficient and multimodal data processing. First, current DTNs are limited in communication and computational efficiency, since they must transmit large amounts of raw data collected from physical sensors and ensure model synchronization through high-frequency computation. Second, current models of DTNs are domain-specific (e.g. E-health), making it difficult to handle DT scenarios with multimodal data processing requirements. Finally, current security schemes for DTNs introduce additional overheads that impair efficiency. To address these challenges, we propose a large language model (LLM)-empowered DTN framework, LLM-Twin. First, based on LLM, we propose digital twin semantic networks (DTSNs), which enable more efficient communication and computation. Second, we design a mini-giant model collaboration scheme, which enables efficient deployment of LLM in DTNs and is adapted to handle multimodal data. Third, we design a native security policy for LLM-Twin without compromising efficiency. Numerical experiments and case studies demonstrate the feasibility of LLM-Twin. To our knowledge, this is the first work to propose LLM-based semantic-level DTNs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. PLMACPred prediction of anticancer peptides based on protein language model and wavelet denoising transformation.
- Author
-
Arif, Muhammad, Musleh, Saleh, Fida, Huma, and Alam, Tanvir
- Subjects
LANGUAGE models ,PROTEIN models ,DRUG discovery ,MICROBIAL peptides ,PEPTIDES ,MACHINE learning - Abstract
Anticancer peptides (ACPs) play a promising role in discovering anti-cancer drugs. Research on ACPs as therapeutic agents is growing owing to their minimal side effects. However, identifying novel ACPs using wet-lab experiments is generally time-consuming, labor-intensive, and expensive. Leveraging computational methods for fast and accurate prediction of ACPs would harness the drug discovery process. Herein, a machine learning-based predictor, called PLMACPred, is developed for identifying ACPs from peptide sequence only. PLMACPred adopted a set of encoding schemes representing evolutionary-property, composition-property, and protein language model (PLM), i.e., evolutionary scale modeling (ESM-2)- and ProtT5-based embedding to encode peptides. Then, two-dimensional (2D) wavelet denoising (WD) was employed to remove the noise from extracted features. Finally, an ensemble-based cascade deep forest (CDF) model was developed to identify ACPs. The PLMACPred model attained superior performance on all three benchmark datasets, namely, ACPmain, ACPAlter, and ACP740, over tenfold cross validation and independent dataset. PLMACPred outperformed the existing models and improved the prediction accuracy by 18.53%, 2.4%, 7.59% on ACPmain, ACPalter, ACP740 dataset, respectively. We showed that embedding from ProtT5 and ESM-2 was capable of capturing better contextual information from the entire sequence than the other encoding schemes for ACP prediction. For the explainability of the proposed model, the SHAP (SHapley Additive exPlanations) method was used to analyze the feature effect on the ACP prediction. A list of novel sequence motifs was proposed from the ACP sequence using MEME suites. We believe PLMACPred will help accelerate the discovery of novel ACPs as well as other activities of microbial peptides. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Enhancing handwritten text recognition accuracy with gated mechanisms.
- Author
-
Chinthaginjala, Ravikumar, Dhanamjayulu, C., Kim, Tai-hoon, Ahmed, Suhaib, Kim, Si-Yeong, Kumar, A. S., Annepu, Visalakshi, and Ahmad, Shafiq
- Subjects
TEXT recognition ,HANDWRITING recognition (Computer science) ,NATURAL language processing ,CONVOLUTIONAL neural networks ,PATTERN recognition systems ,RECURRENT neural networks ,HISTORICAL source material ,MATHEMATICAL optimization - Abstract
Handwritten Text Recognition (HTR) is a challenging task due to the complex structures and variations present in handwritten text. In recent years, the application of gated mechanisms, such as Long Short-Term Memory (LSTM) networks, has brought significant advancements to HTR systems. This paper presents an overview of HTR using a gated mechanism and highlights its novelty and advantages. The gated mechanism enables the model to capture long-term dependencies, retain relevant context, handle variable length sequences, mitigate error propagation, and adapt to contextual variations. The pipeline involves preprocessing the handwritten text images, extracting features, modeling the sequential dependencies using the gated mechanism, and decoding the output into readable text. The training process utilizes annotated datasets and optimization techniques to minimize transcription discrepancies. HTR using a gated mechanism has found applications in digitizing historical documents, automatic form processing, and real-time transcription. The results show improved accuracy and robustness compared to traditional HTR approaches. The advancements in HTR using a gated mechanism open up new possibilities for effectively recognizing and transcribing handwritten text in various domains. This research does a better job than the most recent iteration of the HTR system when compared to five different handwritten datasets (Washington, Saint Gall, RIMES, Bentham and IAM). Smartphones and robots are examples of low-cost computing devices that can benefit from this research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Prediction of protein secondary structure by the improved TCN-BiLSTM-MHA model with knowledge distillation.
- Author
-
Zhao, Lufei, Li, Jingyi, Zhan, Weiqiang, Jiang, Xuchu, and Zhang, Biao
- Subjects
PROTEIN structure prediction ,DEEP learning ,AMINO acid sequence ,PROTEIN structure ,DRUG development ,PROTEIN folding - Abstract
Secondary structure prediction is a key step in understanding protein function and biological properties and is highly important in the fields of new drug development, disease treatment, bioengineering, etc. Accurately predicting the secondary structure of proteins helps to reveal how proteins are folded and how they function in cells. The application of deep learning models in protein structure prediction is particularly important because of their ability to process complex sequence information and extract meaningful patterns and features, thus significantly improving the accuracy and efficiency of prediction. In this study, a combined model integrating an improved temporal convolutional network (TCN), bidirectional long short-term memory (BiLSTM), and a multi-head attention (MHA) mechanism is proposed to enhance the accuracy of protein prediction in both eight-state and three-state structures. One-hot encoding features and word vector representations of physicochemical properties are incorporated. A significant emphasis is placed on knowledge distillation techniques utilizing the ProtT5 pretrained model, leading to performance improvements. The improved TCN, achieved through multiscale fusion and bidirectional operations, allows for better extraction of amino acid sequence features than traditional TCN models. The model demonstrated excellent prediction performance on multiple datasets. For the TS115, CB513 and PDB (2018–2020) datasets, the prediction accuracy of the eight-state structure of the six datasets in this paper reached 88.2%, 84.9%, and 95.3%, respectively, and the prediction accuracy of the three-state structure reached 91.3%, 90.3%, and 96.8%, respectively. 
This study not only improves the accuracy of protein secondary structure prediction but also provides a valuable tool for understanding protein structure and function, one that is particularly applicable to resource-constrained contexts. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. MGLEP: Multimodal Graph Learning for Modeling Emerging Pandemics with Big Data.
- Author
-
Tran, Khanh-Tung, Hy, Truong Son, Jiang, Lili, and Vu, Xuan-Son
- Subjects
GRAPH neural networks ,BIG data ,LANGUAGE models ,PANDEMICS ,HEBBIAN memory - Abstract
Accurate forecasting and analysis of emerging pandemics play a crucial role in effective public health management and decision-making. Traditional approaches primarily rely on epidemiological data, overlooking other valuable sources of information that could act as sensors or indicators of pandemic patterns. In this paper, we propose a novel framework, MGLEP, that integrates temporal graph neural networks and multi-modal data for learning and forecasting. We incorporate big data sources, including social media content, by utilizing specific pre-trained language models and discovering the underlying graph structure among users. This integration provides rich indicators of pandemic dynamics through learning with temporal graph neural networks. Extensive experiments demonstrate the effectiveness of our framework in pandemic forecasting and analysis, outperforming baseline methods across different areas, pandemic situations, and prediction horizons. The fusion of temporal graph learning and multi-modal data enables a comprehensive understanding of the pandemic landscape with less time lag, cheap cost, and more potential information indicators. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Athena: Automated Tuning of k-mer based Genomic Error Correction Algorithms using Language Models.
- Author
-
Abdallah, Mustafa, Mahgoub, Ashraf, Ahmed, Hany, and Chaterji, Somali
- Subjects
ERROR correction (Information theory) ,ALGORITHMS ,NATURAL language processing ,RECURRENT neural networks ,NUCLEOTIDE sequencing - Abstract
The performance of most error-correction (EC) algorithms that operate on genomics reads is dependent on the proper choice of its configuration parameters, such as the value of k in k-mer based techniques. In this work, we target the problem of finding the best values of these configuration parameters to optimize error correction and consequently improve genome assembly. We perform this in an adaptive manner, adapted to different datasets and to EC tools, due to the observation that different configuration parameters are optimal for different datasets, i.e., from different platforms and species, and vary with the EC algorithm being applied. We use language modeling techniques from the Natural Language Processing (NLP) domain in our algorithmic suite, Athena, to automatically tune the performance-sensitive configuration parameters. Through the use of N-Gram and Recurrent Neural Network (RNN) language modeling, we validate the intuition that the EC performance can be computed quantitatively and efficiently using the "perplexity" metric, repurposed from NLP. After training the language model, we show that the perplexity metric calculated from a sample of the test (or production) data has a strong negative correlation with the quality of error correction of erroneous NGS reads. Therefore, we use the perplexity metric to guide a hill climbing-based search, converging toward the best configuration parameter value. Our approach is suitable for both de novo and comparative sequencing (resequencing), eliminating the need for a reference genome to serve as the ground truth. We find that Athena can automatically find the optimal value of k with a very high accuracy for 7 real datasets and using 3 different k-mer based EC algorithms, Lighter, Blue, and Racer. 
The inverse relation between the perplexity metric and alignment rate exists under all our tested conditions: for real and synthetic datasets, for all kinds of sequencing errors (insertion, deletion, and substitution), and for high and low error rates. The absolute value of that correlation is at least 73%. In our experiments, the best value of k found by Athena achieves an alignment rate within 0.53% of the oracle best value of k found through brute force searching (i.e., scanning through the entire range of k values). Athena's selected value of k lies within the top-3 best k values using N-Gram models and the top-5 best k values using RNN models. With best parameter selection by Athena, the assembly quality (NG50) is improved by a Geometric Mean of 4.72X across the 7 real datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
32. OpenMedLM: prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models.
- Author
-
Maharjan, Jenish, Garikipati, Anurag, Singh, Navan Preet, Cyrus, Leo, Sharma, Mayank, Ciobanu, Madalina, Barnes, Gina, Thapa, Rahul, Mao, Qingqing, and Das, Ritankar
- Subjects
LANGUAGE models ,CLINICAL decision support systems ,ENGINEERING ,FEATURE selection ,ARTIFICIAL intelligence - Abstract
LLMs can accomplish specialized medical knowledge tasks; however, equitable access is hindered by extensive fine-tuning, specialized medical data requirements, and limited access to proprietary models. Open-source (OS) medical LLMs show performance improvements and provide the transparency and compliance required in healthcare. We present OpenMedLM, a prompting platform delivering state-of-the-art (SOTA) performance for OS LLMs on medical benchmarks. We evaluated OS foundation LLMs (7B-70B) on medical benchmarks (MedQA, MedMCQA, PubMedQA, MMLU medical-subset) and selected Yi34B for developing OpenMedLM. Prompting strategies included zero-shot, few-shot, chain-of-thought, and ensemble/self-consistency voting. OpenMedLM delivered OS SOTA results on three medical LLM benchmarks, surpassing previous best-performing OS models that leveraged costly and extensive fine-tuning. OpenMedLM displays the first results to date demonstrating the ability of OS foundation models to optimize performance, absent specialized fine-tuning. The model achieved 72.6% accuracy on MedQA, outperforming the previous SOTA by 2.4%, and 81.7% accuracy on MMLU medical-subset, establishing itself as the first OS LLM to surpass 80% accuracy on this benchmark. Our results highlight medical-specific emergent properties in OS LLMs not documented elsewhere to date and validate the ability of OS models to accomplish healthcare tasks, highlighting the benefits of prompt engineering to improve performance of accessible LLMs for medical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Multilingual end-to-end ASR for low-resource Turkic languages with common alphabets.
- Author
-
Bekarystankyzy, Akbayan, Mamyrbayev, Orken, Mendes, Mateus, Fazylzhanova, Anar, and Assam, Muhammad
- Subjects
TURKIC languages ,MACHINE learning ,LANGUAGE models ,UNIVERSAL language ,COGNATE words - Abstract
To obtain a reliable and accurate automatic speech recognition (ASR) machine learning model, it is necessary to have sufficient audio data transcribed for training. Many languages in the world, especially the agglutinative languages of the Turkic family, suffer from a lack of this type of data. Many studies have been conducted in order to obtain better models for low-resource languages, using different approaches. The most popular approaches include multilingual training and transfer learning. In this study, we combined five agglutinative languages from the Turkic family (Kazakh, Bashkir, Kyrgyz, Sakha, and Tatar) in order to provide multilingual training using connectionist temporal classification and an attention mechanism including a language model, because these languages share cognate words, sentence formation rules, and an alphabet (Cyrillic). Data from the open-source database Common Voice was used for the study, to make the experiments reproducible. The results of the experiments showed that multilingual training could improve ASR performance for all languages included in the experiment, except Bashkir. A dramatic result was achieved for the Kyrgyz language: word error rate decreased to nearly one-fifth and character error rate decreased to one-fourth, which proves that this approach can be helpful for critically low-resource languages. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Recurrent quantum embedding neural network and its application in vulnerability detection.
- Author
-
Song, Zhihui, Zhou, Xin, Xu, Jinchen, Ding, Xiaodong, and Shan, Zheng
- Abstract
In recent years, deep learning has been widely used in vulnerability detection with remarkable results. These studies often apply natural language processing (NLP) technologies due to the natural similarity between code and language. Since NLP usually consumes a lot of computing resources, its combination with quantum computing is becoming a valuable research direction. In this paper, we present a Recurrent Quantum Embedding Neural Network (RQENN) for vulnerability detection. It aims to reduce the memory consumption of classical models for vulnerability detection tasks and improve the performance of quantum natural language processing (QNLP) methods. We show that the performance of RQENN achieves the above goals. Compared with the classic model, the space complexity of each stage of its execution is exponentially reduced, and the number of parameters used and the number of bits consumed are significantly reduced. Compared with other QNLP methods, RQENN uses fewer qubit resources and achieves a 15.7% higher accuracy in vulnerability detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Towards knowledge-infused automated disease diagnosis assistant.
- Author
-
Tomar, Mohit, Tiwari, Abhisek, and Saha, Sriparna
- Subjects
ARTIFICIAL neural networks ,DIAGNOSIS ,KNOWLEDGE graphs ,SYMPTOMS ,TRANSFORMER models ,VIRTUAL networks ,NEURAL codes ,MEDICAL telematics - Abstract
With the advancement of internet communication and telemedicine, people are increasingly turning to the web for various healthcare activities. With an ever-increasing number of diseases and symptoms, diagnosing patients becomes challenging. In this work, we build a diagnosis assistant to assist doctors, which identifies diseases based on patient–doctor interaction. During diagnosis, doctors utilize both symptomatology knowledge and diagnostic experience to identify diseases accurately and efficiently. Inspired by this, we investigate the role of medical knowledge in disease diagnosis through doctor–patient interaction. We propose a two-channel, knowledge-infused, discourse-aware disease diagnosis model (KI-DDI), where the first channel encodes patient–doctor communication using a transformer-based encoder, while the other creates an embedding of symptom-disease using a graph attention network (GAT). In the next stage, the conversation and knowledge graph embeddings are infused together and fed to a deep neural network for disease identification. Furthermore, we first develop an empathetic conversational medical corpus comprising conversations between patients and doctors, annotated with intent and symptoms information. The proposed model demonstrates a significant improvement over the existing state-of-the-art models, establishing the crucial roles of (a) a doctor's effort for additional symptom extraction (in addition to patient self-report) and (b) infusing medical knowledge in identifying diseases effectively. Many times, patients also show their medical conditions, which acts as crucial evidence in diagnosis. Therefore, integrating visual sensory information would represent an effective avenue for enhancing the capabilities of diagnostic assistants. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. A novel framework based on explainable AI and genetic algorithms for designing neurological medicines.
- Author
-
Singh, Vishakha, Singh, Sanjay Kumar, and Sharma, Ritesh
- Subjects
ARTIFICIAL intelligence ,DEEP learning ,INDUSTRY 4.0 ,GENETIC algorithms ,AMINO acid sequence ,NEUROPEPTIDES ,BIOPHARMACEUTICS - Abstract
The advent of the fourth industrial revolution, characterized by artificial intelligence (AI) as its central component, has resulted in the mechanization of numerous previously labor-intensive activities. The use of in silico tools has become prevalent in the design of biopharmaceuticals. Upon conducting a comprehensive analysis of the genomes of many organisms, it has been discovered that their tissues can generate specific peptides that confer protection against certain diseases. This study aims to identify a selected group of neuropeptides (NPs) possessing favorable characteristics that render them ideal for production as neurological biopharmaceuticals. Until now, the construction of NP classifiers has been the primary focus, neglecting to optimize these characteristics. Therefore, in this study, the task of creating ideal NPs has been formulated as a multi-objective optimization problem. The proposed framework, NPpred, comprises two distinct components: NSGA-NeuroPred and BERT-NeuroPred. The former employs the NSGA-II algorithm to explore and change a population of NPs, while the latter is an interpretable deep learning-based model. The utilization of explainable AI and motifs has led to the proposal of two novel operators, namely p-crossover and p-mutation. An online application has been deployed at https://neuropred.anvil.app for designing an ideal collection of synthesizable NPs from protein sequences. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Clustering swap prediction for image-text pre-training.
- Author
-
Fayou, Sun, Ngo, Hea Choon, Sek, Yong Wee, and Meng, Zuqiang
- Subjects
OPEN clusters of stars ,FORECASTING ,LEARNING strategies - Abstract
It is essential to delve into the strategy of multimodal model pre-training, which has an obvious impact on downstream tasks. Currently, clustering learning has achieved noteworthy benefits in multiple methods. However, due to the availability of open image-text pairs, it is challenging for multimodal with clustering learning. In this paper, we propose an approach that utilizes a clustering swap prediction strategy to learn image-text clustering embedding space by interaction prediction between image and text features. Unlike existing models with clustering learning, our method (Clus) allows for an open number of clusters for web-scale alt-text data. Furthermore, in order to train the image and text encoders efficiently, we introduce a distillation learning approach and evaluate the performance of the image-encoder in downstream visual tasks. In addition, Clus is pre-trained end-to-end by using large-scale image-text pairs. Specifically, both text and image serve as ground truth for swap prediction, enabling effective representation learning. Concurrently, extensive experiments demonstrate that Clus achieves state-of-the-art performance on multiple downstream fine-tuning and zero-shot tasks (i.e., Image-Text Retrieval, VQA, NLVR2, Image Captioning, Object Detection, and Semantic Segmentation). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Attributions toward artificial agents in a modified Moral Turing Test.
- Author
-
Aharoni, Eyal, Fernandes, Sharlene, Brady, Daniel J., Alexander, Caelan, Criner, Michael, Queen, Kara, Rando, Javier, Nahmias, Eddy, and Crespo, Victor
- Subjects
TURING test ,LANGUAGE models ,MORAL agent (Philosophy) ,ARTIFICIAL intelligence ,MORAL reasoning ,SPATIAL ability - Abstract
Advances in artificial intelligence (AI) raise important questions about whether people view moral evaluations by AI systems similarly to human-generated moral evaluations. We conducted a modified Moral Turing Test (m-MTT), inspired by the proposal of Allen et al. (Exp Theor Artif Intell 352:24–28, 2004), by asking people to distinguish real human moral evaluations from those made by a popular advanced AI language model: GPT-4. A representative sample of 299 U.S. adults first rated the quality of moral evaluations when blinded to their source. Remarkably, they rated the AI's moral reasoning as superior in quality to humans' along almost all dimensions, including virtuousness, intelligence, and trustworthiness, consistent with passing what Allen and colleagues call the comparative MTT. Next, when tasked with identifying the source of each evaluation (human or computer), people performed significantly above chance levels. Although the AI did not pass this test, this was not because of its inferior moral reasoning but, potentially, its perceived superiority, among other possible explanations. The emergence of language models capable of producing moral responses perceived as superior in quality to humans' raises concerns that people may uncritically accept potentially harmful moral guidance from AI. This possibility highlights the need for safeguards around generative language models in matters of morality. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Deep learning-aided 3D proxy-bridged region-growing framework for multi-organ segmentation.
- Author
-
Chen, Zhihong, Yao, Lisha, Liu, Yue, Han, Xiaorui, Gong, Zhengze, Luo, Jichao, Zhao, Jietong, and Fang, Gang
- Subjects
COMPUTER-aided diagnosis ,DEEP learning ,THREE-dimensional imaging ,COMPUTED tomography ,SEED technology ,SPLEEN - Abstract
Accurate multi-organ segmentation in 3D CT images is imperative for enhancing computer-aided diagnosis and radiotherapy planning. However, current deep learning-based methods for 3D multi-organ segmentation face challenges such as the need for labor-intensive manual pixel-level annotations and high hardware resource demands, especially regarding GPU resources. To address these issues, we propose a 3D proxy-bridged region-growing framework specifically designed for the segmentation of the liver and spleen. Specifically, a key slice is selected from each 3D volume according to the corresponding intensity histogram. Subsequently, a deep learning model is employed to pinpoint the semantic central patch on this key slice, to calculate the growing seed. To counteract the impact of noise, segmentation of the liver and spleen is conducted on superpixel images created through proxy-bridging strategy. The segmentation process is then extended to adjacent slices by applying the same methodology iteratively, culminating in the comprehensive segmentation results. Experimental results demonstrate that the proposed framework accomplishes segmentation of the liver and spleen with an average Dice Similarity Coefficient of approximately 0.93 and a Jaccard Similarity Coefficient of around 0.88. These outcomes substantiate the framework's capability to achieve performance on par with that of deep learning methods, albeit requiring less guidance information and lower GPU resources. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Predicting quantum emitter fluctuations with time-series forecasting models.
- Author
-
Ramezani, Fereshteh, Strasbourg, Matthew, Parvez, Sheikh, Saxena, Ravindra, Jariwala, Deep, Borys, Nicholas J., and Whitaker, Bradley M.
- Subjects
QUANTUM fluctuations ,LIGHT emitting diodes ,DEEP learning ,SEMICONDUCTOR lasers ,OPTICAL modulators ,QUANTUM computing ,POLARIZED photons - Abstract
2D materials have important fundamental properties allowing for their use in many potential applications, including quantum computing. Various Van der Waals materials, including Tungsten disulfide (WS2), have been employed to showcase attractive device applications such as light emitting diodes, lasers and optical modulators. To maximize the utility and value of integrated quantum photonics, the wavelength, polarization and intensity of the photons from a quantum emission (QE) must be stable. However, random variation of emission energy, caused by the inhomogeneity in the local environment, is a major challenge for all solid-state single photon emitters. In this work, we assess the random nature of the quantum fluctuations, and we present time series forecasting deep learning models to analyse and predict QE fluctuations for the first time. Our trained models can roughly follow the actual trend of the data and, under certain data processing conditions, can predict peaks and dips of the fluctuations. The ability to anticipate these fluctuations will allow physicists to harness quantum fluctuation characteristics to develop novel scientific advances in quantum computing that will greatly benefit quantum technologies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Defense against adversarial attacks: robust and efficient compressed optimized neural networks.
- Author
-
Kraidia, Insaf, Ghenai, Afifa, and Belhaouari, Samir Brahim
- Abstract
In the ongoing battle against adversarial attacks, adopting a suitable strategy to enhance model efficiency, bolster resistance to adversarial threats, and ensure practical deployment is crucial. To achieve this goal, a novel four-component methodology is introduced. First, introducing a pioneering batch-cumulative approach, the exponential particle swarm optimization (ExPSO) algorithm was developed for meticulous parameter fine-tuning within each batch. A cumulative updating loss function was employed for overall optimization, demonstrating remarkable superiority over traditional optimization techniques. Second, weight compression is applied to streamline the deep neural network (DNN) parameters, boosting the storage efficiency and accelerating inference. It also introduces complexity to deter potential attackers, enhancing model accuracy in adversarial settings. This study compresses the generative pre-trained transformer (GPT) by 65%, saving time and memory without causing performance loss. Compared to state-of-the-art methods, the proposed method achieves the lowest perplexity (14.28), the highest accuracy (93.72%), and an 8 × speedup in the central processing unit. The integration of the preceding two components involves the simultaneous training of multiple versions of the compressed GPT. This training occurs across various compression rates and different segments of a dataset and is ultimately associated with a novel multi-expert architecture. This enhancement significantly fortifies the model's resistance to adversarial attacks by introducing complexity into attackers' attempts to anticipate the model's prediction integration process. Consequently, this leads to a remarkable average performance improvement of 25% across 14 different attack scenarios and various datasets, surpassing the capabilities of current state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Application of the transformer model algorithm in chinese word sense disambiguation: a case study in chinese language.
- Author
-
Li, Linlin, Li, Juxing, Wang, Hongli, and Nie, Jianing
- Subjects
DEEP learning ,TRANSFORMER models ,CHINESE language ,MEAN square algorithms ,STANDARD deviations - Abstract
This study aims to explore the research methodology of applying the Transformer model algorithm to Chinese word sense disambiguation, seeking to resolve word sense ambiguity in the Chinese language. The study introduces deep learning and designs a Chinese word sense disambiguation model based on the fusion of the Transformer with the Bi-directional Long Short-Term Memory (BiLSTM) algorithm. By utilizing the self-attention mechanism of Transformer and the sequence modeling capability of BiLSTM, this model efficiently captures semantic information and context relationships in Chinese sentences, leading to accurate word sense disambiguation. The model's evaluation is conducted using the PKU Paraphrase Bank, a Chinese text paraphrase dataset. The results demonstrate that the model achieves a precision rate of 83.71% in Chinese word sense disambiguation, significantly outperforming the Long Short-Term Memory algorithm. Additionally, the root mean squared error of this algorithm is less than 17, with a loss function value remaining around 0.14. Thus, this study validates that the constructed Transformer-fused BiLSTM-based Chinese word sense disambiguation model algorithm exhibits both high accuracy and robustness in identifying word senses in the Chinese language. The findings of this study provide valuable insights for advancing the intelligent development of word senses in Chinese language applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Foundation models assist in human-robot collaboration assembly.
- Author
-
Ji Y, Zhang Z, Tang D, Zheng Y, Liu C, Zhao Z, and Li X
- Subjects
- Humans, Man-Machine Systems, Models, Theoretical, Robotics methods
- Abstract
Human-robot collaboration (HRC) is a novel manufacturing paradigm designed to fully leverage the advantages of humans and robots, efficiently and flexibly accomplishing customized manufacturing tasks. However, existing HRC systems lack the transfer and generalization capability for environment perception and task reasoning. These limitations manifest in two ways: (1) current methods rely on specialized models to perceive scenes and need the model to be retrained when facing unseen objects; (2) current methods only address predefined tasks and cannot support undefined task reasoning. To avoid these limitations, this paper proposes a novel HRC approach based on Foundation Models (FMs), including Large Language Models (LLMs) and Vision Foundation Models (VFMs). Specifically, an LLM-based task reasoning method is introduced, utilizing prompt learning to transfer LLMs into the domain of HRC tasks, supporting undefined task reasoning. A VFM-based scene semantic perception method is proposed, integrating various VFMs to achieve scene perception without training. Finally, an FM-based HRC system is developed, comprising perception, reasoning, and execution modules for more flexible and generalized HRC. The superior performance of FMs in perception and reasoning is demonstrated by extensive experiments. Furthermore, the feasibility and effectiveness of the FM-based HRC system are validated through a part assembly case involving a satellite component model. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
44. Scalable incident detection via natural language processing and probabilistic language models.
- Author
-
Walsh CG, Wilimitis D, Chen Q, Wright A, Kolli J, Robinson K, Ripperger MA, Johnson KB, Carrell D, Desai RJ, Mosholder A, Dharmarajan S, Adimadhyam S, Fabbri D, Stojanovic D, Matheny ME, and Bejan CA
- Subjects
- Humans, Models, Statistical, Female, Male, Suicide, Attempted, Adult, Middle Aged, Natural Language Processing, Electronic Health Records
- Abstract
Post-marketing safety surveillance depends in part on the ability to detect concerning clinical events at scale. Spontaneous reporting might be an effective component of safety surveillance, but it requires awareness and understanding among healthcare professionals to achieve its potential. Reliance on readily available structured data such as diagnostic codes risks under-coding and imprecision. Clinical textual data might bridge these gaps, and natural language processing (NLP) has been shown to aid in scalable phenotyping across healthcare records in multiple clinical domains. In this study, we developed and validated a novel incident phenotyping approach using unstructured clinical textual data agnostic to Electronic Health Record (EHR) and note type. It is based on a published, validated approach (PheRe) used to ascertain social determinants of health and suicidality across entire healthcare records. To demonstrate generalizability, we validated this approach on two separate phenotypes that share common challenges with respect to accurate ascertainment: (1) suicide attempt; (2) sleep-related behaviors. With samples of 89,428 records and 35,863 records for suicide attempt and sleep-related behaviors, respectively, we conducted silver standard (diagnostic coding) and gold standard (manual chart review) validation. We showed Area Under the Precision-Recall Curve (AUPR) of ~0.77 (95% CI 0.75-0.78) for suicide attempt and AUPR of ~0.31 (95% CI 0.28-0.34) for sleep-related behaviors. We also evaluated performance by coded race and demonstrated that differences in performance by race differed across phenotypes. Scalable phenotyping models, like most healthcare AI, require algorithmovigilance and debiasing prior to implementation. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
45. Revolutionizing core muscle analysis in female sexual dysfunction based on machine learning.
- Author
-
Abdel Hady, Doaa A. and Abd El-Hafeez, Tarek
- Subjects
MACHINE learning ,CONVOLUTIONAL neural networks ,SEXUAL dysfunction ,RECURRENT neural networks ,DEEP learning - Abstract
The purpose of this study is to investigate the role of core muscles in female sexual dysfunction (FSD) and develop comprehensive rehabilitation programs to address this issue. We aim to answer the following research questions: what are the roles of core muscles in FSD, and how can machine and deep learning models accurately predict changes in core muscles during FSD? FSD is a common condition that affects women of all ages, characterized by symptoms such as decreased libido, difficulty achieving orgasm, and pain during intercourse. We conducted a comprehensive analysis of changes in core muscles during FSD using machine and deep learning. We evaluated the performance of multiple models, including multi-layer perceptron (MLP), long short-term memory (LSTM), convolutional neural network (CNN), recurrent neural network (RNN), ElasticNetCV, random forest regressor, SVR, and Bagging regressor. The models were evaluated based on mean squared error (MSE), mean absolute error (MAE), and R-squared (R2) score. Our results show that CNN and random forest regressor are the most accurate models for predicting changes in core muscles during FSD. CNN achieved the lowest MSE (0.002) and the highest R2 score (0.988), while random forest regressor also performed well with an MSE of 0.0021 and an R2 score of 0.9905. Our study demonstrates that machine and deep learning models can accurately predict changes in core muscles during FSD. The neglected core muscles play a significant role in FSD, highlighting the need for comprehensive rehabilitation programs that address these muscles. By developing these programs, we can improve the quality of life for women with FSD and help them achieve optimal sexual health. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. Joint extraction model of entity relations based on decomposition strategy.
- Author
-
Li, Ran, La, kaijun, Lei, Jingsheng, Huang, Liya, Ouyang, Jing, Shu, Yu, and Yang, Shengying
- Subjects
NATURAL language processing ,IDENTIFICATION ,CLASSIFICATION ,SHARING - Abstract
Named entity recognition and relation extraction are two important fundamental tasks in natural language processing. The joint entity-relationship extraction model based on parameter sharing can effectively reduce the impact of cascading errors on model performance by performing joint learning of entities and relationships in a single model, but it still cannot essentially get rid of the influence of pipeline models and suffers from entity information redundancy and an inability to recognize overlapping entities. To this end, we propose a joint extraction model based on a decomposition strategy with a pointer mechanism. The joint extraction task is divided into two parts. First, the head entity is identified, utilizing the positive gain effect of the head entity on tail entity identification. Then, a hierarchical model is used to improve the accuracy of tail entity and relationship identification. Meanwhile, we introduce a pointer model to obtain the joint features of entity boundaries and relationship types to achieve boundary-aware classification. The experimental results show that the model achieves better results on both the NYT and WebNLG datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. A ResNet-LSTM hybrid model for predicting epileptic seizures using a pretrained model with supervised contrastive learning.
- Author
-
Lee, Dohyun, Kim, Byunghyun, Kim, Taejoon, Joe, Inwhee, Chong, Jongwha, Min, Kyeongyuk, and Jung, Kiyoung
- Subjects
EPILEPSY ,BLENDED learning ,PUBLIC hospitals ,FOURIER transforms ,SUPERVISED learning ,UNIVERSITY hospitals - Abstract
In this paper, we propose a method for predicting epileptic seizures using a pre-trained model utilizing supervised contrastive learning and a hybrid model combining residual networks (ResNet) and long short-term memory (LSTM). The proposed training approach encompasses three key phases: pre-processing, pre-training as a pretext task, and training as a downstream task. In the pre-processing phase, the data is transformed into a spectrogram image using short time Fourier transform (STFT), which extracts both time and frequency information. This step compensates for the inherent complexity and irregularity of electroencephalography (EEG) data, which often hampers effective data analysis. During the pre-training phase, augmented data is generated from the original dataset using techniques such as band-stop filtering and temporal cutout. Subsequently, a ResNet model is pre-trained alongside a supervised contrastive loss model, learning the representation of the spectrogram image. In the training phase, a hybrid model is constructed by combining ResNet, initialized with weight values from the pre-trained model, and LSTM. This hybrid model extracts image features and time information to enhance prediction accuracy. The proposed method's effectiveness is validated using datasets from CHB-MIT and Seoul National University Hospital (SNUH). The method's generalization ability is confirmed through Leave-one-out cross-validation. From the experimental results measuring accuracy, sensitivity, and false positive rate (FPR), CHB-MIT was 91.90%, 89.64%, 0.058 and SNUH was 83.37%, 79.89%, and 0.131. The experimental results demonstrate that the proposed method outperforms the conventional methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. Graph autoencoder with mirror temporal convolutional networks for traffic anomaly detection.
- Author
-
Ren, Zhiyu, Li, Xiaojie, Peng, Jing, Chen, Ken, Tan, Qushan, Wu, Xi, and Shi, Canghong
- Subjects
CONVOLUTIONAL neural networks ,ANOMALY detection (Computer security) ,TRAFFIC monitoring ,DEEP learning ,GAUSSIAN measures ,FEATURE extraction ,KERNEL functions - Abstract
Traffic time series anomaly detection has been intensively studied for years because of its potential applications in intelligent transportation. However, classical traffic anomaly detection methods often overlook the evolving dynamic associations between road network nodes, which leads to challenges in capturing the long-term temporal correlations, spatial characteristics, and abnormal node behaviors in datasets with high periodicity and trends, such as morning peak travel periods. In this paper, we propose a mirror temporal graph autoencoder (MTGAE) framework to explore anomalies and capture unseen nodes and the spatiotemporal correlation between nodes in the traffic network. Specifically, we propose the mirror temporal convolutional module to enhance feature extraction capabilities and capture hidden node-to-node features in the traffic network. Moreover, we propose the graph convolutional gate recurrent unit cell (GCGRU CELL) module. This module uses Gaussian kernel functions to map data into a high-dimensional space, and enables the identification of anomalous information and potential anomalies within the complex interdependencies of the traffic network, based on prior knowledge and input data. We compared our work with several other advanced deep-learning anomaly detection models. Experimental results on the NYC dataset illustrate that our model works best compared to other models for traffic anomaly detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. Non-contrast ultrasound image analysis for spatial and temporal distribution of blood flow after spinal cord injury.
- Author
-
Routkevitch, Denis, Soulé, Zoe, Kats, Nicholas, Baca, Emily, Hersh, Andrew M., Kempski-Leadingham, Kelley M., Menta, Arjun K., Bhimreddy, Meghana, Jiang, Kelly, Davidar, A. Daniel, Smit, Constantin, Theodore, Nicholas, Thakor, Nitish V., and Manbachi, Amir
- Subjects
BLOOD flow ,CONTRAST-enhanced ultrasound ,SPINAL cord injuries ,IMAGE analysis ,ULTRASONIC imaging ,CONTRAST media - Abstract
Ultrasound technology can provide high-resolution imaging of blood flow following spinal cord injury (SCI). Blood flow imaging may improve critical care management of SCI, yet its duration is limited clinically by the amount of contrast agent injection required for high-resolution, continuous monitoring. In this study, we aim to establish non-contrast ultrasound as a clinically translatable imaging technique for spinal cord blood flow via comparison to contrast-based methods and by measuring the spatial distribution of blood flow after SCI. A rodent model of contusion SCI at the T12 spinal level was carried out using three different impact forces. We compared images of spinal cord blood flow taken using both non-contrast and contrast-enhanced ultrasound. Subsequently, we processed the images as a function of distance from injury, yielding the distribution of blood flow through space after SCI, and found the following. (1) Both non-contrast and contrast-enhanced imaging methods resulted in similar blood flow distributions (Spearman's ρ = 0.55, p < 0.0001). (2) We found an area of decreased flow at the injury epicenter, or umbra (p < 0.0001). Unexpectedly, we found increased flow at the periphery, or penumbra (rostral, p < 0.05; caudal, p < 0.01), following SCI. However, distal flow remained unchanged, in what is presumably unaffected tissue. (3) Finally, tracking blood flow in the injury zones over time revealed interesting dynamic changes. After an initial decrease, blood flow in the penumbra increased during the first 10 min after injury, while blood flow in the umbra and distal tissue remained constant over time. These results demonstrate the viability of non-contrast ultrasound as a clinical monitoring tool. Furthermore, our surprising observations of increased flow in the injury periphery pose interesting new questions about how the spinal cord vasculature reacts to SCI, with potentially increased significance of the penumbra. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
50. Linguistic disparities in cross-language automatic speech recognition transfer from Arabic to Tashlhiyt.
- Author
-
Zellou, Georgia and Lahrouchi, Mohamed
- Subjects
AUTOMATIC speech recognition ,SPEECH perception ,SPEECH ,PROSODIC analysis (Linguistics) ,ERROR rates - Abstract
Tashlhiyt is a low-resource language with respect to acoustic databases, language corpora, and speech technology tools, such as Automatic Speech Recognition (ASR) systems. This study investigates whether a method of cross-language re-use of ASR is viable for Tashlhiyt from an existing commercially-available system built for Arabic. The source and target language in this case have similar phonological inventories, but Tashlhiyt permits typologically rare phonological patterns, including vowelless words, while Arabic does not. We find systematic disparities in ASR transfer performance (measured as word error rate (WER) and Levenshtein distance) for Tashlhiyt across word forms and speaking style variation. Overall, performance was worse for casual speaking modes across the board. In clear speech, performance was lower for vowelless than for voweled words. These results highlight systematic speaking mode- and phonotactic-disparities in cross-language ASR transfer. They also indicate that linguistically-informed approaches to ASR re-use can provide more effective ways to adapt existing speech technology tools for low resource languages, especially when they contain typologically rare structures. The study also speaks to issues of linguistic disparities in ASR and speech technology more broadly. It can also contribute to understanding the extent to which machines are similar to, or different from, humans in mapping the acoustic signal to discrete linguistic representations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF