Descriptor: "T58.5-58.64" / Topic: electronic computers. computer science - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"T58.5-58.64"' showing total 8,859 results

1. Leveraging Deep Learning and Multimodal Large Language Models for Near-Miss Detection Using Crowdsourced Videos

Author: Shadi Jaradat, Mohammed Elhenawy, Huthaifa I. Ashqar, Alexander Paz, and Richi Nayak
Subjects: Near-miss detection, crowdsource, proactive approach, convolutional neural network (CNN), vision transformer, LSTM, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Near-miss traffic incidents, positioned just above "unsafe acts" on the safety triangle theory, offer crucial predictive insights for preventing crashes. However, these incidents are often underrepresented in traffic safety research, which tends to focus primarily on actual crashes. This study introduces a novel AI-based framework designed to detect and analyze near-miss and crash events in crowdsourced dashcam footage. The framework consists of two key components: a deep learning model to segment video streams and identify potential near-miss or crash incidents and a multimodal large language model (MLLM) to further analyze and extract narrative information from the identified events. We evaluated three deep learning models—CNN, Vision Transformers (ViTs), and CNN+LSTM—on a dataset specifically curated for three-class classification (crashes, near-misses, and normal driving events). CNN achieved the highest accuracy (90%) and F1-score (89%) at the frame level. At the event level, ViTs delivered a strong performance with a test accuracy of 77.27% and an F1-score of 67.37%, while CNN+LSTM, although lower in overall performance, demonstrated significant potential with a test accuracy of 78.1% and an F1-score of 68.69%. For a deeper analysis, we applied GPT-4o to process critical safety events (near-misses and crashes), utilizing both zero-shot and few-shot learning for narrative generation and feature extraction. The zero-shot learning method performed better, achieving an accuracy of 81.2% and an F1-score of 81.9%. This study underscores the potential of combining deep learning with MLLMs to enhance traffic safety analysis by integrating near-miss data as a key predictive layer. Our approach highlights the importance of leveraging near-miss incidents to proactively enhance road safety, thereby reducing the likelihood of crashes through early intervention and better event understanding.
Published: 2025
Full Text: View/download PDF

2. Exploring sentiment analysis in handwritten and E-text documents using advanced machine learning techniques: a novel approach

Author: Rayees Ahamad and Kamta Nath Mishra
Subjects: Convolutional Neural Network (CNN), Deep Learning, Emotion Type detection, Handwriting Image to E-Text Conversion (HTC), Intelligent Techniques, Machine Learning, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Abstract Traditionally, many people still wish to write on pen and paper. However, it has some drawbacks like accessing and storing physical documents efficiently, searching through them, and sharing them efficiently. Handwriting-to-text recognition classifies an individual’s handwriting and converts it into digital form. However, Handwriting Image to E-Text Conversion (HTC) removes all of the mentioned problems as it is easier to store, retrieve, and use the text as and when required. Emotions are a basic and very important aspect of anyone’s life. To understand this important aspect of an individual’s life, we have to detect emotions using affect data like text, voice, images, etc. This research work investigates the application of machine learning and deep learning methods in performing sentiment analysis on both handwritten and E-text statements. The primary objective of this research work is to distinguish the sentiment polarity and categorize it as positive, negative, or neutral while identifying emotion types such as happiness, sadness, surprise, fear, anger, disgust, and contempt. The study employs sophisticated methodologies to analyze handwritten image documents and E-text statements to provide a comprehensive understanding of sentiment nuances in diverse forms of communication. The authors proposed the Exploration of Sentiment Insights in Handwritten and E-text through Advanced Machine Learning (ESIHE_AML) algorithms-based model that finds the sentiment polarity and emotion types of handwritten text as well as E-Text. The results of the proposed model are described using various machine learning and deep learning-based approaches. Further, it significantly contributes to the advancements in sentiment analysis techniques and offers valuable insights into the emotional content present in both traditional and handwritten text formats. 
The proposed model achieves more than 90% accuracy in all cases on standard benchmark datasets, including Twitter, Kaggle, IAM, and Amazon reviews. This accuracy may be increased further by employing a hybrid approach of intelligent algorithms. This study highlights the adaptability of the ESIHE_AML algorithms-based model for analyzing sentiments in digital communication systems of the modern era.
Published: 2025
Full Text: View/download PDF

3. A new dimensionality reduction technique based on the Wavelet Transform for cancer classification

Author: Lisardo Fernández, Mariano Pérez, Juan M. Orduña, and José M. Alcaraz
Subjects: Dimensionality reduction, Cancer classification, DNA methylation analysis, Wavelet Transform, Machine learning classification, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Problem: DNA methylation and hydroxymethylation have become important epigenetic markers for early detection of cancer. In recent years, there has been a significant increase in both the number of research works on this topic and the number and size of labeled databases with some type of cancer. Although the advent of methylation microarrays such as the HumanMethylation450 platform has greatly reduced the dimensionality of the problem from billions to 450K positions, this data size is still too large to be processed by machine learning algorithms for cancer prediction and classification. Aim: In the particular case of methylation, an efficient dimensionality reduction technique should also preserve the spatial information of the original data in order to properly predict and classify cancer. Method: This work proposes a new data dimensionality reduction technique based on the Discrete Wavelet Transform (DWT), which preserves spatial information. We have evaluated the proposed technique with a dataset collected from the most important cancer databases according to their social impact, and we have compared our proposal to five well-known dimensionality reduction techniques: PCA, ReliefF, Isomap, LLE and UMAP. Results: The performance evaluation results show that the proposed technique significantly reduces both the computational resources and the execution time required for dimensionality reduction. In addition, it significantly improves the accuracy achieved in the classification by a support vector machine when the dataset produced by each technique is used as input. Conclusions: The proposed approach based on the DWT can be considered an efficient alternative for those cases where dimensionality reduction must preserve spatial information.
Published: 2025
Full Text: View/download PDF

4. I know your stance! Analyzing Twitter users’ political stance on diverse perspectives

Author: Jisu Kim, Dongjae Kim, and Eunil Park
Subjects: Political stance, Tweet user stance model, Twitter, Machine learning, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Abstract The popularity of social network service users has increased in recent years, altering politicians’ interest level in social network services. Given this trend, social network services now play a central role in political communication channels, enabling them to express and share their opinions and news directly with citizens. Therefore, many researchers have attempted to investigate social network service users’ political stances and proposed a user’s political stance model utilizing this dataset. Understanding and detecting social network services and a user’s political stance can play a significant role in marketing strategies and determining election winners. In light of this, the present study examined Twitter from diverse perspectives to analyze and detect a Twitter user’s political stance. This study collected Twitter datasets and labeled a user’s stance using a clustering approach to determine whether a user was a Democrat or a Republican. After an exploratory analysis of users’ tweet content: image and text, user network, and profile description, the tweet user stance detection model was proposed and tested. The results indicated notable differences between Democrats and Republicans from diverse perspectives on Twitter, with an accuracy of 85.35% compared with baseline models. The implications and limitations of this study were discussed based on the results.
Published: 2025
Full Text: View/download PDF

5. Binary plant rhizome growth-based optimization algorithm: an efficient high-dimensional feature selection approach

Author: Jin Zhang, Fu Yan, and Jianqiang Yang
Subjects: Feature selection, Binary plant root growth optimization algorithm, High dimensional data, Optimization, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Feature selection is a pivotal research area within machine learning, tasked with pinpointing the essential subset of features from a broad array that critically influences a model’s predictive capabilities. This process enhances model precision and drastically lowers the computational demands associated with training and predicting. Consequently, more advanced optimization techniques are employed to address the challenge of feature selection. This paper introduces an innovative intelligent optimization algorithm, the Plant Root Growth Optimization (PRGO) algorithm, inspired by the structure of plant rhizomes and the way they absorb nutrients. In the algorithm, the plant rhizomes are divided into two categories, the taproot and the fibrous root. The growth process of the taproot plants is associated with the global exploration search, and the growth process of the fibrous root plants relates to the local exploitation search. The global asymptotic convergence of the algorithm is proved by applying Markov’s correlation theory, and simulation results using CEC2014 and CEC2017 test sets show that the proposed algorithm has excellent performance. Moreover, a binary variant of this algorithm (BPRGO) has been specifically crafted in this research to tackle the complexities of high-dimensional feature selection issues. The algorithm was compared to eight well-known feature selection methods on 16 high-dimensional datasets from the Arizona State University feature selection library, and its performance was evaluated through feature subset size, classification accuracy, fitness value, and F1-score. The experimental results show that BPRGO has stronger feature reduction ability and achieves the best overall performance on most datasets. BPRGO can obtain much smaller feature subsets while maintaining much higher classification accuracy and a satisfactory F1-score.
Published: 2025
Full Text: View/download PDF

6. A new framework to assess the impact of new IT-based technologies on the success of quality management system

Author: Yiying Cao and Farah Qasim Ahmed Alyousuf
Subjects: Internet of things, Big data, Business intelligence, Cloud computing, Distributed systems, NFC and RFID, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Abstract Today, the ever-increasing growth of technology and data has undeniably affected various industries. Due to the importance of the factor of comprehensive quality management (QM), big data, and information technology (IT) in the industry and the increasing attention to it, different attitudes regarding the ways and means of reaching it have been presented by these sources. As a result, the purpose of this research is to study new IT-based technologies related to the success of the QMS. The statistical population of this research is the managers and employees of the technology sectors, which is equal to 400 people. The sample size was equal to 196 people, and a simple random sampling method was used to obtain the sample size. This research is applied in terms of its nature and purpose, and in terms of the data collection method, it is a descriptive and survey type. PLS structural equation modeling and SPSS software were used for data analysis. The findings showed that the internet of things (IoT), business intelligence (BI), cloud computing, and distributed systems have a significant impact on the success of the QMS, while near-field communication and radio frequency identification have an impact on the success of the QMS through the intermediary variable of the IoT. Finally, the results revealed that the expert systems also affect the success of the QMS through the mediator variable of BI.
Published: 2025
Full Text: View/download PDF

7. Refining prognostic assessment of diffuse large B-cell lymphoma: insights from multi-omics and single-cell analysis unveil SRM as a key target for regulating immunotherapy

Author: Xiaojie Liang, Jia Guo, Baiwei Luo, Weixiang Lu, Qiumin Chen, Yeling Deng, Yunong Yang, and Liang Wang
Subjects: DLBCL, Proliferation, Stromal, Immune, Risk stratification, SRM, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Purposes: Previous studies have demonstrated that proliferation, stroma or immunity strongly influence the prognosis and therapeutic resistance of diffuse large B-cell lymphoma (DLBCL). Herein, we aimed to integrate proliferation, stromal, and immune (PSI) features to systematically evaluate the risk stratification and explore novel therapeutic targets in DLBCL. Methods: Using data from multiple studies, we comprehensively evaluated the characteristics and prognostic impact of PSI features in DLBCL, and developed a novel risk stratification model (PSI score) with a consistent cutoff value to stratify the risk of 3,229 DLBCL patients from different cohorts. Mechanisms underlying adverse prognosis in the high-risk DLBCLs were investigated through transcriptomic (n = 3,229), genomic (n = 576), and scRNA-seq (n = 20) analyses. Results: We identified a high-risk DLBCL subgroup (HPSI, 36.1% of DLBCL). HPSI was characterized by upregulation of spermidine synthase (SRM) and cold tumor microenvironment (TME). Compared to the low-risk group, HPSI exhibited poorer prognosis, with lower 3-year OS (51.7% vs. 78.1%, P
Published: 2025
Full Text: View/download PDF

8. A problem-agnostic approach to feature selection and analysis using SHAP

Author: John T. Hancock, Taghi M. Khoshgoftaar, and Qianxin Liang
Subjects: Class imbalance, Feature selection, SHAP, Credit Card Fraud Detection, One-class classification, Binary-class classification, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Abstract Feature selection is an effective data reduction technique. SHapley Additive exPlanations (SHAP) can be used to provide a feature importance ranking for models built with labeled or unlabeled data. Thus, one may use the SHAP feature importance ranking in a feature selection technique by selecting the k highest ranking features. Furthermore, this SHAP-based feature selection technique is applicable regardless of the availability of labels for data. We use the Kaggle Credit Card Fraud detection dataset to simulate three label availability scenarios. When no labeled data is available, unsupervised learners should be used. We explore feature selection for data reduction with Isolation Forest and SHAP for this case. When data of one class is available, a one-class classifier, such as Gaussian Mixture Model (GMM) can be used in combination with SHAP for determining feature importance, and for feature selection. Finally, if labeled data from both classes is available a binary-class classifier can be used in conjunction with SHAP for data reduction. Our contribution is to provide a comparative analysis of features selected in the three label availability scenarios. Our primary conclusion is that feature sets may be reduced with SHAP without compromising performance. To the best of our knowledge, this is the first study to explore a feature analysis technique, applicable in the three label availability scenarios.
Published: 2025
Full Text: View/download PDF

9. A novel multivariate time series dataset of outdoor sport activities

Author: Matarmaa Jarno
Subjects: Multivariate time series, Outdoor sport, Sport exercises, Sport dataset, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: This study introduces a novel multivariate time series dataset of 228 outdoor sport activities recorded by an individual non-competitive athlete in uncontrolled environments. The dataset includes three features: Heart Rate, Speed, and Altitude, and covers five sport categories: walking, running, skiing, roller-skiing, and biking. The data was collected using two types of Garmin sport watches. The original dataset was carefully pre-processed using typical data cleansing methods such as gap filling and value format transformations. Furthermore, activities were filtered based on missing sensor values and on domain knowledge of the sport categories. Full-length sequences, varying from 10 min to several hours, were split into equal-length segments of approximately 1 min. To address the small number of instances, data was augmented using several consecutive segments from the same activity. However, only a small part of the whole original data was used as a computational cost–information gain tradeoff. The three-dimensional dataset is divided into three parts, with each dimension in its own comma-separated values (CSV) file. The dataset aims to provide a unique resource for researchers and practitioners in the field of sports science, human performance analysis, and activity recognition. It aims to complement the very limited or non-existent publicly available sport activity datasets.
Published: 2025
Full Text: View/download PDF

10. A multitasking ant system for multi-depot pick-up and delivery location routing problem with time window

Author: Haoyuan Lv, Ruochen Liu, and Jianxia Li
Subjects: MDPDLRPTW, Evolutionary multitasking optimization, Ant system, Negative transfer, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Abstract Instant delivery service has brought great convenience to our modern life. In order to improve its efficiency, multi-depot pick-up-and-delivery location routing problem with time windows (MDPDLRPTW) is proposed in this paper. Existing works related to MDPDLRPTW focus on obtaining a depot location scheme by clustering and perform route planning on it through single-task optimization. They are powerless to simultaneously explore the solution spaces of multiple routing tasks under different location schemes. Furthermore, ignoring the potential general knowledge among different schemes leads to redundant optimization. In this work, MDPDLRPTW is modeled as a multi-transformation optimization (MTFO) problem and a novel two-stage algorithm based on multitasking ant system (MTAS) is designed to solve it. In the first stage, a clustering algorithm based on spatio-temporal feature is used to group similar customer pairs, and the clustering centers are set as warehouses. Afterward, multiple localization schemes are selected through non-dominated sorting based on spatio-temporal density. In the second stage, MTAS concurrently optimizes multiple routing tasks based on these location schemes, each task is assigned to an ant system solver. Furthermore, MTAS achieves knowledge sharing among all routing tasks through adaptive similarity measurement and cross-task pheromone fusion strategy. The former can dynamically capture the relationship between tasks to adjust the transfer strength of task pairs, and the latter realizes adaptive knowledge transfer by pheromone-matrix mixing. Experimental results show that MTAS can efficiently utilize the common knowledge to achieve competitive performance.
Published: 2025
Full Text: View/download PDF

11. Vehicle-routing problem for low-carbon cold chain logistics based on the idea of cost–benefit

Author: Yan Liu, Fengming Tao, and Rui Zhu
Subjects: Cold chain logistics, Vehicle-routing problem, Customer satisfaction, Carbon trading, Cost–benefit, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Abstract In the low-carbon economy, the fresh industry constitutes an “impossible triangle” in products, prices and services. Therefore, based on the idea of cost–benefit, a comprehensive vehicle routing problem optimization model with the objective function of minimizing the cost of unit satisfied customer is presented. Then, a hybrid algorithm called local search genetic algorithm (LSGA) is proposed, which amalgamates the destroy-repair operator with GA algorithm. Extensive numerical experiments verify the feasibility and effectiveness of the proposed model and algorithm. Furthermore, the sensitivity analysis of freshness-keeping cost, carbon price and customer satisfaction weights were conducted. The experimental results show that appropriate freshness-keeping effort can reduce total costs and improve customer satisfaction. Increasing carbon price within a certain range can effectively reduce carbon emissions, and there is a trade-off relationship between carbon emissions and customer satisfaction. The results of considering both time satisfaction and freshness satisfaction are better than considering time satisfaction alone.
Published: 2025
Full Text: View/download PDF

12. XTNSR: Xception-based transformer network for single image super resolution

Author: Jagrati Talreja, Supavadee Aramvith, and Takao Onoye
Subjects: Single image super-resolution, Local feature window transformer block, Multi-layer feature fusion block, Xception block, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Abstract Single image super resolution has significantly advanced by utilizing transformers-based deep learning algorithms. However, challenges still need to be addressed in handling grid-like image patches with higher computational demands and addressing issues like over-smoothing in visual patches. This paper presents a Deep Learning model for single-image super-resolution. In this paper, we present the XTNSR model, a novel multi-path network architecture that combines Local feature window transformers (LWFT) with Xception blocks for single-image super-resolution. The model processes grid-like image patches effectively and reduces computational complexity by integrating a Patch Embedding layer. Whereas the Xception blocks use depth-wise separable convolutions for hierarchical feature extraction, the LWFT blocks capture long-range dependencies and fine-grained qualities. A multi-layer feature fusion block with skip connections, part of this hybrid architecture, guarantees efficient local and global feature fusion. The experimental results show better performance in Peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and visual quality than the state-of-the-art techniques. By optimizing parameters, the suggested architecture also lowers computational complexity. Overall, the architecture presents a promising approach for advancing image super-resolution capabilities.
Published: 2025
Full Text: View/download PDF

13. A crossover operator for objective functions defined over graph neighborhoods with interdependent and related variables

Author: Jaume Jordan, Javier Palanca, Victor Sanchez-Anguix, and Vicente Julian
Subjects: Genetic algorithms, Optimization, Artificial intelligence, Metaheuristics, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Abstract This article presents a new crossover operator for problems with an underlying graph structure where edges point to prospective interdependence relationships between decision variables and neighborhoods shape the definition of the global objective function via a sum of different expressions, one for each neighborhood. The main goal of this work is to propose a crossover operator that is broadly applicable, adaptable, and effective across a wide range of problem settings characterized by objective functions that are expressed in terms of graph neighbourhoods with interdependent and related variables. Extensive experimentation has been conducted to compare and evaluate the proposed crossover operator with both classic and specialized crossover operators. More specifically, the crossover operators have been tested under a variety of graph types, which model how variables are involved in interdependencies, different types of expressions in which interdependent variables are combined, and different numbers of decision variables. The results suggest that the new crossover operator is statistically better or at least as good as the best-performing crossover in 75% of the families of problems tested.
Published: 2025
Full Text: View/download PDF

14. Efficient guided inpainting of larger hole missing images based on hierarchical decoding network

Author: Xiucheng Dong, Yaling Ju, Dangcheng Zhang, Bing Hou, and Jinqing He
Subjects: Hierarchical decoding network, Gradient priors, Multi-dimensional efficient attention, Efficient context fusion, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Abstract When dealing with images containing large hole-missing regions, deep learning-based image inpainting algorithms often face challenges such as local structural distortions and blurriness. In this paper, a novel hierarchical decoding network for image inpainting is proposed. Firstly, the structural priors extracted from the encoding layer are utilized to guide the first decoding layer, while residual blocks are employed to extract deep-level image features. Secondly, multiple hierarchical decoding layers progressively fill in the missing regions from top to bottom, then interlayer features and gradient priors are used to guide information transfer between layers. Furthermore, a proposed Multi-dimensional Efficient Attention is introduced for feature fusion, enabling more effective extraction of image features across different dimensions compared to conventional methods. Finally, Efficient Context Fusion combines the reconstructed feature maps from different decoding layers into the image space, preserving the semantic integrity of the output image. Experiments have been conducted to validate the effectiveness of the proposed method, demonstrating superior performance in both subjective and objective evaluations. When inpainting images with missing regions ranging from 50% to 60%, the proposed method achieves improvements of 0.02 dB (0.22 dB) and 0.001 (0.003) in PSNR and SSIM, on the CelebA-HQ (Places2) dataset, respectively.
Published: 2025
Full Text: View/download PDF

15. Short-term urban traffic forecasting in smart cities: a dynamic diffusion spatial-temporal graph convolutional network

Author: Xiang Yin, Junyang Yu, Xiaoyu Duan, Lei Chen, and Xiaoli Liang
Subjects: Traffic forecasting, Graph convolutional network, Spatial-temporal, Dynamic generation, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Abstract Short-term traffic forecasting is an important part of intelligent transportation systems. Accurately predicting short-term traffic trends can avoid traffic congestion and plan travel routes, which is of great significance to urban management and traffic scheduling. The difficulty of short-term urban traffic forecasting is that the traffic flow is random and will be dynamically changed by the traffic conditions of nearby nodes. In order to solve this problem, this paper proposes a model based on Dynamic Diffusion Spatial-Temporal Graph Convolutional Network. It first combines the dynamic generation matrix and the static distance matrix to grasp real-time traffic conditions, and then introduces the diffusion random walk strategy to capture the correlation of spatial nodes. Finally, the convolutional LSTM module is used to mine the spatiotemporal dependence of traffic data to improve the accuracy of traffic prediction. Compared to several baseline models, the experimental results show that the model is 7% better than other models on several metrics and demonstrates the necessity of the module through ablation experiments.
Published: 2025
Full Text: View/download PDF

16. What Time Is It? Finding Which Temporal Features is More Useful for Next Activity Prediction

Author: Lerina Aversano, Martina Iammarino, Antonella Madau, Giuseppe Pirlo, and Gianfranco Semeraro
Subjects: Process mining, next activity prediction, temporal information, classification, predictive process monitoring, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Process mining merges data science and process science, allowing the analysis of recorded process data by capturing activities within event logs. It finds more and more applications for the optimization of the production and administrative processes of private companies and public administrations. This field consists of several areas: process discovery, compliance monitoring, process improvement, and predictive process monitoring. Within predictive process monitoring, the subarea of next activity prediction aims to predict the next activity performed using control flow data, that is, event data with no attributes other than the timestamp, activity label, and case identifier. A popular approach in this subarea is to use sub-sequences of events, called prefixes and extracted with a sliding window, to predict the next activity. In the literature, several features are added to increase performance. Specifically, this article addresses the problem of predicting the next activity in predictive process monitoring, focusing on the usefulness of temporal features. While past research has explored a variety of features to improve prediction accuracy, the contribution of temporal information remains unclear. This article proposes a comparative analysis of temporal features, such as differences in timestamp, time of day, and day of week, extracted for each event in a prefix. Using both k-fold cross-validation for robust benchmarking and a 75/25 split to simulate real scenarios in which new process events are predicted based on past data, it is shown that timestamp differences within the same prefix consistently outperform other temporal features. Our results are further validated by Shapley value analysis, highlighting the importance of timestamp differences in improving the accuracy of next activity prediction.
Published: 2025
Full Text: View/download PDF
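
As an illustration of the prefix-based setup described in the abstract above, the following is a minimal sketch, not the authors' implementation. It assumes a single case's trace is given as a time-sorted list of (activity, timestamp) pairs; the function name `prefixes_with_time_deltas` and the `window` parameter are hypothetical. For each position it emits a sliding-window prefix, the timestamp differences between consecutive events inside that prefix, and the next activity as the prediction target.

```python
from datetime import datetime


def prefixes_with_time_deltas(trace, window=3):
    """Build (prefix activities, time deltas, next activity) samples.

    `trace` is a list of (activity_label, datetime) pairs for ONE case,
    already sorted by timestamp. `window` is the maximum prefix length.
    Both names are illustrative assumptions, not taken from the article.
    """
    samples = []
    for end in range(1, len(trace)):
        start = max(0, end - window)          # sliding window over the trace
        prefix = trace[start:end]
        activities = [activity for activity, _ in prefix]
        # Difference (in seconds) between each event and the previous event
        # in the same prefix; the first event of a prefix gets 0.0.
        deltas = [0.0] + [
            (prefix[i][1] - prefix[i - 1][1]).total_seconds()
            for i in range(1, len(prefix))
        ]
        next_activity = trace[end][0]         # label to predict
        samples.append((activities, deltas, next_activity))
    return samples


example_trace = [
    ("A", datetime(2025, 1, 1, 10, 0)),
    ("B", datetime(2025, 1, 1, 10, 5)),
    ("C", datetime(2025, 1, 1, 10, 20)),
    ("D", datetime(2025, 1, 1, 11, 20)),
]
example_samples = prefixes_with_time_deltas(example_trace, window=2)
```

Time of day or day of week could be derived from the same timestamps (for example via `timestamp.hour` or `timestamp.weekday()`); how the article encodes these features is not stated in the abstract.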

17. Comparative Analysis of Traditional and Modern NLP Techniques on the CoLA Dataset: From POS Tagging to Large Language Models

Author: Abdessamad Benlahbib, Achraf Boumhidi, Anass Fahfouh, and Hamza Alami
Subjects: Large language models (LLMs), linguistic acceptability, natural language processing, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: The task of classifying linguistic acceptability, exemplified by the CoLA (Corpus of Linguistic Acceptability) dataset, poses unique challenges for natural language processing (NLP) models. These challenges include distinguishing between subtle grammatical errors, understanding complex syntactic structures, and detecting semantic inconsistencies, all of which make the task difficult even for human annotators. In this article, we compare a range of techniques, from traditional methods such as Part-of-Speech (POS) tagging and feature extraction methods like CountVectorizer with Term Frequency-Inverse Document Frequency (TF-IDF) and N-grams, to modern embeddings such as FastText and Embeddings from Language Models (ELMo), as well as deep learning architectures like transformers and Large Language Models (LLMs). Our experiments show a clear improvement in performance as models evolve from traditional to more advanced approaches. Notably, state-of-the-art (SOTA) results were obtained by fine-tuning GPT-4o with extensive hyperparameter tuning, including experimenting with various epochs and batch sizes. This comparative analysis provides valuable insights into the relative strengths of each technique for identifying morphological, syntactic, and semantic violations, highlighting the effectiveness of LLMs in these tasks.
Published: 2025
Full Text: View/download PDF

18. Optimizing Energy Efficiency in UPA-Assisted SWIPT Massive MIMO Systems Over Rician Fading Channels

Author: Mohammad Hassan Adeli, Dariush Abbasi-Moghadam, Hossein Fotouhi, and S. Mohammad Razavizadeh
Subjects: 3D beamforming, convex optimization, energy efficiency, massive MIMO, nonlinear energy harvesting, power allocation, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Massive Multiple Input Multiple Output (mMIMO) is a promising solution for enabling green communication in next-generation wireless networks. Integrating mMIMO with Simultaneous Wireless Information and Power Transfer (SWIPT) technology can further enhance system efficiency in terms of Energy Efficiency (EE) and spectral efficiency. This article studies the feasibility and energy-efficient design of a uniform planar antenna (UPA)-assisted mMIMO-enabled SWIPT system. The downlink transmission of the SWIPT mMIMO system over Rician fading channels is investigated, with terminals harvesting energy based on a nonlinear energy harvesting model. We derive approximate expressions for the signal-to-interference-plus-noise ratio (SINR) and harvested power. Additionally, we formulate an EE optimization problem considering user-level quality of service and total transmit power constraints. To solve this nonconvex problem, we jointly optimize the allocated power and Power Splitting (PS) ratios by exploiting the fractional programming and convex-concave procedure approaches. Results demonstrate the superiority of our proposed design over conventional scenarios with equal power allocation and fixed PS ratio algorithms, with about 2 to 5 times EE improvement. The results also indicate a considerably higher growth rate of EE with an increasing number of antennas and Rician factors compared to the two other methods.
Published: 2025
Full Text: View/download PDF

19. Statistical Validity of Neural-Net Benchmarks

Author: Alain Hadges and Srikar Bellur
Subjects: Bayesian credible interval, benchmark essay, comparison, factorial experiment, hyper-parameters, machine learning, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Claims of better, faster or more efficient neural-net designs often hinge on low single-digit percentage improvements (or less) in accuracy or speed compared to others. Current benchmark differences used for comparison have been based on a number of different metrics such as recall, the best of five runs, the median of five runs, Top-1, Top-5, BLEU, ROC, RMS, etc. These metrics implicitly assert comparable distributions of metrics. Conspicuous by their absence are measures of statistical validity of these benchmark comparisons. This study examined neural-net benchmark metric distributions and determined there are researcher degrees of freedom that may affect comparison validity. An essay is developed and proposed for benchmarking and comparing reasonably expected neural-net performance metrics that minimizes researcher degrees of freedom. The essay includes an estimate of the effects and the interactions of hyper-parameter settings on the benchmark metrics of a neural-net as a measure of its optimization complexity.
Published: 2025
Full Text: View/download PDF

20. Energy Efficiency of Kernel and User Space Level VPN Solutions in AIoT Networks

Author: Aleksandar Jevremovic, Zona Kostic, Ivan Chorbev, Dragan Perakovic, Andrii Shalaginov, and Ivan Cvitic
Subjects: Internet of Things (IoT), AIoT, WireGuard, OpenSSL, energy efficiency, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: The ability to process data locally using complex algorithms is becoming increasingly important in Internet of Things (IoT) contexts. Numerous factors contribute to this trend, including the requirement for immediate response, the need to protect data privacy/security, a lack of adequate infrastructure, and the desire to reduce costs. Due to the extensive hardware requirements (in terms of required computing power, memory, and other resources) for handling various scenarios, edge devices are typically configured to utilize general-purpose operating systems, primarily GNU/Linux. However, energy efficiency remains a critical requirement for these devices, especially in battery-powered scenarios (where energy inefficiency could make the device completely inoperable). Local data processing usually minimizes, but does not entirely eliminate, data exchange with the environment. Along with the energy costs of data processing, it is critical to also consider the energy efficiency of data protection when communicating with the environment. In this article, we evaluate the energy efficiency of kernel-level and user-space-level communication protection solutions: WireGuard and OpenSSL. These systems are evaluated on a range of hardware platforms, including Raspberry Pi 3, Nvidia Jetson NANO, Nvidia Jetson TX2, and Nvidia Jetson AGX Xavier. The energy efficiency of these systems was determined by examining long transfer streams with maximum channel/CPU utilization. We discovered that determining the energy efficiency of a device or protocol is difficult due to the high reliance on factors such as communication speed and direction.
Published: 2025
Full Text: View/download PDF

21. A systematic review of AI-enhanced techniques in credit card fraud detection

Author: Ibrahim Y. Hafez, Ahmed Y. Hafez, Ahmed Saleh, Amr A. Abd El-Mageed, and Amr A. Abohany
Subjects: Fraud attacks, Fraud detection (FD), Credit card fraud detection (CCFD), Machine learning (ML), Deep learning (DL), Meta-heuristic optimization (MHO), Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: The rapid increase in fraud attacks on banking systems, financial institutions, and even credit card holders demonstrates the high demand for enhanced fraud detection (FD) systems to counter these attacks. This paper provides a systematic review of enhanced techniques using Artificial Intelligence (AI), machine learning (ML), deep learning (DL), and meta-heuristic optimization (MHO) algorithms for credit card fraud detection (CCFD). Carefully selected recent research papers have been investigated to examine the effectiveness of these AI-integrated approaches in recognizing a wide range of fraud attacks. These AI techniques were evaluated and compared to discover the advantages and disadvantages of each one, leading to the exploration of existing limitations of ML- or DL-enhanced models. Discovering these limitations is crucial for future work and research to increase the effectiveness and robustness of various AI models. The key finding of this study is the need for continuous development of AI models that stay alert to the latest fraudulent activities.
Published: 2025
Full Text: View/download PDF

22. Transformer enabled multi-modal medical diagnosis for tuberculosis classification

Author: Sachin Kumar, Shivani Sharma, and Kassahun Tadesse Megra
Subjects: Transformer, Multimodal medical analysis, Tuberculosis classification, Lung disease diagnosis, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Recently, multimodal data analysis in the medical domain has started receiving a great deal of attention. Researchers from both computer science and medicine are trying to develop models to handle multimodal medical data. However, most of the published work has targeted homogeneous multimodal data. The collection and preparation of heterogeneous multimodal data is a complex and time-consuming task. Further, the development of models to handle such heterogeneous multimodal data is another challenge. This study presents a cross-modal transformer-based fusion approach for multimodal clinical data analysis using medical images and clinical data. The proposed approach leverages an image embedding layer to convert images into visual tokens, and a clinical embedding layer to convert clinical data into text tokens. Further, a cross-modal transformer module is employed to learn a holistic representation of the imaging and clinical modalities. The proposed approach was tested on a multimodal tuberculosis lung disease data set. Further, the results are compared with recent approaches proposed in the field of multimodal medical data analysis. The comparison shows that the proposed approach outperformed the other approaches considered in the study. Another advantage of this approach is that it analyzes heterogeneous multimodal medical data faster than the existing methods used in the study, which is very important when powerful machines for computation are not available.
Published: 2025
Full Text: View/download PDF

23. Enhancing cardiac diagnostics: a deep learning ensemble approach for precise ECG image classification

Author: Ahmed Alsayat, Alshimaa Abdelraof Mahmoud, Saad Alanazi, Ayman Mohamed Mostafa, Nasser Alshammari, Majed Abdullah Alrowaily, Hosameldeen Shabana, and Mohamed Ezz
Subjects: Cardiovascular diseases, Deep learning, ECG classification, Neural network architectures, Transfer learning, Ensemble learning, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Cardiovascular diseases are a global health challenge that necessitates improvements in diagnostic accuracy and efficiency. This study examines the potential of deep learning (DL) models for the classification of electrocardiogram (ECG) images to assist in the identification of various cardiac conditions. We initiated a two-tiered experimental framework to investigate the effectiveness of several neural network architectures in this medical application. In the first experiment, eight distinct neural network models were selected based on their top-5 accuracy on the ImageNet validation dataset and were fine-tuned using transfer learning techniques. These models were assessed using a cross-validation scheme, focusing on balanced accuracy, precision, recall, and the F1-score to evaluate their classification capabilities across four cardiac conditions: Myocardial Infarction (MI), abnormal heartbeat, historical MI, and normal ECG patterns. The second experiment extended our inquiry into the power of ensemble learning. By testing all possible combinations of the chosen models, we explored 120 ensemble configurations. The resulting analysis identified the best-performing ensemble set, which did not include the least effective model based on F1 score rankings. The most effective ensemble, composed of Inception, MobileNet, and NASNetLarge, achieved an F1 score of 0.9651 and a balanced accuracy of 0.9640, indicating a superior predictive performance. The ROC curve analysis yielded near-perfect Area Under the Curve (AUC) values for all classes, underscoring the ensemble’s proficiency in distinguishing between the specified cardiac conditions. The outcomes of this research highlight the synergistic benefit of ensembles in DL applications for medical imaging and suggest a promising approach for the early detection and diagnosis of cardiac diseases, potentially improving clinical outcomes and patient care.
Published: 2025
Full Text: View/download PDF

24. A bi-subpopulation coevolutionary immune algorithm for multi-objective combinatorial optimization in multi-UAV task allocation

Author: Xi Chen, Yu Wan, Jingtao Qi, Zipeng Zhao, Yirun Ruan, and Jun Tang
Subjects: Coevolutionary algorithm, Multi-objective combinatorial optimization, Multi-objective immune algorithm, Multi-UAV, Task allocation problem, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: With the development of Unmanned Aerial Vehicle (UAV) technology towards multi-UAV systems and UAV swarms, multi-UAV cooperative task allocation increasingly influences the success or failure of UAV missions. From the operational research point of view, such problems are high-dimensional combinatorial optimization problems, which makes solving them challenging. First, the discrete and high-dimensional decision variables mean that the quality of a solution obtained within an acceptable time is not guaranteed. Second, the desired solution of real missions often needs to satisfy multiple objective functions, or to provide a set of solutions for decision-making. Therefore, this paper constructs a Multi-objective Combinatorial Optimization in Multi-UAV Task Allocation Problem (MCOTAP) model, and proposes a Bi-subpopulation Coevolutionary Immune Algorithm (BCIA). The two coevolutionary mechanisms improve the lower limit of population diversity, and the evolutionary strategy pool integrating multiple strategies and the adaptive strategy selection mechanism enhance the local search ability in the late stage of evolution. In the experiments, BCIA competes fairly with the mainstream multi-objective evolutionary algorithms (MOEAs), multi-objective immune algorithms (MOIAs) and the recently proposed multi-UAV mission planning algorithms. The experimental results on different test problems (including several multi-objective combinatorial optimization benchmark problems and the proposed MCOTAP model) show that BCIA has superior performance in solving multi-objective combinatorial optimization problems (MCOPs). At the same time, the effectiveness of each design component of BCIA has been comprehensively verified in the ablation study.
Published: 2025
Full Text: View/download PDF

25. View adaptive multi-object tracking method based on depth relationship cues

Author: Haoran Sun, Yang Li, Guanci Yang, Zhidong Su, and Kexin Luo
Subjects: Multi-object tracking, Tracking-by-detection, View adaptive, Depth relationship, Data association, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Multi-object tracking (MOT) tasks face challenges from multiple perception views due to the diversity of application scenarios. Different views (front-view and top-view) have different imaging and data distribution characteristics, but current MOT methods do not consider these differences and only adopt a unified association strategy to deal with various occlusion situations. This paper proposes a View Adaptive Multi-Object Tracking Method Based on Depth Relationship Cues (ViewTrack) to enable MOT to adapt to the scene's dynamic changes. Firstly, by exploiting the depth relationships between objects using the position information of the bounding boxes, a view-type recognition method based on depth relationship cues (VTRM) is proposed to perceive the changes of depth and view within the dynamic scene. Secondly, by adjusting the interval partitioning strategy to adapt to the changes in view characteristics, a view adaptive partitioning method for tracklet sets and detection sets (VAPM) is proposed to achieve sparse decomposition in occluded scenes. Then, combining pedestrian displacement with Intersection over Union (IoU), a displacement modulated Intersection over Union method (DMIoU) is proposed to improve the association accuracy between detection and tracklet boxes. Finally, comparison with 12 representative methods demonstrates that ViewTrack outperforms them on multiple metrics on the benchmark datasets. The code is available at https://github.com/Hamor404/ViewTrack .
Published: 2025
Full Text: View/download PDF
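
For readers unfamiliar with the Intersection over Union (IoU) measure that the abstract above builds on, the following is a minimal sketch of plain IoU between two axis-aligned boxes. It is not the DMIoU method, whose displacement modulation is not specified in the abstract. The `(x1, y1, x2, y2)` box layout and the function name `iou` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Plain Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    Returns a value in [0, 1]; 0.0 when the boxes do not overlap.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (clamped at zero).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0
```

In tracking-by-detection pipelines, a score of this kind is commonly computed between each tracklet box and each detection box to build an association cost matrix; the authors' repository linked above is the reference for how ViewTrack actually does this.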

26. Mape: defending against transferable adversarial attacks using multi-source adversarial perturbations elimination

Author: Xinlei Liu, Jichao Xie, Tao Hu, Peng Yi, Yuxiang Hu, Shumin Huo, and Zhen Zhang
Subjects: Deep learning security, Pattern recognition, Image classification, Adversarial example, Adversarial defense, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Neural networks are vulnerable to meticulously crafted adversarial examples, leading to high-confidence misclassifications in image classification tasks. Due to their consistency with regular input patterns and the absence of reliance on the target model and its output information, transferable adversarial attacks exhibit a notably high stealthiness and detection difficulty, making them a significant focus of defense. In this work, we propose a deep learning defense known as multi-source adversarial perturbations elimination (MAPE) to counter diverse transferable attacks. MAPE comprises the single-source adversarial perturbation elimination (SAPE) mechanism and the pre-trained models probabilistic scheduling algorithm (PPSA). SAPE utilizes a thoughtfully designed channel-attention U-Net as the defense model and employs adversarial examples generated by a pre-trained model (e.g., ResNet) for its training, thereby enabling the elimination of known adversarial perturbations. PPSA introduces model difference quantification and negative momentum to strategically schedule multiple pre-trained models, thereby maximizing the differences among adversarial examples during the defense model’s training and enhancing its robustness in eliminating adversarial perturbations. MAPE effectively eliminates adversarial perturbations in various adversarial examples, providing a robust defense against attacks from different substitute models. In a black-box attack scenario utilizing ResNet-34 as the target model, our approach achieves average defense rates of over 95.1% on CIFAR-10 and over 71.5% on Mini-ImageNet, demonstrating state-of-the-art performance.
Published: 2025
Full Text: View/download PDF

27. Protocol-based set-membership state estimation for linear repetitive processes with uniform quantization: a zonotope-based approach

Author: Minghao Gao, Pengfei Yang, Hailong Tan, and Qi Li
Subjects: Set-membership estimation, Linear repetitive processes, 2-D systems, Uniform quantization, Weighted try-once-discard protocol, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: This paper is concerned with the zonotopic state estimation problem for a class of linear repetitive processes (LRPs) with weighted try-once-discard protocols (WTODPs) subject to uniform quantization. In such a system, the process disturbance and measurement noise are generally assumed to be unknown but bounded in certain zonotopes. The measurement data are uniformly quantized prior to entering the network. In order to effectively curb data collision, a WTODP is considered, based on which only the selected sensor is allowed to transmit data through the network. The aim of this paper is to find a zonotope that covers all possible states consistent with the system model and WTODP-based measured outputs. By using the zonotope properties, a zonotope containing all possible states is first constructed, whose size is then minimized by designing an appropriate correlation matrix. Moreover, a sufficient condition is offered for the existence of an upper bound on the size of this zonotope. Finally, we validate the efficacy of the developed estimation approach via an illustrative example.
Published: 2025
Full Text: View/download PDF

28. A generalized diffusion model for remaining useful life prediction with uncertainty

Author: Bincheng Wen, Xin Zhao, Xilang Tang, Mingqing Xiao, Haizhen Zhu, and Jianfeng Li
Subjects: Remaining useful life, Kalman filter, General diffusion model, Prognostic, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Forecasting the remaining useful life (RUL) is a crucial aspect of prognostics and health management (PHM), which has garnered significant attention in academic and industrial domains in recent decades. The accurate prediction of RUL relies on the creation of an appropriate degradation model for the system. In this paper, a general representation of diffusion process models with three sources of uncertainty for RUL estimation is constructed. According to time-space transformation, the analytic equations that approximate the RUL probability distribution function (PDF) are inferred. The results demonstrate that the proposed model is more general, covering several existing simplified cases. The parameters of the model are then calculated utilizing an adaptive technique based on the Kalman filter and expectation maximization with Rauch-Tung-Striebel (KF-EM-RTS). KF-EM-RTS can adaptively estimate and update unknown parameters, overcoming the limitations imposed by the strong Markovian nature of the diffusion model. Linear and nonlinear degradation datasets from real working environments are used to validate the proposed model. The experiments indicate that the proposed model can achieve accurate RUL estimation results.
Published: 2025
Full Text: View/download PDF

29. Preference learning based deep reinforcement learning for flexible job shop scheduling problem

Author: Xinning Liu, Li Han, Ling Kang, Jiannan Liu, and Huadong Miao
Subjects: Flexible job shop scheduling problem, Preference learning, Proximal policy optimization, Deep reinforcement learning, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: The flexible job shop scheduling problem (FJSP) holds significant importance in both theoretical research and practical applications. Given the complexity and diversity of FJSP, improving the generalization and quality of scheduling methods has become a hot topic of interest in both industry and academia. To address this, this paper proposes a Preference-Based Mask-PPO (PBMP) algorithm, which leverages the strengths of preference learning and invalid action masking to optimize FJSP solutions. First, a reward predictor based on preference learning is designed to model reward prediction by comparing random fragments, eliminating the need for complex reward function design. Second, a novel intelligent switching mechanism is introduced, where proximal policy optimization (PPO) is employed to enhance exploration during sampling, and masked proximal policy optimization (Mask-PPO) refines the action space during training, significantly improving efficiency and solution quality. Furthermore, the Pearson correlation coefficient (PCC) is used to evaluate the performance of the preference model. Finally, comparative experiments on FJSP benchmark instances of varying sizes demonstrate that PBMP outperforms traditional scheduling strategies such as dispatching rules, OR-Tools, and other deep reinforcement learning (DRL) algorithms, achieving superior scheduling policies and faster convergence. Even with increasing instance sizes, preference learning proves to be an effective reward mechanism in reinforcement learning for FJSP. The ablation study further highlights the advantages of each key component in the PBMP algorithm across performance metrics.
Published: 2025
Full Text: View/download PDF

30. A traffic prediction method for missing data scenarios: graph convolutional recurrent ordinary differential equation network

Author: Ming Jiang, Zhiwei Liu, and Yan Xu
Subjects: Intelligent transportation, Traffic prediction, Missing data, Graph convolutional neural network, Neural ordinary differential equation, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Traffic prediction plays an increasingly important role in intelligent transportation systems and smart cities. Both travelers and urban managers rely on accurate traffic information to make decisions about route selection and traffic management. Due to various factors, both human and natural, traffic data often contains missing values. Addressing the impact of missing data on traffic flow prediction has become a widely discussed topic in the academic community and holds significant practical importance. Existing spatiotemporal graph models typically rely on complete data, and the presence of missing values can significantly degrade prediction performance and disrupt the construction of dynamic graph structures. To address this challenge, this paper proposes a neural network architecture designed specifically for missing data scenarios: the graph convolutional recurrent ordinary differential equation network (GCRNODE). GCRNODE combines recurrent networks based on ordinary differential equation (ODE) with spatiotemporal memory graph convolutional networks, enabling accurate traffic prediction and effective modeling of dynamic graph structures even in the presence of missing data. GCRNODE uses ODE to model the evolution of traffic flow and updates the hidden states of the ODE through observed data. Additionally, GCRNODE employs a data-independent spatiotemporal memory graph convolutional network to capture the dynamic spatial dependencies in missing data scenarios. The experimental results on three real-world traffic datasets demonstrate that GCRNODE outperforms baseline models in prediction performance under various missing data rates and scenarios. This indicates that the proposed method has stronger adaptability and robustness in handling missing data and modeling spatiotemporal dependencies.
Published: 2025
Full Text: View/download PDF

31. A novel group-based framework for nature-inspired optimization algorithms with adaptive movement behavior

Author: Adam Robson, Kamlesh Mistry, and Wai-Lok Woo
Subjects: Classification, Feature selection, Nature-inspired algorithms, Optimization, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: This paper proposes two novel group-based frameworks that can be implemented into almost any nature-inspired optimization algorithm. The proposed Group-Based (GB) and Cross Group-Based (XGB) frameworks implement a strategy that modifies the attraction and movement behaviors of base nature-inspired optimization algorithms, together with a mechanism that creates a continuing variance within population groupings, while attempting to maintain the computational simplicity that has helped nature-inspired optimization algorithms gain notoriety within the field of feature selection. Through this functionality, the proposed frameworks seek to increase search diversity within the population swarm to address issues such as premature convergence and oscillations within the swarm. The proposed frameworks have shown promising results when implemented into the Bat algorithm (BA), Firefly algorithm (FA), and Particle Swarm Optimization algorithm (PSO), all of which are popular in the field of feature selection and have been shown to perform well in a variety of domains, gaining notoriety due to their powerful search capabilities.
Published: 2025
Full Text: View/download PDF

32. Adaptive temporal-difference learning via deep neural network function approximation: a non-asymptotic analysis

Author: Guoyong Wang, Tiange Fu, Ruijuan Zheng, Xuhui Zhao, Junlong Zhu, and Mingchuan Zhang
Subjects: Adaptive methods, Non-asymptotic convergence, Nonlinear function approximation, Reinforcement learning, Temporal-difference learning, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Although deep reinforcement learning has achieved notable practical achievements, its theoretical foundations have been scarcely explored until recent times. Nonetheless, the rate of convergence for current neural temporal-difference (TD) learning algorithms is constrained, largely due to their high sensitivity to stepsize choices. In order to mitigate this issue, we propose an adaptive neural TD algorithm (AdaBNTD) inspired by the superior performance of adaptive gradient techniques in training deep neural networks. Simultaneously, we derive non-asymptotic bounds for AdaBNTD within the Markovian observation framework. In particular, AdaBNTD is capable of converging to the global optimum of the mean square projection Bellman error (MSPBE) with a convergence rate of $\mathcal{O}(1/\sqrt{K})$, where $K$ denotes the iteration count. Besides, the effectiveness of AdaBNTD is also verified through several reinforcement learning benchmark domains.
Published: 2025
Full Text: View/download PDF

33. Optimization of high-dimensional expensive multi-objective problems using multi-mode radial basis functions

Author: Jiangtao Shen, Xinjing Wang, Ruixuan He, Ye Tian, Wenxin Wang, Peng Wang, and Zhiwen Wen
Subjects: Multi-objective optimization problem, High-dimensional, Expensive optimization, Surrogate ensemble, Structure design of BWBUG, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Numerous surrogate-assisted evolutionary algorithms have been developed for low-dimensional expensive multi-objective problems, but few works have addressed high-dimensional ones, i.e., generally those with more than 30 decision variables. In this paper, we propose a multi-mode radial basis functions-assisted evolutionary algorithm (MMRAEA) for solving high-dimensional expensive multi-objective optimization problems. To improve reliability, the proposed algorithm uses radial basis functions based on three modes that cooperate to provide quality and uncertainty information about candidate solutions. Meanwhile, a bi-population strategy based on a competitive swarm optimizer and a genetic algorithm is applied for better exploration and exploitation in the high-dimensional search space. Accordingly, an infill criterion based on the multi-mode radial basis functions that comprehensively considers the quality and uncertainty of candidate solutions is proposed. Experimental results on widely used benchmark problems with up to 100 decision variables demonstrate the effectiveness of our proposal. Furthermore, the proposed method is applied to the structure optimization of the blended-wing-body underwater glider (BWBUG) and yields impressive solutions.
Published: 2025
Full Text: View/download PDF

34. DMR: disentangled and denoised learning for multi-behavior recommendation

Author: Yijia Zhang, Wanyu Chen, Fei Cai, Zhenkun Shi, and Feng Qi
Subjects: Multi-behavior recommendation, Fine-grained preferences, Contrastive learning, Graph convolutional network, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: In recommender systems, leveraging auxiliary behaviors (e.g. view, cart) to enhance the recommendation in the target behavior (e.g. purchase) is crucial for mitigating the sparsity issue inherent in single-behavior recommendation. This has given rise to multi-behavior recommendation (MBR). The existing MBR task faces two primary challenges. First, irrelevant auxiliary behaviors that do not align with the target behavior can negatively impact the prediction accuracy for user preference in the target behavior. Second, existing methods typically learn coarse-grained user preferences, failing to model the consistency and distinctiveness among multiple behaviors at a fine-grained level. To address these issues, we propose a disentangled and denoised model for multi-behavior recommendation (DMR), which employs user preferences reflected in the target behavior to guide the learning of user and item embeddings in auxiliary behaviors. Specifically, we first design a disentangled graph convolutional network, modeling the fine-grained user preference under multiple behaviors in view of item attribute domains. We also propose a denoised contrastive learning strategy, where we align the user preferences in multiple behaviors by reducing the influence of noisy data existing in auxiliary behaviors. Experimental results on two real-world datasets show that the proposal can effectively improve the performance of MBR models, achieving average improvements of 3.12% on the Retailrocket dataset and 3.28% on the Beibei dataset over state-of-the-art baselines. Extensive experiments also demonstrate our model’s competitive performance for fine-grained preference learning and denoised learning.
Published: 2025
Full Text: View/download PDF

35. Computationally expensive constrained problems via surrogate-assisted dynamic population evolutionary optimization

Author: Zan Yang, Chen Jiang, and Jiansheng Liu
Subjects: Expensive constrained optimization, Radial basis function, Dynamic population, Sandwich structures, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: This paper proposes a surrogate-assisted dynamic population optimization algorithm (SDPOA) for solving computationally expensive constrained optimization problems, in which the population is dynamically updated based on the real-time iteration information to achieve targeted searches for solutions with different qualities. Specifically, the population is dynamically constructed by simultaneously considering the real-time feasibility, convergence, and diversity information of all the previously evaluated solutions. The evolution strategies adapted to dynamic populations are designed to arrange targeted search resources for individuals with different potentials. For mutation, targeted base solution selection for the top 2 and other center points is designed to emphasize exploitation in promising regions; for selection, the search resources arranged on the best and other population individuals are adaptively adjusted as the iteration progresses; for constraint handling, the diversity of infeasible solutions is integrated into the original constraint-domination principle to avoid the locality of only using constraint violation to rank infeasible solutions. To accelerate convergence, a sparse local search is designed based on the update state of the current best solution, in which two excellent but non-adjacent individuals are used to provide valuable guidance information for local search. Therefore, SDPOA strikes a balance between feasibility, diversity, and convergence. Empirical studies demonstrate that SDPOA achieves the best performance among all the compared state-of-the-art algorithms, and that SDPOA can obtain new structures with smaller compliance in the design of polyline-based core sandwich structures.
Published: 2025
Full Text: View/download PDF

36. Microscale search-based algorithm based on time-space transfer for automated test case generation

Author: Yinghan Hong, Fangqing Liu, Han Huang, Yi Xiang, Xueming Yan, and Guizhen Mai
Subjects: Test case generation, Path coverage, Large-scale optimization, Relationship matrix, Time-space transfer, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Automated test case generation for path coverage (ATCG-PC) is a major challenge in search-based software engineering due to its complexity as a large-scale black-box optimization problem. However, existing search-based approaches often fail to achieve high path coverage in large-scale unit programs. This is due to their expansive decision space and the presence of hundreds of feasible paths. In this paper, we present a microscale (small-size subsets of the decomposed decision set) search-based algorithm with time-space transfer (MISA-TST). This algorithm aims to identify more accurate subspaces consisting of optimal solutions based on two strategies. The dimension partition strategy employs a relationship matrix to track subspaces corresponding to the target paths. Additionally, the specific value strategy allows MISA-TST to focus the search on the neighborhood of specific dimension values rather than the entire dimension space. Experiments conducted on nine normal-scale and six large-scale benchmarks demonstrate the effectiveness of MISA-TST. The large-scale unit programs encompass hundreds of feasible paths or more than 1.00E+50 test cases. The results show that MISA-TST achieves significantly higher path coverage than other state-of-the-art algorithms in most benchmarks. Furthermore, the combination of the two time-space transfer strategies significantly enhances the performance of search-based algorithms like MISA, especially in large-scale unit programs.
Published: 2025
Full Text: View/download PDF

37. Enhancing zero-shot stance detection via multi-task fine-tuning with debate data and knowledge augmentation

Author: Qinlong Fan, Jicang Lu, Yepeng Sun, Qiankun Pi, and Shouxin Shang
Subjects: Zero-shot stance detection, LLMs, Debate corpus data, Multi-task fine-tuning, Knowledge augmentation, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: In the real world, stance detection tasks often involve assessing the stance or attitude of a given text toward new, unseen targets, a task known as zero-shot stance detection. However, zero-shot stance detection often suffers from issues such as sparse data annotation and inherent task complexity, which can lead to lower performance. To address these challenges, we propose combining fine-tuning of Large Language Models (LLMs) with knowledge augmentation for zero-shot stance detection. Specifically, we leverage stance detection and related tasks from debate corpora to perform multi-task fine-tuning of LLMs. This approach aims to learn and transfer the capability of zero-shot stance detection and reasoning analysis from relevant data. Additionally, we enhance the model’s semantic understanding of the given text and targets by retrieving relevant knowledge from external knowledge bases as context, alleviating the lack of relevant contextual knowledge. Compared to ChatGPT, our model achieves a significant improvement in the average F1 score, with an increase of 15.74% on the SemEval 2016 Task 6 A and 3.55% on the P-Stance dataset. Our model outperforms current state-of-the-art models on these two datasets, demonstrating the superiority of multi-task fine-tuning with debate data and knowledge augmentation.
Published: 2025
Full Text: View/download PDF

38. Enhancing navigation performance in unknown environments using spiking neural networks and reinforcement learning with asymptotic gradient method

Author: Xiaode Liu, Yufei Guo, Yuanpei Chen, Jie Zhou, Yuhan Zhang, Weihang Peng, Xuhui Huang, and Zhe Ma
Subjects: Spiking neural networks (SNN), Reinforcement learning, Autonomous navigation, Unmanned vehicle, Asymptotic gradient method, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Achieving accurate and generalized autonomous navigation in unknown environments poses a significant challenge in robotics and artificial intelligence. Animals exhibit superlative navigation capabilities by combining the representation of internal neurals and sensory cues of self-motion and external information. This paper proposes a brain-inspired navigation method based upon spiking neural networks (SNN) and reinforcement learning, integrated with a lidar system that serves as the local environment explorer, which realizes high performance of obstacle avoidance and target arrival in mapless circumstances. An asymptotic gradient method is introduced to optimize the backpropagation during training, which facilitates the improvement of model robustness. The results of our experiments conducted on the Gazebo platform showcase how our approach effectively improves navigation performance in various intricate environments. Our approach raised the navigation success rate by 2% to 5%, depending on the SNN timesteps. Considering the inherent lower computational cost of SNN, this work contributes to advancing the fusion of SNN and reinforcement learning techniques for energy-efficient autonomous navigation tasks in real-world mapless scenarios.
Published: 2025
Full Text: View/download PDF

39. A hybrid Framework for plant leaf disease detection and classification using convolutional neural networks and vision transformer

Author: Sherihan Aboelenin, Foriaa Ahmed Elbasheer, Mohamed Meselhy Eltoukhy, Walaa M. El-Hady, and Khalid M. Hosny
Subjects: Farming, Plant leaf disease classification, Hybrid model, Deep learning, Convolutional neural networks (CNNs), Feature concatenation, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Recently, scientists have widely utilized Artificial Intelligence (AI) approaches in intelligent agriculture to increase the productivity of the agriculture sector and overcome a wide range of problems. Detection and classification of plant diseases is a challenging problem due to the vast numbers of plants worldwide and the numerous diseases that negatively affect the production of different crops. Early detection and accurate classification of plant diseases is the goal of any AI-based system. This paper proposes a hybrid framework to significantly improve classification accuracy for plant leaf diseases. The proposed model leverages the strength of Convolutional Neural Networks (CNNs) and Vision Transformers (ViT), where an ensemble model, which consists of the well-known CNN architectures VGG16, Inception-V3, and DenseNet20, is used to extract robust global features. Then, a ViT model is used to extract local features to detect plant diseases precisely. The performance of the proposed model is evaluated using two publicly available datasets (Apple and Corn). Each dataset consists of four classes. The proposed hybrid model successfully detects and classifies multi-class plant leaf diseases and outperforms similar recently published methods, achieving accuracy rates of 99.24% and 98% for the apple and corn datasets, respectively.
Published: 2025
Full Text: View/download PDF

40. Robust underwater object tracking with image enhancement and two-step feature compression

Author: Jiaqing Li, Chaocan Xue, Xuan Luo, Yubin Fu, and Bin Lin
Subjects: Underwater object tracking, Correlation filter, Underwater image enhancement, Feature compression, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Developing a robust algorithm for underwater object tracking (UOT) is crucial to support the sustainable development and utilization of marine resources. In addition to open-air tracking challenges, the visual object tracking (VOT) task presents further difficulties in underwater environments due to visual distortions, color cast issues, and low-visibility conditions. To address these challenges, this study introduces a novel underwater target tracking framework based on correlation filter (CF) with image enhancement and a two-step feature compression mechanism. Underwater image enhancement mitigates the impact of visual distortions and color cast issues on target appearance modeling, while the two-step feature compression strategy addresses low-visibility conditions by compressing redundant features and combining multiple compressed features based on the peak-to-sidelobe ratio (PSR) indicator for accurate target localization. The excellent performance of the proposed method is demonstrated through evaluation on two public UOT datasets.
Published: 2025
Full Text: View/download PDF
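
The peak-to-sidelobe ratio mentioned in the abstract above is commonly computed in the correlation-filter tracking literature as the peak of the response map minus the sidelobe mean, divided by the sidelobe standard deviation. The following is a minimal sketch of that common definition, not necessarily the exact formulation used by the authors. The function name `peak_to_sidelobe_ratio` and the `exclude` window parameter are assumptions made for illustration.

```python
import math


def peak_to_sidelobe_ratio(response, exclude=1):
    """PSR of a 2-D correlation response given as a list of rows.

    The sidelobe is every cell outside a (2*exclude+1)-square window
    centred on the peak; the window size is an illustrative assumption.
    """
    rows, cols = len(response), len(response[0])
    # Locate the peak value and its position.
    peak, pr, pc = max(
        (response[r][c], r, c) for r in range(rows) for c in range(cols)
    )
    # Collect every cell that lies outside the exclusion window.
    sidelobe = [
        response[r][c]
        for r in range(rows)
        for c in range(cols)
        if abs(r - pr) > exclude or abs(c - pc) > exclude
    ]
    if not sidelobe:
        raise ValueError("response map is too small for this exclusion window")
    mean = sum(sidelobe) / len(sidelobe)
    std = math.sqrt(sum((v - mean) ** 2 for v in sidelobe) / len(sidelobe))
    # A perfectly flat sidelobe has zero spread; treat it as an ideal peak.
    if std == 0.0:
        return math.inf if peak > mean else 0.0
    return (peak - mean) / std
```

A higher PSR indicates a sharper, more trustworthy response peak, which is why it can serve as a weight when several feature responses are combined.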

41. Energy Use and Demand Prediction Using Time-Series Deep Learning Forecasting Techniques: Application for a University Campus

Author: Bivin Pradeep, Parag Kulkarni, Farman Ullah, and Abderrahmane Lakas
Subjects: Energy efficiency, demand prediction, sustainability, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: A growing global impetus has emerged to enhance the sustainability of energy systems and practices. The two popular levers to achieve this goal include increasing the proportion of clean energy in the energy mix and enhancing energy efficiency. The former involves reducing reliance on fossil fuel-based energy sources and increasing the adoption of renewable energy. The latter involves understanding factors that impact the current energy footprint and improving the efficiencies of the process. University campuses comprise many buildings, and it is well-known that buildings have a sizeable energy footprint. Therefore, it is beneficial to understand their energy consumption and identify ways in which this could be further optimised. Furthermore, catering to the energy demand requires appropriate provisioning with significant costs associated with energy procurement on-demand. To address this, it is vital to predict demand in advance accurately. In this article, we elaborate on these two aspects, i.e., analysis of energy consumption and demand forecasting using deep learning-based time series techniques such as Long short-term memory (LSTM), Bi-directional Long short-term memory (BiLSTM), Gated recurrent unit (GRU), and Bidirectional Gated recurrent units (BiGRU). We analyse the different parameter optimisers and history window lengths to select a better hyper-parameter set for accurate energy use and demand prediction. Findings from this study show that the prediction follows the actual demand curve with a minimum RMSE of 65.354 MWh and 65.936 MWh for window sizes of four and six for validation (testing), respectively. The window size six performs better for most time-series algorithms and hyperparameter combinations.
Published: 2025
Full Text: View/download PDF
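
The history-window and RMSE evaluation described in the abstract above can be illustrated with a short sketch. It assumes a univariate demand series and uses a naive persistence forecaster as a stand-in for the LSTM, BiLSTM, GRU, and BiGRU models, which are not reproduced here. The names `make_windows`, `persistence_forecast`, and `rmse` are hypothetical.

```python
import math


def make_windows(series, window):
    """Split a series into (history, next_value) pairs with a fixed history length."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


def persistence_forecast(history):
    """Stand-in model (assumption): predict that the last observed value repeats."""
    return history[-1]


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def evaluate(series, window):
    """RMSE of the stand-in forecaster for one history window length."""
    pairs = make_windows(series, window)
    actual = [target for _, target in pairs]
    predicted = [persistence_forecast(history) for history, _ in pairs]
    return rmse(actual, predicted)
```

Calling `evaluate` with window sizes of four and six on the same series mirrors the kind of comparison the abstract reports, with any trained model substituted for the stand-in.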

42. A deep contrastive learning-based image retrieval system for automatic detection of infectious cattle diseases

Author: Veerayuth Kittichai, Morakot Kaewthamasorn, Apinya Arnuphaprasert, Rangsan Jomtarak, Kaung Myat Naing, Teerawat Tongloy, Santhad Chuwongin, and Siridech Boonsang
Subjects: Anaplasmosis, Automatic tools, Deep neural network, Deep contrastive learning, Triplet margin loss, An image retrieval procedure, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Anaplasmosis, which is caused by Anaplasma spp. and transmitted by tick bites, is one of the most serious livestock animal diseases worldwide, causing significant economic losses as well as public health issues. Anaplasma marginale, a gram-negative intracellular obligate bacterium, can cause disease in cattle and other ruminants. Because of the insufficient quality of the slides, microscopic diagnosis is time-consuming and challenging. Intra- and inter-rater variation is frequently introduced by technicians who are underqualified and inexperienced. Alternatively, algorithms could support local employees in tracking disease transmission and taking quick action, especially in Thailand, where this cattle disease is common. As a result, the study intends to create an automated tool based on a deep neural network linked with an image-retrieval procedure for recognizing infections in microscopic pictures. The Resnext-50 model, which serves as the embedding space’s backbone and is optimized by Triplet-Margin loss, performs best, with averaged accuracy and specificity ratings of 91.30 percent and 92.83 percent, respectively. The model’s performance was also improved by a fine-tuned procedure between k-nearest neighbor and its normalized distance of each data point, including precision of 0.833 ± 0.134, specificity of 0.930 ± 0.054, recall of 0.838 ± 0.118, and accuracy of 0.915 ± 0.025, respectively. Five-fold cross-validation confirms that the trained model using the optimal k-nearest neighbor (kNN) for the image-based retrieval system, involving 12 images, prevents overfitting via dataset variations indicating areas under the receiver operating curve rankings ranging from 0.917 to 0.922. The image retrieval technique demonstrated in this research is a prototype for a variety of applications. The findings may aid in the early diagnosis of anaplasmosis infections in remote areas without access to veterinary care or costly molecular diagnostic tools.
Published: 2025
Full Text: View/download PDF
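
The triplet margin loss and nearest-neighbor retrieval named in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: embeddings are plain tuples of floats rather than Resnext-50 outputs, distances are Euclidean, and the function names, labels, and default `margin` and `k` values are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equally long embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther from the anchor than the positive."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


def retrieve_label(query, gallery, k=3):
    """Majority label among the k gallery embeddings nearest to the query.

    `gallery` is a list of (embedding, label) pairs.
    """
    nearest = sorted(gallery, key=lambda item: euclidean(query, item[0]))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)
```

Training with this loss pulls same-class embeddings together and pushes different-class embeddings apart, which is what makes a simple nearest-neighbor vote over the gallery usable at retrieval time.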

43. Advancing hospital healthcare: achieving IoT-based secure health monitoring through multilayer machine learning

Author: Ke Qi
Subjects: IoT, Machine learning, ANN, C-IoT, Smart clinical device, Computer engineering. Computer hardware, TK7885-7895, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
Abstract: Background: Data based clinical decision support system is a boon for health care monitoring. Smart healthcare monitoring systems play a vital role in the early diagnosis and detection of the physical and mental health of patients. The smart clinical IoT (C-IoT) systems are data-driven and provide efficient support for this purpose. Purpose: There is a need to have a secure, accurate, and efficient HCM system that is capable of processing large amounts of patient data for timely diagnosis and detection of various health complications. Traditional ways of migration are imprecise, less secure, and do not cover all angles necessary in the contemporary healthcare environment. Because of this, the conceptual IoT-based secure health monitoring system employs machine learning algorithms for enhanced accuracy. Method: This study presents the conjugate applications of machine learning algorithms with the cloud-based C-IoT model systems. This model is a lightweight encryption block model that maintains provisional security for health and clinical data. It assists in patient’s health issues which are diagnosed with the existing database of the history of that patient and proper measures are taken with proper diagnosis and using this prediction model. The health status is diagnosed from the pre-historical database of the patient’s database. Results: This cloud-based smart C-IoT system shows the results approximately with 91% accuracy while using Artificial Neural Network (ANN) algorithms. This smart C-IoT-based health issue diagnostic model is one step ahead toward the modernization of society 5.0. Future prospects: The proposed IoT-based secure health monitoring system expands the surgeries of health care by achieving a high diagnostic accuracy of 91% employing ANN algorithms, the excellence of which is founded on data intensity with prior patient data, and the data security by lightweight encryption algorithms. Aligned with Society 5.0, it brings new, friendly, and efficient features to healthcare that replace many existing methods with better ones in terms of precision, security, and coverage.
Published: 2025
Full Text: View/download PDF

44. CSTrans: cross-subdomain transformer for unsupervised domain adaptation

Author: Junchi Liu, Xiang Zhang, and Zhigang Luo
Subjects: Vision transformer, Subdomain adaptation, Index matching module, Unsupervised discriminative clustering, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Unsupervised domain adaptation (UDA) aims to make full use of labeled source domain data to classify unlabeled target domain data. With the success of Transformer in various vision tasks, existing UDA methods borrow the strong Transformer framework to learn global domain-invariant feature representation from the domain level or category level. Of them, the cross-attention as a key component acts for the cross-domain feature alignment, benefiting from its robustness. Intriguingly, we find that the robustness makes the model insensitive to the sub-grouping property within the same category of both source and target domains, known as the subdomain structure. This is because the robustness regards some fine-grained information as noise and removes it. To overcome this shortcoming, we propose an end-to-end Cross-Subdomain Transformer framework (CSTrans) to exploit the transferability of subdomain structures and the robustness of cross-attention to calibrate inter-domain features. Specifically, there are two innovations in this paper. First, we devise an efficient Index Matching Module (IMM) to calculate the cross-attention of the same category in different domains and learn the domain-invariant representation. This not only simplifies the traditional daunting image-pair selection but also paves a safer way for guarding fine-grained subdomain information. This is because the IMM implements reliable feature confusion. Second, we introduce discriminative clustering to mine the subdomain structures in the same category and further learn subdomain discrimination. Both aspects cooperate with each other for fewer training stages. We perform extensive studies on five benchmarks, and the respective experimental results show that, as compared to existing UDA siblings, CSTrans attains remarkable results with average classification accuracy of 94.3%, 92.1%, and 85.4% on datasets Office-31, ImageCLEF-DA, and Office-Home, respectively.
Published: 2025
Full Text: View/download PDF

45. PLZero: placeholder based approach to generalized zero-shot learning for multi-label recognition in chest radiographs

Author: Chengrong Yang, Qiwen Jin, Fei Du, Jing Guo, and Yujue Zhou
Subjects: Generalized zero-shot learning, Placeholder learning, Multi-label recognition, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: By leveraging large-scale image-text paired data for pre-training, the model can efficiently learn the alignment between images and text, significantly advancing the development of zero-shot learning (ZSL) in the field of intelligent medical image analysis. However, the heterogeneity between cross-modalities, false negatives in image-text pairs, and domain shift phenomena pose challenges, making it difficult for existing methods to effectively learn the deep semantic relationships between images and text. To address these challenges, we propose a multi-label chest X-ray recognition generalized ZSL framework based on placeholder learning, termed PLZero. Specifically, we first introduce a jointed embedding space learning module (JESL) to encourage the model to better capture the diversity among different labels. Secondly, we propose a hallucinated class generation module (HCG), which generates hallucinated classes by feature diffusion and feature fusion based on the visual and semantic features of seen classes, using these hallucinated classes as placeholders for unseen classes. Finally, we propose a hallucinated class-based prototype learning module (HCPL), which leverages contrastive learning to control the distribution of hallucinated classes around seen classes without significant deviation from the original data, encouraging high dispersion of class prototypes for seen classes to create sufficient space for inserting unseen class samples. Extensive experiments demonstrate that our method exhibits sufficient generalization and achieves the best performance across three classic and challenging chest X-ray datasets: NIH Chest X-ray 14, CheXpert, and ChestX-Det10. Notably, our method outperforms others even when the number of unseen classes exceeds the experimental settings of other methods. The codes are available at: https://github.com/jinqiwen/PLZero .
Published: 2025
Full Text: View/download PDF

46. APDL: an adaptive step size method for white-box adversarial attacks

Author: Jiale Hu, Xiang Li, Changzheng Liu, Ronghua Zhang, Junwei Tang, Yi Sun, and Yuedong Wang
Subjects: Adversarial attacks, Deep learning, Image classification, White-box attacks, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Recent research has shown that deep learning models are vulnerable to adversarial attacks, including gradient attacks, which can lead to incorrect outputs. The existing gradient attack methods typically rely on repetitive multistep strategies to improve their attack success rates, resulting in longer training times and severe overfitting. To address these issues, we propose an adaptive perturbation-based gradient attack method with dual-loss optimization (APDL). This method adaptively adjusts the single-step perturbation magnitude based on an exponential distance function, thereby accelerating the convergence process. APDL achieves convergence in fewer than 10 iterations, outperforming the traditional nonadaptive methods and achieving a high attack success rate with fewer iterations. Furthermore, to increase the transferability of gradient attacks such as APDL across different models and reduce the effects of overfitting on the training model, we introduce a triple-differential logit fusion (TDLF) method grounded in knowledge distillation principles. This approach mitigates the edge effects associated with gradient attacks by adjusting the hardness and softness of labels. Experiments conducted on ImageNet-compatible datasets demonstrate that APDL is significantly faster than the commonly used nonadaptive methods, whereas the TDLF method exhibits strong transferability.
Published: 2025
Full Text: View/download PDF

47. IMTLM-Net: improved multi-task transformer based on localization mechanism network for handwritten English text recognition

Author: Qianfeng Zhang, Feng Liu, and Wanru Song
Subjects: Handwritten English text recognition, English composition dataset, Transformer, Local feature extraction, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Intelligence technology has widely empowered education. As an example, Optical Character Recognition (OCR) can be used in smart education scenarios such as online homework correction and teaching data analysis. One of the fundamental yet challenging tasks is to recognize images of handwritten English text as editable text accurately. This is because handwritten text tends to have different writing habits as well as smearing and overlapping, resulting in hard alignment between the image and the real text. Additionally, the lack of data on handwritten text further leads to a lower recognition rate. To address the above issue, on the one hand, this paper extends the existing dataset and introduces hyphenated data annotation to provide data support for improving the robustness and discrimination of the model; on the other hand, a novel framework named Improved Multi-task Transformer based on Localization Mechanism Network (IMTLM-Net) is proposed for handwritten English text recognition. IMTLM-Net contains two parts, namely the encoding and decoding modules. The encoding module introduces a dual-stream processing mechanism. That is, in the simultaneous processing of text and images, a Vision Transformer (VIT) is utilized to encode images, and a Permutation Language Model (PLM) is designed for word arrangement. Two Multiple Head Attention (MHA) units are employed in the decoding module, focusing on text sequences and image sequences. Moreover, the localization mechanism (LM) is applied to enhance font structure feature extraction from image data, which in turn improves the model’s ability to capture complex details. Numerous experiments demonstrate that the proposed method achieves state-of-the-art results in handwritten text recognition.
Published: 2025
Full Text: View/download PDF

48. A quadratic $\nu$-support vector regression approach for load forecasting

Author: Yanhe Jia, Shuaiguang Zhou, Yiwen Wang, Fengming Lin, and Zheming Gao
Subjects: Kernel-free support vector regression, Electric load forecasting, Machine learning, Weighted support vector regression, Feature weighting, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: This article focuses on electric load forecasting, which is a challenging task in the energy industry. In this paper, a novel kernel-free $\nu$-support vector regression model is proposed for electric load forecasting. The proposed model produces a reduced quadratic surface for nonlinear regression. A feature weighting strategy is adopted to estimate the relevance of the features in the load history. To reduce the effects of outliers in the load history, a weight is assigned to represent the relative importance of each data point. Some computational experiments are conducted on some public benchmark data sets to show the superior performance of the proposed model over some widely used regression models. The results of some extensive computational experiments on the electric load data from the Global Energy Forecasting Competition 2012 and the ISO New England demonstrate better average accuracy of the proposed model.
Published: 2025
Full Text: View/download PDF

49. MSM-TDE: multi-scale semantics mining and tiny details enhancement network for retinal vessel segmentation

Author: Hongbin Zhang, Jin Zhang, Xuan Zhong, Ya Feng, Guangli Li, Xiong Li, Jingqin Lv, and Donghong Ji
Subjects: Retinal vessel segmentation, Multi-scale semantics mining, Tiny details enhancement, U-Net, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Retinal image segmentation is crucial for the early diagnosis of some diseases like diabetes and hypertension. Current methods face many challenges, such as inadequate multi-scale semantics and insufficient global information. In view of this, we propose a network called multi-scale semantics mining and tiny details enhancement (MSM-TDE). First, a multi-scale feature input module is designed to capture multi-scale semantics information from the source. Then a fresh multi-scale attention guidance module is constructed to mine local multi-scale semantics while a global semantics enhancement module is proposed to extract global multi-scale semantics. Additionally, an auxiliary vessel detail enhancement branch using dynamic snake convolution is built to enhance the tiny vessel details. Extensive experimental results on four public datasets validate the superiority of MSM-TDE, which obtains competitive performance with satisfactory model complexity. Notably, this study provides an innovative idea of multi-scale semantics mining by diverse methods.
Published: 2025
Full Text: View/download PDF

50. An interaction relational inference method for a coal-mining equipment system

Author: Xiangang Cao, Jiajun Gao, Xin Yang, Fuyuan Zhao, and Boyang Cheng
Subjects: Coal-mining equipment, Interaction relations, System dynamics, Graph neural network, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
Abstract: Multiple potential interactions occur in a coal-mining equipment system during operation, which is crucial for understanding and predicting the dynamic system evolution. Existing methods for building interaction relations in coal-mining equipment systems face problems including incomplete selection of system nodes and difficulty in defining interaction-relation types and distinguishing interaction-relation weights. This study proposes an interaction-relation inference method EMIFC-CIRI for coal-mining equipment systems. EMIFC-CIRI first builds a monitoring index system for coal-mining equipment based on evidence and then accurately selects system nodes. The interaction constructor of the CIRI interaction inference model in this method introduces Gumbel-softmax technology, which autonomously generates multiple types of interaction relations based on several probability matrices. CIRI’s interaction optimizer introduces an attention mechanism to assign weights to interaction relations, and it predicts future system states based on device-monitoring data and interaction relations, optimizing the types and weights of interaction relations between nodes by reducing prediction errors. The study included experiments on relevant datasets. The results show that EMIFC-CIRI successfully built various interaction relations of different strengths, with a 156.17% improvement in interaction-relation quality and a 68.17% improvement in dynamic modeling performance compared with state-of-the-art comparison methods. This study provides a new perspective for research in the field of interaction reasoning of coal-mining equipment systems.
Published: 2025
Full Text: View/download PDF
