5,910 results on '"Neural nets"'
Search Results
2. Super‐resolution reconstruction algorithm for medical images by fusion of wavelet transform and multi‐scale adaptive feature selection.
- Author
- Wang, QiaoSu and Ma, Qiaomei
- Subjects
- *IMAGE reconstruction algorithms, *IMAGE processing, *COMPUTER vision, *COMPUTED tomography, *FEATURE selection, *DISCRETE wavelet transforms, *WAVELET transforms
- Abstract
Conventional computed tomography (CT) images often suffer from blurred edges and unclear details. Image super‐resolution methods can significantly enhance CT image quality, thereby improving diagnostic accuracy. To better extract detailed features and enhance the cascading effects of different feature levels, we propose a novel medical image super‐resolution algorithm that integrates discrete wavelet transform and multi‐scale adaptive feature selection. Our approach uses both the low‐resolution image and its high‐frequency component from the frequency domain as network inputs, with the high‐frequency component providing learning supervision, which enhances detail fidelity in reconstruction. Additionally, we introduce a multi‐scale adaptive feature selection module to learn from different layers of CT images and their inter‐layer correlations. Finally, the pixel information is efficiently integrated by a coordinate attention mechanism incorporating the concept of squeeze excitation. Experimental results show that our method outperforms state‐of‐the‐art methods, achieving superior reconstruction at scale factors of 2, 4, and 8, especially at scale factor 8, where it surpasses others by 1.12 in PSNR, 0.0145 in SSIM, and 0.0038 in LPIPS. Visually, our method also delivers more accurate details and better perceptual quality. [ABSTRACT FROM AUTHOR]
- Published
- 2024
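The PSNR gains quoted in this abstract can be put in context with a minimal sketch of the standard PSNR definition. The function name `psnr`, the nested-list image format, and the 8-bit peak value of 255 are assumptions made for illustration; the abstract does not say how the authors computed the metric.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Images are given as lists of rows of pixel intensities (an assumed format).
    """
    if len(reference) != len(reconstructed):
        raise ValueError("images must have the same number of rows")
    squared_error = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        if len(ref_row) != len(rec_row):
            raise ValueError("rows must have the same length")
        for ref_pixel, rec_pixel in zip(ref_row, rec_row):
            squared_error += (ref_pixel - rec_pixel) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a gain of 1.12 dB corresponds to an MSE that is lower by a factor of about 10^0.112 ≈ 1.29, that is, roughly 23% lower.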
3. Recognition of vehicle license plates in highway scenes with deep fusion network and connectionist temporal classification.
- Author
- Hua, Liru, Ma, Xinyi, Zhao, Chihang, Zhang, Bailing, Su, Zijun, and Wu, Yuhang
- Subjects
- *IMAGE recognition (Computer vision), *AUTOMOBILE license plates, *PATTERN recognition systems, *ARTIFICIAL neural networks, *INTELLIGENT transportation systems, *RECURRENT neural networks
- Abstract
License plate recognition is crucial in Intelligent Transportation Systems (ITS) for vehicle management, traffic monitoring, and security inspection. In highway scenarios, this task faces challenges such as diversity, blurriness, occlusion, and illumination variation of license plates. This article explores Recurrent Neural Networks based on Connectionist Temporal Classification (RNN‐CTC) for license plate recognition in challenging highway conditions. Four neural network models: ResNet50, ResNeXt, InceptionV3, and SENet, all combined with RNN‐CTC are comparatively evaluated. Furthermore, a novel architecture named ResNet50 Deep Fusion Network using Connectionist Temporal Classification (ResNet50‐DFN‐CTC) is proposed. Comparative and ablation experiments are conducted using the Highway License Plate Dataset of Southeast University (HLPD‐SU). Results demonstrate the superior performance of ResNet50‐DFN‐CTC in challenging highway conditions, achieving 93.158% accuracy with a processing time of 7.91 ms, outperforming other tested models. This research contributes to advancing license plate recognition technology for real‐world highway applications under adverse conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
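For readers unfamiliar with Connectionist Temporal Classification, the sketch below shows the standard best-path (greedy) decoding rule: merge consecutive repeated labels, then drop the blank symbol. The function name `ctc_greedy_decode` and the convention that index 0 is the blank are assumptions; the abstract does not state which decoding strategy the authors use.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame label sequence into an output sequence.

    frame_labels: the most likely label index for each time step.
    blank: index reserved for the CTC blank symbol (assumed to be 0 here).
    """
    decoded = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded
```

A blank between two identical labels is what allows repeated characters, common on license plates, to survive the merge step: the frames `[2, 2, 0, 2]` decode to `[2, 2]`, whereas `[2, 2, 2]` decodes to `[2]`.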
4. RIHINNet: A robust image hiding method against JPEG compression based on invertible neural network.
- Author
- Jin, Xin, Pan, Chengyi, Cheng, Zien, Dong, Yunyun, and Jiang, Qian
- Subjects
- *DIGITAL images, *IMAGE processing, *QUALITY factor, *JPEG (Image coding standard), *RANDOM noise theory
- Abstract
Image hiding is a task that embeds secret images in digital images without being detected. The performance of image hiding has been greatly improved by using the invertible neural network. However, current image hiding methods are less robust in the face of Joint Photographic Experts Group (JPEG) compression. The secret image cannot be extracted from the stego image after JPEG compression of the stego image. Some methods show good robustness for some certain JPEG compression quality factors but poor robustness for other common JPEG compression quality factors. An image‐hiding network (RIHINNet) that is robust to all common JPEG compression quality factors is proposed. First of all, the loss function is redesigned; thus, the secret image is hidden as much as possible in the area that is less likely to be changed after JPEG compression. Second, the classifier is designed, which can help the model to select the extractor according to the range of JPEG compression degree. Finally, the interval robustness of the secret image extraction is improved through the design of a denoising module. Experimental results show that this RIHINNet outperforms other state‐of‐the‐art image‐hiding methods in the face of JPEG compressed noise with random compression quality factors, with more than 10 dB peak signal‐to‐noise ratio improvement in secret image recovery on ImageNet, COCO and DIV2K datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
5. Energy‐based PINNs for solving coupled field problems: Concepts and application to the multi‐objective optimal design of an induction heater.
- Author
- Baldan, Marco and Di Barba, Paolo
- Abstract
Physics‐informed neural networks (PINNs) are neural networks (NNs) that directly encode model equations, like Partial Differential Equations (PDEs), in the network itself. While most of the PINN algorithms in the literature minimize the local residual of the governing equations, there are energy‐based approaches that take a different path by minimizing the variational energy of the model. It is shown that in the case of the steady thermal equation weakly coupled to the magnetic equation, the energy‐based approach displays multiple advantages compared to the standard residual‐based PINN: it is more computationally efficient, it requires a lower order of derivatives to compute, and it involves fewer hyperparameters. The analyzed benchmark problems are the single‐ and multi‐objective optimal design of an inductor for the controlled heating of a graphite plate. Designing the optimized device involves a multi‐physics problem: a time‐harmonic magnetic problem and a steady thermal problem. For the former, a deep neural network solving the direct problem is trained in a supervised manner on Finite Element Analysis (FEA) data. In turn, the solution of the latter relies on a hypernetwork that takes as input the inductor geometry parameters and outputs the model weights of an energy‐based PINN (or ePINN). Finally, the ePINN predicts the temperature field within the graphite plate. [ABSTRACT FROM AUTHOR]
- Published
- 2024
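The contrast between the two loss types can be illustrated for a generic steady heat-conduction problem. The symbols below (conductivity k, heat source q, network output T_θ, collocation points x_i, domain Ω) are assumptions chosen for illustration and are not taken from the paper; the temperature is assumed fixed on the boundary.

```latex
% Residual-based loss: needs second derivatives of the network output
\mathcal{L}_{\mathrm{res}}(\theta)
  = \frac{1}{N}\sum_{i=1}^{N}
    \Bigl(\nabla\cdot\bigl(k\,\nabla T_\theta(x_i)\bigr) + q(x_i)\Bigr)^{2}

% Energy-based loss: needs first derivatives only
\mathcal{E}(\theta)
  = \int_{\Omega}\Bigl(\tfrac{1}{2}\,k\,\lvert\nabla T_\theta\rvert^{2}
    - q\,T_\theta\Bigr)\,\mathrm{d}\Omega
```

Setting the first variation of the energy to zero recovers the governing equation −∇·(k∇T) = q, which is why minimizing the energy can stand in for minimizing the residual while requiring only first-order derivatives.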
6. Graph neural networks as strategic transport modelling alternative ‐ A proof of concept for a surrogate.
- Author
- Narayanan, Santhanakrishnan, Makarov, Nikita, and Antoniou, Constantinos
- Subjects
- GRAPH neural networks, DEEP diving, PROOF of concept, STRATEGIC planning
- Abstract
Practical applications of graph neural networks (GNNs) in transportation are still a niche field. There exists a significant overlap between the potential of GNNs and the issues in strategic transport modelling. However, it is not clear whether GNN surrogates can overcome (some of) the prevalent issues. Investigating such a surrogate will show its advantages and disadvantages, in particular shedding light on its potential to replace complex transport modelling approaches, such as agent‐based models, in the future. In this direction, as a pioneering work, this paper studies the plausibility of developing a GNN surrogate for the classical four‐step approach, one of the established strategic transport modelling approaches. A formal definition of the surrogate is presented, and an augmented data generation procedure is introduced. The network of the Greater Munich metropolitan region is used for the necessary data generation. The experimental results show that GNNs have the potential to act as transport planning surrogates and that deeper GNNs perform better than their shallow counterparts. Nevertheless, as expected, they suffer performance degradation with an increase in network size. Future research should dive deeper into formulating new GNN approaches that are able to generalize to arbitrarily large networks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Wind energy system fault classification using deep CNN and improved PSO‐tuned extreme gradient boosting.
- Author
- Lee, Chun‐Yao and Maceren, Edu Daryl C.
- Subjects
- CONVOLUTIONAL neural networks, PARTICLE swarm optimization, FAULT diagnosis, WIND power, DEEP learning
- Abstract
Intelligent fault diagnosis for wind energy systems requires identifying unique characteristics to differentiate various fault types effectively, even when data discrepancy occurs due to the unpredictable and dynamic nature of its environment. This article addresses some of the challenges of fault classification in wind energy systems by proposing an integrated approach that combines deep learning features with a resampled supervisory control and data acquisition (SCADA) dataset. The methodology involves resampling the imbalanced SCADA dataset using synthetic minority oversampling technique (SMOTE) and near‐miss undersampling techniques, extracting deep learning features using deep convolutional neural network, and feeding them into an XGBoost (extreme gradient boosting) classifier with tuned parameters using adaptive elite‐particle swarm optimization (AEPSO). The effectiveness of the proposed method is demonstrated through validation conducted on a different imbalanced dataset showing superior performance metrics in terms of accuracy. Additionally, the study contributes to methodological advancements in wind turbine fault diagnosis by providing a rigorous framework for fault classification. It is confirmed that utilizing the extracted deep learning features into the resampled data can significantly affect the classification performance metrics. Furthermore, the proposed integrated approach shows significance for fault diagnosis enhancement in wind energy systems and advancing the field towards more efficient and reliable operation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
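The oversampling step named in this abstract can be sketched as follows. This is a simplified illustration of the basic SMOTE idea (interpolating between a minority sample and one of its nearest minority-class neighbours), not the authors' pipeline; the function names `smote_sample` and `oversample`, the neighbour count `k`, and the fixed seed are all assumptions.

```python
import random


def smote_sample(sample, neighbor, rng):
    """Create one synthetic point on the segment between two minority samples."""
    gap = rng.random()  # uniform in [0, 1)
    return [s + gap * (n - s) for s, n in zip(sample, neighbor)]


def oversample(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority samples (needs at least two samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        synthetic.append(smote_sample(base, neighbor, rng))
    return synthetic
```

Every synthetic point lies on a line segment between two real minority samples, so the minority class grows without exact duplicates.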
8. A novel density‐based representation for point cloud and its ability to facilitate classification.
- Author
- Xie, Xianlin and Tang, Xue‐song
- Subjects
- *IMAGE recognition (Computer vision), *FEATURE extraction, *IMAGE processing, *POINT cloud, *POINT processes
- Abstract
In the field of processing 3D point cloud data, two primary representation methods have emerged: point‐based methods and voxel‐based methods. However, the former suffer from significant computational costs and lack the ease of handling exhibited by voxel‐based methods. Conversely, the latter often encounter challenges related to information loss resulting from downsampling operations, thereby impeding subsequent tasks. To address these limitations, this article introduces a novel density‐based representation method for voxel partitioning. Additionally, a corresponding network structure is devised to extract features from this specific density representation, thereby facilitating the successful completion of classification tasks. Experiments on ModelNet40 and MNIST demonstrate that the proposed 3D convolution can achieve state‐of‐the‐art performance among voxel‐based methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. Research on image saliency detection based on deep neural network.
- Author
- Qiu, Linrun, Zhang, Dongbo, and Hu, Yingkun
- Subjects
- *ARTIFICIAL neural networks, *OBJECT recognition (Computer vision), *COMPUTER vision, *FEATURE extraction, *IMAGE processing, *EDGE detection (Image processing)
- Abstract
As an active research field, computer vision is devoted to the rapid acquisition and application of target information from images or videos by simulating the human visual mechanism. To improve the accuracy and efficiency of image detection, salient region detection has received increasing attention in computer vision research; its core lies in algorithms for feature extraction and saliency calculation of targets. This paper analyzes the multi‐feature fusion saliency detection model and the visual saliency calculation process and, building on the existing algorithm, proposes a fully convolutional network saliency detection algorithm by improving the VGG16 network. The qualitative and quantitative experimental results show that, compared with the four mainstream methods BL, GS, SF, and RFCN, our algorithm not only improves the accuracy of salient object detection, but also effectively solves the problem of target edge blur. Therefore, this study has improved the accuracy and efficiency of saliency detection, which can not only promote the development of computer vision technology, but also provide support for research in the field of image processing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Concurrent PV production and consumption load forecasting using CT‐Transformer deep learning to estimate energy system flexibility.
- Author
- Zarghami, Mohammad, Niknam, Taher, Aghaei, Jamshid, and Nezhad, Azita Hatami
- Subjects
- SOLAR energy, PHOTOVOLTAIC power systems, ENERGY consumption, FORECASTING, DEEP learning, INSTRUCTIONAL systems
- Abstract
The integration of renewable energy sources (RESs) into power systems has increased significantly due to technical, economic, and environmental factors, necessitating greater flexibility to manage variable consumption loads and renewable energy generation. Accurate forecasting of solar energy production and consumption load is critical for enhancing power system flexibility. This study introduces a novel deep learning model, a spatial‐temporal hybrid convolutional‐transformer (CT‐Transformer) network with unique features and extended memory capacity. Additionally, a flexibility index (FI) is introduced to evaluate power system flexibility (PSF) based on the forecasting results. The CT‐Transformer forecasts PSF for the next 24 and 168 hours, using the FI to evaluate PSF based on forecasting results. The input data includes meteorological, solar energy production, load demand, and pricing data from France, comprising hourly data from 2015 and 2016 (about 17,520 entries). Data preprocessing involves correcting incomplete and irrelevant segments. The CT‐Transformer's performance is compared to other deep learning techniques, showing superior results with the lowest prediction error (2.5%) and a maximum error of 10.1% (MAE). It also achieved a prediction error of 0.08% for system flexibility, compared to the highest error of 0.96%. This research highlights the CT‐Transformer's potential for accurate RES and load forecasting and PSF evaluation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Active sonar target recognition method based on multi‐domain transformations and attention‐based fusion network.
- Author
- Wang, Qingcui, Du, Shuanping, Zhang, Wei, and Wang, Fangyong
- Subjects
- *AUDITORY perception, *SIGNAL processing, *FEATURE extraction, *SONAR, *OCEAN, *ECHO
- Abstract
The classification and recognition of underwater targets by an active sonar system remain challenging and complex. Traditional methods have limited classification performance in time and spatially varying ocean channels. An active sonar target recognition method is proposed based on multi‐domain transformations and an attention‐based fusion network. Initially, the active target echo undergoes time‐frequency analysis, auditory signal processing, and matched filtering to represent target attributes in joint spatial‐time‐frequency domains. Subsequently, multiple attention‐based fusion models fuse the multi‐domain transformations either early or late in the processing stages. An attention module further enhances significant feature channels through adaptive weight assignment. Experiment results demonstrate that the recognition accuracy of active sonar echoes using multi‐domain transformations improves significantly compared to that of single‐domain methods, with an increase of up to 10.5%. The incorporation of multiple transformation domains provides complementary information about the target, thereby enhancing the network's representation ability, especially with limited data samples. Furthermore, the findings indicate that feature fusion of multiple transformations in a high‐level feature space yields more informative and effective results for active sonar echoes compared to low‐level feature spaces. [ABSTRACT FROM AUTHOR]
- Published
- 2024
12. HIST: Hierarchical and sequential transformer for image captioning.
- Author
- Lv, Feixiao, Wang, Rui, Jing, Lihua, and Dai, Pengwen
- Subjects
- *TRANSFORMER models, *COMPUTER vision, *ARTIFICIAL intelligence, *FEATURE extraction, *NATURAL languages
- Abstract
Image captioning aims to automatically generate a natural language description of a given image, and most state‐of‐the‐art models have adopted an encoder–decoder transformer framework. Such transformer structures, however, show two main limitations in the task of image captioning. Firstly, the traditional transformer obtains high‐level fusion features to decode while ignoring other‐level features, resulting in losses of image content. Secondly, the transformer is weak in modelling the natural order characteristics of language. To address these issues, the authors propose a HIerarchical and Sequential Transformer (HIST) structure, which forces each layer of the encoder and decoder to focus on features of different granularities, and strengthens the sequentially semantic information. Specifically, to capture the details of different levels of features in the image, the authors combine the visual features of multiple regions and divide them into multiple levels differently. In addition, to enhance the sequential information, the sequential enhancement module in each decoder layer block extracts different levels of features for sequentially semantic extraction and expression. Extensive experiments on the public datasets MS‐COCO and Flickr30k demonstrate the effectiveness of the proposed method and show that it outperforms most previous state‐of‐the‐art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
13. Multi‐scale skeleton simplification graph convolutional network for skeleton‐based action recognition.
- Author
- Zhang, Fan, Chongyang, Ding, Liu, Kai, and Hongjin, Liu
- Subjects
- *FEATURE extraction, *SKELETON, *RECOGNITION (Psychology)
- Abstract
Human action recognition based on graph convolutional networks (GCNs) is one of the hotspots in computer vision. However, previous methods generally rely on a handcrafted graph, which limits the effectiveness of the model in characterising the connections between indirectly connected joints. This limitation leads to weakened connections when joints are separated by long distances. To address this issue, the authors propose a skeleton simplification method which aims to reduce the number of joints and the distance between joints by merging adjacent joints into simplified joints. A group convolutional block is devised to extract the internal features of the simplified joints. Additionally, the authors enhance the method by introducing multi‐scale modelling, which maps inputs into sequences across various levels of simplification. Combined with spatial temporal graph convolution, a multi‐scale skeleton simplification GCN for skeleton‐based action recognition (M3S‐GCN) is proposed for fusing multi‐scale skeleton sequences and modelling the connections between joints. Finally, M3S‐GCN is evaluated on five benchmarks of the NTU RGB+D 60 (C‐Sub, C‐View), NTU RGB+D 120 (X‐Sub, X‐Set) and NW‐UCLA datasets. Experimental results show that the authors' M3S‐GCN achieves state‐of‐the‐art performance with accuracies of 93.0%, 97.0% and 91.2% on the C‐Sub, C‐View and X‐Set benchmarks, which validates the effectiveness of the method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
14. A NoisyNet deep reinforcement learning method for frequency regulation in power systems.
- Author
- Zhang, Boming, Iu, Herbert, Zhang, Xinan, and Chau, Tat Kei
- Subjects
- *DEEP reinforcement learning, *NOISE, *BUSES
- Abstract
This study thoroughly investigates the NoisyNet Deep Deterministic Policy Gradient (DDPG) for frequency regulation. Compared with the conventional DDPG method, the suggested method provides several benefits. First, parameter noise explores different strategies more thoroughly and can potentially discover better policies that might be missed if only action noise were used, which helps the actor achieve an optimal control strategy, resulting in enhanced dynamic response. Second, by employing delayed policy updates within the proposed framework, the training process exhibits faster convergence, enabling rapid adaptation to changing disturbances. To substantiate its efficacy, the scheme is subjected to simulation tests on an IEEE three‐area power system, an IEEE 39 bus power system, and an IEEE 68 bus system. A comprehensive performance comparison against other DDPG‐based methods was performed to validate and evaluate the performance of the proposed load frequency control (LFC) scheme. [ABSTRACT FROM AUTHOR]
- Published
- 2024
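The difference between parameter noise and action noise can be made concrete with a small sketch of a noisy linear layer in the NoisyNet style, where each weight is a learned mean plus a learned scale times fresh Gaussian noise. The function name `noisy_linear`, the independent-Gaussian noise variant, and the plain-list weight layout are assumptions for illustration; the abstract does not give the authors' layer definition.

```python
import random


def noisy_linear(x, mu_w, sigma_w, mu_b, sigma_b, rng):
    """Compute y = (mu_w + sigma_w * eps_w) x + (mu_b + sigma_b * eps_b).

    mu_w and sigma_w are lists of rows (one row per output unit);
    eps values are drawn independently from a standard normal distribution.
    """
    outputs = []
    for row_mu, row_sigma, b_mu, b_sigma in zip(mu_w, sigma_w, mu_b, sigma_b):
        total = b_mu + b_sigma * rng.gauss(0.0, 1.0)
        for x_i, w_mu, w_sigma in zip(x, row_mu, row_sigma):
            total += (w_mu + w_sigma * rng.gauss(0.0, 1.0)) * x_i
        outputs.append(total)
    return outputs
```

With all scales set to zero the layer reduces to an ordinary linear layer; with non-zero scales the perturbation acts on the policy's parameters rather than being added to the chosen action afterwards.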
15. Fine‐grained spectrum map inference: A novel approach based on deep residual network.
- Author
- He, Shoushuai, Zhu, Lei, Wang, Lei, Zeng, Weijun, and Qin, Zhen
- Subjects
- *SPECTRUM allocation, *INTERNET radio, *FEATURE extraction, *WIRELESS communications, *DATABASES, *MULTIDIMENSIONAL databases
- Abstract
Spectrum map is a database that stores multidimensional representations of spectrum situation information. It provides support for spectrum sensing and endows wireless communication networks with intelligence. However, the ubiquitous deployment of monitoring devices leads to huge costs of operation and maintenance. It indicates that an approach is needed to reduce the number of monitoring devices, but prevent the degradation of data granularity. Therefore, this paper focuses on the accurate construction of the spectrum map. It aims to infer the fine‐grained spectrum situation of the target region based on coarse‐grained observation. In order to solve this problem, an inference framework based on deep residual network is developed in this paper. In the case of rule deployment for sensing nodes, it adopts the idea of super resolution to improve the accuracy of the spectrum map. The framework is composed of two major parts: an inference network, which generates fine‐grained spectrum maps from coarse‐grained counterparts by using feature extraction module and upsampling construction module; and a fusion network, which considers the influence of environmental factors to further improve the performance. A large number of experiments on simulated datasets verify the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. BTSC: Binary tree structure convolution layers for building interpretable decision‐making deep CNN.
- Author
- Wang, Yuqi, Dai, Dawei, Liu, Da, Xia, Shuyin, and Wang, Guoyin
- Subjects
- ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, COMPUTER vision, DEEP learning, VISUAL fields
- Abstract
Although deep convolutional neural networks (DCNNs) have achieved great success in the computer vision field, such models are considered to lack interpretability in decision‐making. One of the fundamental issues is that their decision mechanism is considered to be a "black‐box" operation. The authors design the binary tree structure convolution (BTSC) module and control the activation level of particular neurons to build an interpretable DCNN model. First, the authors design a BTSC module, in which each parent node generates two independent child layers, and then integrate them into a normal DCNN model. The main advantages of the BTSC are as follows: 1) child nodes of different parent nodes do not interfere with each other; 2) parent and child nodes can inherit knowledge. Second, considering the activation level of neurons, the authors design an information coding objective to guide neural nodes to learn the particular information coding that is expected. The experiments verify that: 1) the decisions made by both the ResNet and DenseNet models can be explained well based on the "decision information flow path" (known as the decision‐path) formed in the BTSC module; 2) the decision‐path can reasonably interpret the decision reversal mechanism (robustness mechanism) of the DCNN model; 3) the credibility of decision‐making can be measured by the matching degree between the actual and expected decision‐path. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Person re‐identification via deep compound eye network and pose repair module.
- Author
- Gu, Hongjian, Zou, Wenxuan, Cheng, Keyang, Wu, Bin, Ghafoor, Humaira Abdul, and Zhan, Yongzhao
- Abstract
Person re‐identification is aimed at searching for specific target pedestrians from non‐intersecting cameras. However, in real complex scenes, pedestrians are easily obscured, which makes the target pedestrian search task time‐consuming and challenging. To address the problem of pedestrians' susceptibility to occlusion, a person re‐identification via deep compound eye network (CEN) and pose repair module is proposed, which includes (1) A deep CEN based on multi‐camera logical topology is proposed, which adopts graph convolution and a Gated Recurrent Unit to capture the temporal and spatial information of pedestrian walking and finally carries out pedestrian global matching through the Siamese network; (2) An integrated spatial‐temporal information aggregation network is designed to facilitate pose repair. The target pedestrian features under the multi‐level logic topology camera are utilised as auxiliary information to repair the occluded target pedestrian image, so as to reduce the impact of pedestrian mismatch due to pose changes; (3) A joint optimisation mechanism of CEN and pose repair network is introduced, where multi‐camera logical topology inference provides auxiliary information and retrieval order for the pose repair network. The authors conducted experiments on multiple datasets, including Occluded‐DukeMTMC, CUHK‐SYSU, PRW, SLP, and UJS‐reID. The results indicate that the authors' method achieved significant performance across these datasets. Specifically, on the CUHK‐SYSU dataset, the authors' model achieved a top‐1 accuracy of 89.1% and a mean Average Precision accuracy of 83.1% in the recognition of occluded individuals. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. Multimodal imbalanced‐data fault diagnosis method based on a dual‐branch interactive fusion network.
- Author
- He, Jing, Yin, Ling, and Sheng, Zhenwen
- Subjects
- *FAULT diagnosis, *DEEP learning, *RELIABILITY in engineering, *DIAGNOSIS methods, *ROTATING machinery
- Abstract
Bearing‐fault diagnosis in rotating machinery is essential for ensuring the safety and reliability of mechanical systems. However, under complicated working conditions, the number of normal mechanical equipment samples can far exceed the number of faulty ones. When the data are so imbalanced, fault diagnosis cannot be easily conducted using conventional deep learning methods. This study proposes a fault diagnosis method based on a dual‐branch interactive fusion network, which improves the accuracy and stability of bearing‐fault diagnosis. First, a dual‐branch feature representation network comprising an iterative attention‐feature fusion residual neural network and a long short‐term memory network is designed for extracting different modal features. Meanwhile, intermodal fusion of the extracted features is performed through a multilayer perceptron. Based on the cost‐sensitive regularization loss, a new joint loss function is then designed for network training. Finally, the effectiveness of the proposed method is verified through comparative experiments, visualization analyses, ablation experiments, and generalization performance experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. Closed‐loop stability analysis of deep reinforcement learning controlled systems with experimental validation.
- Author
- Mohiuddin, Mohammed Basheer, Boiko, Igor, Azzam, Rana, and Zweiri, Yahya
- Subjects
- *DEEP reinforcement learning, *ITERATIVE learning control, *ARTIFICIAL intelligence, *SYSTEM analysis, *POLYNOMIAL approximation
- Abstract
Trained deep reinforcement learning (DRL) based controllers can effectively control dynamic systems where classical controllers can be ineffective and difficult to tune. However, the lack of closed‐loop stability guarantees of systems controlled by trained DRL agents hinders their adoption in practical applications. This research study investigates the closed‐loop stability of dynamic systems controlled by trained DRL agents using Lyapunov analysis based on a linear‐quadratic polynomial approximation of the trained agent. In addition, this work develops an understanding of the system's stability margin to determine operational boundaries and critical thresholds of the system's physical parameters for effective operation. The proposed analysis is verified on a DRL‐controlled system for several simulated and experimental scenarios. The DRL agent is trained using a detailed dynamic model of a non‐linear system and then tested on the corresponding real‐world hardware platform without any fine‐tuning. Experiments are conducted on a wide range of system states and physical parameters and the results have confirmed the validity of the proposed stability analysis (https://youtu.be/QlpeD5sTlPU). [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. An optimization scheme for vehicular edge computing based on Lyapunov function and deep reinforcement learning.
- Author
- Zhu, Lin, Tan, Long, Li, Bingxian, and Tian, Huizi
- Subjects
- *DEEP reinforcement learning, *MOBILE computing, *COMPUTER networks, *EDGE computing, *DIGITAL twins, *VEHICLE routing problem
- Abstract
Traditional vehicular edge computing research usually ignores the mobility of vehicles, the dynamic variability of the vehicular edge environment, the large amount of real‐time data required for vehicular edge computing, the limited resources of edge servers, and collaboration issues. In response to these challenges, this article proposes a vehicular edge computing optimization scheme based on the Lyapunov function and Deep Reinforcement Learning. In this solution, this article uses Digital Twin technology (DT) to simulate the vehicular edge environment. The edge server DT is used to simulate the vehicular edge environment under the edge server, and the base station DT is used to simulate the entire vehicular edge system environment. Based on the real‐time data obtained from DT simulation, this paper defines the Lyapunov function to simplify the migration cost of vehicle tasks between servers into a multi‐objective dynamic optimization problem. It solves the problem by applying the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. Experimental results show that compared with other algorithms, this scheme can effectively optimize the allocation and collaboration of vehicular edge computing resources and reduce the delay and energy consumption caused by vehicle task processing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. Emotion classification with multi‐modal physiological signals using multi‐attention‐based neural network.
- Author
-
Zou, Chengsheng, Deng, Zhen, He, Bingwei, Yan, Maosong, Wu, Jie, and Zhu, Zhaoju
- Subjects
AFFECTIVE computing ,BLOOD volume ,EMOTIONS ,SIGNAL classification ,NETWORK performance - Abstract
The ability to effectively classify human emotion states is critically important for human‐computer or human‐robot interactions. However, emotion classification with physiological signals is still a challenging problem due to the diversity of emotion expression and the characteristic differences among signals of different modalities. A novel learning‐based network architecture is presented that exploits four physiological signal modalities (electrocardiogram, electrodermal activity, electromyography, and blood volume pulse) to classify emotion states. It features two kinds of attention modules, feature‐level and semantic‐level, which drive the network to focus on the information‐rich features by mimicking the human attention mechanism. The feature‐level attention module encodes the rich information of each physiological signal, while the semantic‐level attention module captures the semantic dependencies among modalities. The performance of the designed network is evaluated with the open‐source Wearable Stress and Affect Detection dataset. The developed emotion classification system achieves an accuracy of 83.88%. The results demonstrate that the proposed network can effectively process four‐modal physiological signals and achieve high emotion classification accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. Predicting the quality of user contributions via LSTMs
- Author
-
Agrawal, R and de Alfaro, L
- Subjects
LSTM ,Neural Nets ,Machine Learning ,User Reputation ,Reputation Systems ,Wikipedia ,Vandalism Detection - Abstract
In many collaborative systems it is useful to automatically estimate the quality of new contributions; the estimates can be used for instance to flag contributions for review. To predict the quality of a contribution by a user, it is useful to take into account both the characteristics of the revision itself, and the past history of contributions by that user. In several approaches, the user's history is first summarized into a number of features, such as number of contributions, user reputation, time from previous revision, and so forth. These features are then passed along with features of the current revision to a machine-learning classifier, which outputs a prediction for the user contribution. The summarization step is used because the usual machine learning models, such as neural nets, SVMs, etc. rely on a fixed number of input features. We show in this paper that this manual selection of summarization features can be avoided by adopting machine-learning approaches that are able to cope with temporal sequences of input. In particular, we show that Long Short-Term Memory (LSTM) neural nets are able to process directly the variable-length history of a user's activity in the system, and produce an output that is highly predictive of the quality of the next contribution by the user. Our approach does not eliminate the process of feature selection, which is present in all machine learning. Rather, it eliminates the need for deciding which features from a user's past are most useful for predicting the future: we can simply pass to the machine-learning apparatus all the past, and let it come up with an estimate for the quality of the next contribution. We present models combining LSTM and NN for predicting revision quality and show that the prediction accuracy attained is far superior to the one obtained using the NN alone.
More interestingly, we also show that the prediction attained is superior to the one obtained using user reputation as a feature summarizing the quality of a user's past work. This can be explained by noting that the primary function of user reputation is to provide an incentive towards performing useful contributions, rather than to be a feature optimized for prediction of future contribution quality. We also show that the LSTM output changes in a natural way in response to user behavior, increasing when the user performs a sequence of good quality contributions, and decreasing when the user performs a sequence of low-quality work. The LSTM output for a user could thus be usefully shown to other users, alongside the user's reputation and other information.
- Published
- 2023
23. Energy‐based PINNs for solving coupled field problems: Concepts and application to the multi‐objective optimal design of an induction heater
- Author
-
Marco Baldan and Paolo Di Barba
- Subjects
design engineering ,finite element analysis ,induction heating ,neural nets ,Pareto optimisation ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Abstract Physics‐informed neural networks (PINNs) are neural networks (NNs) that directly encode model equations, like Partial Differential Equations (PDEs), in the network itself. While most of the PINN algorithms in the literature minimize the local residual of the governing equations, there are energy‐based approaches that take a different path by minimizing the variational energy of the model. It is shown that in the case of the steady thermal equation weakly coupled to the magnetic equation, the energy‐based approach displays multiple advantages compared to the standard residual‐based PINN: it is more computationally efficient, it requires a lower order of derivatives to compute, and it involves fewer hyperparameters. The analyzed benchmark problems are the single‐ and multi‐objective optimal design of an inductor for the controlled heating of a graphite plate. The optimized device is designed by solving a multi‐physics problem: a time‐harmonic magnetic problem and a steady thermal problem. For the former, a deep neural network solving the direct problem is trained in a supervised manner on Finite Element Analysis (FEA) data. In turn, the solution of the latter relies on a hypernetwork that takes as input the inductor geometry parameters and outputs the model weights of an energy‐based PINN (or ePINN). Eventually, the ePINN predicts the temperature field within the graphite plate.
- Published
- 2024
24. RIHINNet: A robust image hiding method against JPEG compression based on invertible neural network
- Author
-
Xin Jin, Chengyi Pan, Zien Cheng, Yunyun Dong, and Qian Jiang
- Subjects
image processing ,neural nets ,steganography ,Photography ,TR1-1050 ,Computer software ,QA76.75-76.765 - Abstract
Abstract Image hiding is a task that embeds secret images in digital images without being detected. The performance of image hiding has been greatly improved by using the invertible neural network. However, current image hiding methods are less robust in the face of Joint Photographic Experts Group (JPEG) compression. The secret image cannot be extracted from the stego image after JPEG compression of the stego image. Some methods show good robustness for some certain JPEG compression quality factors but poor robustness for other common JPEG compression quality factors. An image‐hiding network (RIHINNet) that is robust to all common JPEG compression quality factors is proposed. First of all, the loss function is redesigned; thus, the secret image is hidden as much as possible in the area that is less likely to be changed after JPEG compression. Second, the classifier is designed, which can help the model to select the extractor according to the range of JPEG compression degree. Finally, the interval robustness of the secret image extraction is improved through the design of a denoising module. Experimental results show that this RIHINNet outperforms other state‐of‐the‐art image‐hiding methods in the face of JPEG compressed noise with random compression quality factors, with more than 10 dB peak signal‐to‐noise ratio improvement in secret image recovery on ImageNet, COCO and DIV2K datasets.
- Published
- 2024
25. Recognition of vehicle license plates in highway scenes with deep fusion network and connectionist temporal classification
- Author
-
Liru Hua, Xinyi Ma, Chihang Zhao, Bailing Zhang, Zijun Su, and Yuhang Wu
- Subjects
computer vision ,image classification ,image recognition ,neural nets ,pattern recognition ,Photography ,TR1-1050 ,Computer software ,QA76.75-76.765 - Abstract
Abstract License plate recognition is crucial in Intelligent Transportation Systems (ITS) for vehicle management, traffic monitoring, and security inspection. In highway scenarios, this task faces challenges such as diversity, blurriness, occlusion, and illumination variation of license plates. This article explores Recurrent Neural Networks based on Connectionist Temporal Classification (RNN‐CTC) for license plate recognition in challenging highway conditions. Four neural network models: ResNet50, ResNeXt, InceptionV3, and SENet, all combined with RNN‐CTC are comparatively evaluated. Furthermore, a novel architecture named ResNet50 Deep Fusion Network using Connectionist Temporal Classification (ResNet50‐DFN‐CTC) is proposed. Comparative and ablation experiments are conducted using the Highway License Plate Dataset of Southeast University (HLPD‐SU). Results demonstrate the superior performance of ResNet50‐DFN‐CTC in challenging highway conditions, achieving 93.158% accuracy with a processing time of 7.91 ms, outperforming other tested models. This research contributes to advancing license plate recognition technology for real‐world highway applications under adverse conditions.
- Published
- 2024
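The abstract of entry 25 relies on Connectionist Temporal Classification (CTC) to turn per‐frame network outputs into a plate string. As a minimal sketch of the standard greedy (best‐path) CTC decoding rule only, not the authors' ResNet50‐DFN‐CTC pipeline: take the best class per frame, merge consecutive repeats, then drop blanks. The function name, the blank index of 0, and the toy alphabet are assumptions made for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (illustrative choice)


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Greedy CTC decoding over a list of per-frame score rows.

    frame_scores: list of rows, one row of class scores per time step.
    alphabet: list mapping class index -> character (blank slot is unused).
    """
    # 1. Pick the highest-scoring class at every time step.
    best = [max(range(len(row)), key=row.__getitem__) for row in frame_scores]
    # 2. Merge runs of identical consecutive predictions.
    collapsed = [k for k, _ in groupby(best)]
    # 3. Remove blanks; a blank between two equal labels keeps both of them.
    return "".join(alphabet[k] for k in collapsed if k != blank)


if __name__ == "__main__":
    alphabet = ["-", "A", "B", "1"]          # hypothetical 3-character alphabet
    path = [1, 1, 0, 1, 2, 2, 0, 3]          # best class per frame
    frames = [[1.0 if j == k else 0.0 for j in range(len(alphabet))] for k in path]
    print(ctc_greedy_decode(frames, alphabet))
```

The blank between the two runs of class 1 is what lets a repeated character survive the merge step, which matters for plates containing doubled characters.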
26. Super‐resolution reconstruction algorithm for medical images by fusion of wavelet transform and multi‐scale adaptive feature selection
- Author
-
QiaoSu Wang and Qiaomei Ma
- Subjects
computer vision ,convolutional neural nets ,discrete wavelet transforms ,image resolution ,medical image processing ,neural nets ,Photography ,TR1-1050 ,Computer software ,QA76.75-76.765 - Abstract
Abstract Conventional computed tomography (CT) images often suffer from blurred edges and unclear details. Image super‐resolution methods can significantly enhance CT image quality, thereby improving diagnostic accuracy. To better extract detailed features and enhance the cascading effects of different feature levels, we propose a novel medical image super‐resolution algorithm that integrates discrete wavelet transform and multi‐scale adaptive feature selection. Our approach uses both the low‐resolution image and its high‐frequency component from the frequency domain as network inputs, with the high‐frequency component providing learning supervision, which enhances detail fidelity in reconstruction. Additionally, we introduce a multi‐scale adaptive feature selection module to learn from different layers of CT images and their inter‐layer correlations. Finally, the pixel information is efficiently integrated by a coordinate attention mechanism incorporating the concept of squeeze excitation. Experimental results show that our method outperforms state‐of‐the‐art methods, achieving superior reconstruction at scale factors of 2, 4, and 8, especially at scale factor 8, where it surpasses others by 1.12 in PSNR, 0.0145 in SSIM, and 0.0038 in LPIPS. Visually, our method also delivers more accurate details and better perceptual quality.
- Published
- 2024
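Entry 26 reports its gains in PSNR, among other metrics. For readers unfamiliar with the metric, the following is a minimal sketch of the usual peak signal‐to‐noise ratio definition, PSNR = 10·log10(MAX² / MSE), applied to flattened pixel lists. The function name and the default peak value of 255 (8‐bit images) are assumptions; the abstract does not state how the authors computed the metric.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    # Two-pixel toy example with an error of 1 grey level per pixel.
    print(round(psnr([10, 20], [11, 21]), 2))
```

Because the scale is logarithmic, an improvement of about 1 dB corresponds to roughly a 20% reduction in mean squared error.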
27. Graph neural networks as strategic transport modelling alternative ‐ A proof of concept for a surrogate
- Author
-
Santhanakrishnan Narayanan, Nikita Makarov, and Constantinos Antoniou
- Subjects
neural nets ,strategic planning ,transport modelling and microsimulation ,Transportation engineering ,TA1001-1280 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Abstract Practical applications of graph neural networks (GNNs) in transportation are still a niche field. There exists a significant overlap between the potential of GNNs and the issues in strategic transport modelling. However, it is not clear whether GNN surrogates can overcome (some of) the prevalent issues. Investigation of such a surrogate will show their advantages and disadvantages, especially throwing light on their potential to replace complex transport modelling approaches in the future, such as the agent‐based models. In this direction, as a pioneering work, this paper studies the plausibility of developing a GNN surrogate for the classical four‐step approach, one of the established strategic transport modelling approaches. A formal definition of the surrogate is presented, and an augmented data generation procedure is introduced. The network of the Greater Munich metropolitan region is used for the necessary data generation. The experimental results show that GNNs have the potential to act as transport planning surrogates and that deeper GNNs perform better than their shallow counterparts. Nevertheless, as expected, they suffer performance degradation with an increase in network size. Future research should dive deeper into formulating new GNN approaches that are able to generalize to arbitrarily large networks.
- Published
- 2024
28. Active sonar target recognition method based on multi‐domain transformations and attention‐based fusion network
- Author
-
Qingcui Wang, Shuanping Du, Wei Zhang, and Fangyong Wang
- Subjects
echo ,feature extraction ,neural nets ,sonar target recognition ,Telecommunication ,TK5101-6720 - Abstract
Abstract The classification and recognition of underwater targets by an active sonar system remain challenging and complex. Traditional methods have limited classification performance in time and spatially varying ocean channels. An active sonar target recognition method is proposed based on multi‐domain transformations and an attention‐based fusion network. Initially, the active target echo undergoes time‐frequency analysis, auditory signal processing, and matched filtering to represent target attributes in joint spatial‐time‐frequency domains. Subsequently, multiple attention‐based fusion models fuse the multi‐domain transformations either early or late in the processing stages. An attention module further enhances significant feature channels through adaptive weight assignment. Experiment results demonstrate that the recognition accuracy of active sonar echoes using multi‐domain transformations improves significantly compared to that of single‐domain methods, with an increase of up to 10.5%. The incorporation of multiple transformation domains provides complementary information about the target, thereby enhancing the network's representation ability, especially with limited data samples. Furthermore, the findings indicate that feature fusion of multiple transformations in a high‐level feature space yields more informative and effective results for active sonar echoes compared to low‐level feature spaces.
- Published
- 2024
29. BTSC: Binary tree structure convolution layers for building interpretable decision‐making deep CNN
- Author
-
Yuqi Wang, Dawei Dai, Da Liu, Shuyin Xia, and Guoyin Wang
- Subjects
deep learning ,deep neural networks ,neural nets ,pattern classification ,Computational linguistics. Natural language processing ,P98-98.5 ,Computer software ,QA76.75-76.765 - Abstract
Abstract Although deep convolution neural networks (DCNNs) have achieved great success in the computer vision field, such models are considered to lack interpretability in decision‐making. One of the fundamental issues is that their decision mechanism is considered to be a “black‐box” operation. The authors design the binary tree structure convolution (BTSC) module and control the activation level of particular neurons to build the interpretable DCNN model. First, the authors design a BTSC module, in which each parent node generates two independent child layers, and then integrate them into a normal DCNN model. The main advantages of the BTSC are as follows: 1) child nodes of the different parent nodes do not interfere with each other; 2) parent and child nodes can inherit knowledge. Second, considering the activation level of neurons, the authors design an information coding objective to guide neural nodes to learn the particular information coding that is expected. Through the experiments, the authors can verify that: 1) the decision‐making made by both the ResNet and DenseNet models can be explained well based on the "decision information flow path" (known as the decision‐path) formed in the BTSC module; 2) the decision‐path can reasonably interpret the decision reversal mechanism (Robustness mechanism) of the DCNN model; 3) the credibility of decision‐making can be measured by the matching degree between the actual and expected decision‐path.
- Published
- 2024
30. HIST: Hierarchical and sequential transformer for image captioning
- Author
-
Feixiao Lv, Rui Wang, Lihua Jing, and Pengwen Dai
- Subjects
computer vision ,feature extraction ,learning (artificial intelligence) ,neural nets ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Computer software ,QA76.75-76.765 - Abstract
Abstract Image captioning aims to automatically generate a natural language description of a given image, and most state‐of‐the‐art models have adopted an encoder–decoder transformer framework. Such transformer structures, however, show two main limitations in the task of image captioning. Firstly, the traditional transformer obtains high‐level fusion features to decode while ignoring other‐level features, resulting in losses of image content. Secondly, the transformer is weak in modelling the natural order characteristics of language. To address these issues, the authors propose a HIerarchical and Sequential Transformer (HIST) structure, which forces each layer of the encoder and decoder to focus on features of different granularities, and strengthens the sequentially semantic information. Specifically, to capture the details of different levels of features in the image, the authors combine the visual features of multiple regions and divide them into multiple levels differently. In addition, to enhance the sequential information, the sequential enhancement module in each decoder layer block extracts different levels of features for sequentially semantic extraction and expression. Extensive experiments on the public datasets MS‐COCO and Flickr30k have demonstrated the effectiveness of the proposed method, and show that the authors’ method outperforms most previous state‐of‐the‐art methods.
- Published
- 2024
31. Multi‐scale skeleton simplification graph convolutional network for skeleton‐based action recognition
- Author
-
Fan Zhang, Ding Chongyang, Kai Liu, and Liu Hongjin
- Subjects
computer vision ,convolution ,feature extraction ,neural net architecture ,neural nets ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Computer software ,QA76.75-76.765 - Abstract
Abstract Human action recognition based on graph convolutional networks (GCNs) is one of the hotspots in computer vision. However, previous methods generally rely on handcrafted graphs, which limits the effectiveness of the model in characterising the connections between indirectly connected joints. The limitation leads to weakened connections when joints are separated by long distances. To address the above issue, the authors propose a skeleton simplification method which aims to reduce the number of joints and the distance between joints by merging adjacent joints into simplified joints. A group convolutional block is devised to extract the internal features of the simplified joints. Additionally, the authors enhance the method by introducing multi‐scale modelling, which maps inputs into sequences across various levels of simplification. Combined with spatial temporal graph convolution, a multi‐scale skeleton simplification GCN for skeleton‐based action recognition (M3S‐GCN) is proposed for fusing multi‐scale skeleton sequences and modelling the connections between joints. Finally, M3S‐GCN is evaluated on five benchmarks of NTU RGB+D 60 (C‐Sub, C‐View), NTU RGB+D 120 (X‐Sub, X‐Set) and NW‐UCLA datasets. Experimental results show that the authors’ M3S‐GCN achieves state‐of‐the‐art performance with the accuracies of 93.0%, 97.0% and 91.2% on C‐Sub, C‐View and X‐Set benchmarks, which validates the effectiveness of the method.
- Published
- 2024
32. Wind energy system fault classification using deep CNN and improved PSO‐tuned extreme gradient boosting
- Author
-
Chun‐Yao Lee and Edu Daryl C. Maceren
- Subjects
fault diagnosis ,neural nets ,particle swarm optimisation ,wind turbine technology and control ,Renewable energy sources ,TJ807-830 - Abstract
Abstract Intelligent fault diagnosis for wind energy systems requires identifying unique characteristics to differentiate various fault types effectively, even when data discrepancy occurs due to the unpredictable and dynamic nature of its environment. This article addresses some of the challenges of fault classification in wind energy systems by proposing an integrated approach that combines deep learning features with a resampled supervisory control and data acquisition (SCADA) dataset. The methodology involves resampling the imbalanced SCADA dataset using synthetic minority oversampling technique (SMOTE) and near‐miss undersampling techniques, extracting deep learning features using deep convolutional neural network, and feeding them into an XGBoost (extreme gradient boosting) classifier with tuned parameters using adaptive elite‐particle swarm optimization (AEPSO). The effectiveness of the proposed method is demonstrated through validation conducted on a different imbalanced dataset showing superior performance metrics in terms of accuracy. Additionally, the study contributes to methodological advancements in wind turbine fault diagnosis by providing a rigorous framework for fault classification. It is confirmed that utilizing the extracted deep learning features into the resampled data can significantly affect the classification performance metrics. Furthermore, the proposed integrated approach shows significance for fault diagnosis enhancement in wind energy systems and advancing the field towards more efficient and reliable operation.
- Published
- 2024
33. Research on image saliency detection based on deep neural network
- Author
-
Linrun Qiu, Dongbo Zhang, and Yingkun Hu
- Subjects
edge detection ,feature extraction ,image matching ,neural nets ,Photography ,TR1-1050 ,Computer software ,QA76.75-76.765 - Abstract
Abstract Computer vision, currently a hot research field, is devoted to the rapid acquisition and application of target information from images or videos by simulating the human visual mechanism. To improve the accuracy and efficiency of image detection, image saliency region detection has received increasing attention in computer vision research; its core lies in algorithms for feature extraction and saliency calculation of targets. This paper analyzes the multi‐feature fusion saliency detection model and the visual saliency calculation process and, building on the existing algorithm, proposes a fully convolutional network saliency detection algorithm by improving the VGG16 network. The qualitative and quantitative experimental results show that compared with the four mainstream methods of BL, GS, SF, and RFCN, our algorithm not only improves the accuracy of salient object detection, but also effectively solves the problem of target edge blur. Therefore, this study has improved the accuracy and efficiency of saliency detection, which can not only promote the development of computer vision technology, but also provide support for research in the field of image processing.
- Published
- 2024
34. A novel density‐based representation for point cloud and its ability to facilitate classification
- Author
-
Xianlin Xie and Xue‐song Tang
- Subjects
image classification ,image processing ,neural nets ,Photography ,TR1-1050 ,Computer software ,QA76.75-76.765 - Abstract
Abstract Currently, in the field of processing 3D point cloud data, two primary representation methods have emerged: point‐based methods and voxel‐based methods. However, the former suffer from significant computational costs and lack the ease of handling exhibited by voxel‐based methods. Conversely, the latter often encounter challenges related to information loss resulting from downsampling operations, thereby impeding subsequent tasks. To address these limitations, this article introduces a novel density‐based representation method for voxel partitioning. Additionally, a corresponding network structure is devised to extract features from this specific density representation, thereby facilitating the successful completion of classification tasks. Experiments on ModelNet40 and MNIST demonstrate that the proposed 3D convolution can achieve state‐of‐the‐art performance among voxel‐based methods.
- Published
- 2024
35. Fine‐grained spectrum map inference: A novel approach based on deep residual network
- Author
-
Shoushuai He, Lei Zhu, Lei Wang, Weijun Zeng, and Zhen Qin
- Subjects
neural nets ,radio spectrum management ,Telecommunication ,TK5101-6720 - Abstract
Abstract Spectrum map is a database that stores multidimensional representations of spectrum situation information. It provides support for spectrum sensing and endows wireless communication networks with intelligence. However, the ubiquitous deployment of monitoring devices leads to huge costs of operation and maintenance. It indicates that an approach is needed to reduce the number of monitoring devices, but prevent the degradation of data granularity. Therefore, this paper focuses on the accurate construction of the spectrum map. It aims to infer the fine‐grained spectrum situation of the target region based on coarse‐grained observation. In order to solve this problem, an inference framework based on deep residual network is developed in this paper. In the case of rule deployment for sensing nodes, it adopts the idea of super resolution to improve the accuracy of the spectrum map. The framework is composed of two major parts: an inference network, which generates fine‐grained spectrum maps from coarse‐grained counterparts by using feature extraction module and upsampling construction module; and a fusion network, which considers the influence of environmental factors to further improve the performance. A large number of experiments on simulated datasets verify the effectiveness of the proposed method.
- Published
- 2024
36. A NoisyNet deep reinforcement learning method for frequency regulation in power systems
- Author
-
Boming Zhang, Herbert Iu, Xinan Zhang, and Tat Kei Chau
- Subjects
neural nets ,power system control ,Distribution or transmission of electric power ,TK3001-3521 ,Production of electric energy or power. Powerplants. Central stations ,TK1001-1841 - Abstract
Abstract This study thoroughly investigates the NoisyNet Deep Deterministic Policy Gradient (DDPG) for frequency regulation. Compared with the conventional DDPG method, the suggested method can provide several benefits. First, the parameter noise explores different strategies more thoroughly and can potentially discover better policies that might be missed if only action noise were used, which helps the actor achieve an optimal control strategy, resulting in enhanced dynamic response. Second, by employing delayed policy updates within the proposed framework, the training process exhibits faster convergence, enabling rapid adaptation to changing disturbances. To substantiate its efficacy, the scheme is subjected to simulation tests on an IEEE three‐area power system, an IEEE 39 bus power system, and an IEEE 68 bus system. A comprehensive performance comparison was performed against other DDPG‐based methods to validate and evaluate the performance of the proposed load frequency control (LFC) scheme.
- Published
- 2024
37. Concurrent PV production and consumption load forecasting using CT‐Transformer deep learning to estimate energy system flexibility
- Author
-
Mohammad Zarghami, Taher Niknam, Jamshid Aghaei, and Azita Hatami Nezhad
- Subjects
learning systems ,load forecasting ,neural nets ,solar photovoltaic systems ,Renewable energy sources ,TJ807-830 - Abstract
Abstract The integration of renewable energy sources (RESs) into power systems has increased significantly due to technical, economic, and environmental factors, necessitating greater flexibility to manage variable consumption loads and renewable energy generation. Accurate forecasting of solar energy production and consumption load is critical for enhancing power system flexibility. This study introduces a novel deep learning model, a spatial‐temporal hybrid convolutional‐transformer (CT‐Transformer) network with unique features and extended memory capacity. Additionally, a flexibility index (FI) is introduced to evaluate power system flexibility (PSF) based on the forecasting results. The CT‐Transformer forecasts PSF for the next 24 and 168 hours, using the FI to evaluate PSF based on forecasting results. The input data includes meteorological, solar energy production, load demand, and pricing data from France, comprising hourly data from 2015 and 2016 (about 17,520 entries). Data preprocessing involves correcting incomplete and irrelevant segments. The CT‐Transformer's performance is compared to other deep learning techniques, showing superior results with the lowest prediction error (2.5%) and a maximum error of 10.1% (MAE). It also achieved a prediction error of 0.08% for system flexibility, compared to the highest error of 0.96%. This research highlights the CT‐Transformer's potential for accurate RES and load forecasting and PSF evaluation.
- Published
- 2024
38. Quantification of finger grasps during activities of daily life using convolutional neural networks: A pilot study
- Author
-
Manuela Paulina Trejo Ramírez, Callum John Thornton, Neil Darren Evans, and Michael John Chappell
- Subjects
biomechanics ,neural nets ,prosthetics ,Medical technology ,R855-855.5 - Abstract
Abstract Quantifying finger kinematics can improve the authors’ understanding of finger function and facilitate the design of efficient prosthetic devices while also identifying movement disorders and assessing the impact of rehabilitation interventions. Here, the authors present a study that quantifies grasps depicted in taxonomies during selected Activities of Daily Living (ADL). A single participant held a series of standard objects using specific grasps which were used to train Convolutional Neural Networks (CNN) for each of the four fingers individually. The experiment also recorded hand manipulation of objects during ADL. Each set of ADL finger kinematic data was tested using the trained CNN, which identified and quantified the grasps required to accomplish each task. Certain grasps appeared more often depending on the finger studied, meaning that even though there are physiological interdependencies, fingers have a certain degree of autonomy in performing dexterity tasks. The identified and most frequent grasps agreed with the previously reported findings, but also highlighted that an individual might have specific dexterity needs which may vary with profession and age. The proposed method can be used to identify and quantify key grasps for finger/hand prostheses, to provide a more efficient solution that is practical in their day‐to‐day tasks.
- Published
- 2024
39. Person re‐identification via deep compound eye network and pose repair module
- Author
-
Hongjian Gu, Wenxuan Zou, Keyang Cheng, Bin Wu, Humaira Abdul Ghafoor, and Yongzhao Zhan
- Subjects
computer vision, neural nets, Computer applications to medicine. Medical informatics, R858-859.7, Computer software, QA76.75-76.765
- Abstract
Person re‐identification aims to search for specific target pedestrians across non‐intersecting cameras. However, in real complex scenes, pedestrians are easily obscured, which makes the target pedestrian search task time‐consuming and challenging. To address the problem of pedestrians' susceptibility to occlusion, a person re‐identification method based on a deep compound eye network (CEN) and a pose repair module is proposed. It includes: (1) a deep CEN based on multi‐camera logical topology, which adopts graph convolution and a Gated Recurrent Unit to capture the temporal and spatial information of pedestrian walking and finally carries out pedestrian global matching through the Siamese network; (2) an integrated spatial‐temporal information aggregation network designed to facilitate pose repair, in which the target pedestrian features under the multi‐level logic topology camera are utilised as auxiliary information to repair the occluded target pedestrian image, so as to reduce the impact of pedestrian mismatch due to pose changes; (3) a joint optimisation mechanism of the CEN and the pose repair network, where multi‐camera logical topology inference provides auxiliary information and retrieval order for the pose repair network. The authors conducted experiments on multiple datasets, including Occluded‐DukeMTMC, CUHK‐SYSU, PRW, SLP, and UJS‐reID. The results indicate that the authors’ method achieved significant performance across these datasets. Specifically, on the CUHK‐SYSU dataset, the authors’ model achieved a top‐1 accuracy of 89.1% and a mean Average Precision of 83.1% in the recognition of occluded individuals.
- Published
- 2024
40. Feature selection algorithm for substation main equipment defect text mining based on natural language processing
- Author
-
Xiaoqing Mai, Tianhu Zhang, Changwu Hu, and Yan Zhang
- Subjects
neural nets, power grids, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
- Abstract
The dimension of the relevant text feature space and feature weight of substation main equipment defect information is high, so it is difficult to accurately select mining features. The Natural Language Processing (NLP) medium and short‐term neural network model is used to realise word segmentation of the defect information text features in the log. After the text features of defect information of main substation equipment with high categories are extracted to form the feature space, the TF‐IDF algorithm is used to calculate the importance weight of text keywords, judge the criticality of the defect information text feature vocabulary, accurately locate defect information text features, and realise defect information text feature mining. Experiments show that the algorithm has high precision for specific word segmentation of massive substation main equipment log information.
- Published
- 2024
41. Emotion classification with multi‐modal physiological signals using multi‐attention‐based neural network
- Author
-
Chengsheng Zou, Zhen Deng, Bingwei He, Maosong Yan, Jie Wu, and Zhaoju Zhu
- Subjects
affective computing, neural net architecture, neural nets, Computer engineering. Computer hardware, TK7885-7895, Computer applications to medicine. Medical informatics, R858-859.7
- Abstract
The ability to effectively classify human emotion states is critically important for human‐computer or human‐robot interactions. However, emotion classification with physiological signals is still a challenging problem due to the diversity of emotion expression and the characteristic differences in different modal signals. A novel learning‐based network architecture is presented that exploits four‐modal physiological signals (electrocardiogram, electrodermal activity, electromyography, and blood volume pulse) to classify emotion states. It features two kinds of attention modules, feature‐level and semantic‐level, which drive the network to focus on the information‐rich features by mimicking the human attention mechanism. The feature‐level attention module encodes the rich information of each physiological signal, while the semantic‐level attention module captures the semantic dependencies among modalities. The performance of the designed network is evaluated with the open‐source Wearable Stress and Affect Detection dataset. The developed emotion classification system achieves an accuracy of 83.88%. Results demonstrated that the proposed network could effectively process four‐modal physiological signals and achieve high accuracy of emotion classification.
- Published
- 2024
42. Multimodal imbalanced‐data fault diagnosis method based on a dual‐branch interactive fusion network
- Author
-
Jing He, Ling Yin, and Zhenwen Sheng
- Subjects
data analysis, fault diagnosis, neural nets, sensor fusion, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Bearing‐fault diagnosis in rotating machinery is essential for ensuring the safety and reliability of mechanical systems. However, under complicated working conditions, the number of normal mechanical equipment samples can far exceed the number of faulty ones. When the data are so imbalanced, fault diagnosis cannot be easily conducted using conventional deep learning methods. This study proposes a fault diagnosis method based on a dual‐branch interactive fusion network, which improves the accuracy and stability of bearing‐fault diagnosis. First, a dual‐branch feature representation network comprising an iterative attention‐feature fusion residual neural network and a long short‐term memory network is designed for extracting different modal features. Meanwhile, intermodal fusion of the extracted features is performed through a multilayer perceptron. Based on the cost‐sensitive regularization loss, a new joint loss function is then designed for network training. Finally, the effectiveness of the proposed method is verified through comparative experiments, visualization analyses, ablation experiments, and generalization performance experiments.
- Published
- 2024
43. Closed‐loop stability analysis of deep reinforcement learning controlled systems with experimental validation
- Author
-
Mohammed Basheer Mohiuddin, Igor Boiko, Rana Azzam, and Yahya Zweiri
- Subjects
control system analysis, cranes, iterative learning control, learning (artificial intelligence), learning systems, neural nets, Control engineering systems. Automatic machinery (General), TJ212-225
- Abstract
Trained deep reinforcement learning (DRL) based controllers can effectively control dynamic systems where classical controllers can be ineffective and difficult to tune. However, the lack of closed‐loop stability guarantees of systems controlled by trained DRL agents hinders their adoption in practical applications. This research study investigates the closed‐loop stability of dynamic systems controlled by trained DRL agents using Lyapunov analysis based on a linear‐quadratic polynomial approximation of the trained agent. In addition, this work develops an understanding of the system's stability margin to determine operational boundaries and critical thresholds of the system's physical parameters for effective operation. The proposed analysis is verified on a DRL‐controlled system for several simulated and experimental scenarios. The DRL agent is trained using a detailed dynamic model of a non‐linear system and then tested on the corresponding real‐world hardware platform without any fine‐tuning. Experiments are conducted on a wide range of system states and physical parameters and the results have confirmed the validity of the proposed stability analysis (https://youtu.be/QlpeD5sTlPU).
- Published
- 2024
44. A new fault location method for high‐voltage transmission lines based on ICEEMDAN‐MSA‐ConvGRU model
- Author
-
Taorong Jia, Lixiao Yao, and Guoqing Yang
- Subjects
artificial intelligence, fault location, neural nets, power distribution lines, power overhead lines, Distribution or transmission of electric power, TK3001-3521, Production of electric energy or power. Powerplants. Central stations, TK1001-1841
- Abstract
Given the complex form of distribution line faults, the accuracy of fault location using traditional artificial intelligence networks needs to be further improved. Here, a combined fault location method is proposed for a 110 kV distribution line based on the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN), mantis search algorithm (MSA), and convolutional gated recurrent unit (ConvGRU). First, the study used the ICEEMDAN algorithm to decompose the signals and discard the high‐frequency signals with low correlation in order to cancel noise. Then, the study used the root mean square error (RMSE) of the ConvGRU model training as the adaptation value, optimized the internal parameters of the model using the MSA algorithm, and obtained a combined fault locating model. Using the proposed model, the effects of the fault form and transition impedance changes on the location accuracy were analysed, and the location accuracy was compared with other artificial intelligence methods. The location accuracy index showed that the training error of the proposed model converged faster than that of the traditional model. Also, the RMSE of the localization results was reduced by 50%, giving a higher fault location accuracy.
- Published
- 2024
45. Using conditional Invertible Neural Networks to perform mid‐term peak load forecasting
- Author
-
Benedikt Heidrich, Matthias Hertel, Oliver Neumann, Veit Hagenmeyer, and Ralf Mikut
- Subjects
artificial intelligence and data analytics, load forecasting, neural nets, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Measures for balancing the electrical grid, such as peak shaving, require accurate peak forecasts for lower aggregation levels of electrical loads. Thus, the Big Data Energy Analytics Laboratory (BigDEAL) challenge, organised by the BigDEAL, focused on forecasting three different daily peak characteristics in low aggregated load time series. In particular, participants of the challenge were asked to provide long‐term forecasts with horizons of up to 1 year in the qualification. The authors present the approach of the KIT‐IAI team from the Institute for Automation and Applied Informatics at the Karlsruhe Institute of Technology. The approach to the challenge is based on a hybrid generative model. In particular, the authors use a conditional Invertible Neural Network (cINN). The cINN receives the forecast of a sliding mean as a representative of the trend, different weather features, and calendar information as conditioning input. With this approach, the proposed hybrid method achieved second place overall and won two out of three tracks of the BigDEAL challenge.
- Published
- 2024
46. Adaptive soft threshold transformer for radar high‐resolution range profile target recognition
- Author
-
Siyu Chen, Xiaohong Huang, and Weibo Xu
- Subjects
artificial intelligence, neural nets, noise, object recognition, radar, radar signal processing, Telecommunication, TK5101-6720
- Abstract
Radar High‐Resolution Range Profile (HRRP) has great potential for target recognition because it can provide target structural information. Existing work commonly applies deep learning to extract deep features from HRRPs and achieves impressive recognition performance. However, most approaches are unable to distinguish between the target and non‐target regions in the feature extraction process and do not fully consider the impact of background noise, which is harmful to recognition, especially at low signal‐to‐noise ratios (SNR). To tackle these problems, the authors propose a radar HRRP target recognition framework termed Adaptive Soft Threshold Transformer (ASTT), which is composed of a patch embedding (PE) layer, ASTT blocks, and Discrete Wavelet Patch Merging (DWPM) layers. Given the limited semantic information of individual range cells, the PE layer integrates nearby isolated range cells into semantically explicit target structure patches. Thanks to its convolutional layer and attention mechanism, each ASTT block assigns a weight to each patch to locate the target areas in the HRRP while capturing local features and constructing sequence correlations. Moreover, the ASTT block efficiently filters noise features in combination with a soft threshold function to further enhance the recognition performance at low SNR, where the threshold is adaptively determined. Utilising the reversibility of the discrete wavelet transform, the DWPM layer efficiently eliminates the loss of valuable information during the pooling process. Experiments based on simulated and measured datasets show that the proposed method has excellent target recognition performance, noise robustness, and small‐scale range shift robustness.
- Published
- 2024
47. Dynamic spatial‐temporal network for traffic forecasting based on joint latent space representation
- Author
-
Qian Yu, Liang Ma, Pei Lai, and Jin Guo
- Subjects
intelligent transportation systems, traffic modeling, management and control, neural nets, Transportation engineering, TA1001-1280, Electronic computers. Computer science, QA75.5-76.95
- Abstract
In the era of data‐driven transportation development, traffic forecasting is crucial. Existing studies either ignore the inherent spatial structure of the traffic network or ignore the global spatial correlation, and so may not capture the spatial relationships adequately. In this work, a Dynamic Spatial‐Temporal Network (DSTN) based on Joint Latent Space Representation (JLSR) is proposed for traffic forecasting. Specifically, in the spatial dimension, a JLSR network is developed by integrating graph convolution and spatial attention operations to model complex spatial dependencies. Since it can adaptively fuse the representation information of local topological space and global dynamic space, a more comprehensive spatial dependency can be captured. In the temporal dimension, a Stacked Bidirectional Unidirectional Gated Recurrent Unit (SBUGRU) network is developed, which captures long‐term temporal dependencies through both forward and backward computations and superimposed recurrent layers. On this basis, DSTN is developed in an encoder‐decoder framework and periodicity is flexibly modeled by embedding branches. The performance of DSTN is validated on two types of real‐world traffic flow datasets, and it improves over baselines.
- Published
- 2024
48. Cantonese sentence dataset for lip‐reading
- Author
-
Yewei Xiao, Xuanming Liu, Lianwei Teng, Aosu Zhu, Picheng Tian, and Jian Huang
- Subjects
computer vision, image processing, image recognition, neural nets, pattern recognition, Photography, TR1-1050, Computer software, QA76.75-76.765
- Abstract
Lip‐reading deciphers speech by observing lip movements without relying on audio data. The rapid advancements in deep learning have significantly improved lip‐reading for both English and Chinese; however, research on dialects such as Cantonese remains scarce. Consequently, most Chinese lip‐reading datasets focus on Mandarin, with only a few addressing Cantonese. To bridge this gap, a sentence‐level Cantonese lip‐reading dataset, designated as Cantonese lip‐reading sentences, is introduced, comprising over 500 unique speakers and more than 30,000 samples. To ensure alignment with real‐world scenarios, no restrictions are imposed on factors such as gender, age, posture, lighting conditions, or speech rate. A comprehensive description of the pipeline employed for collecting and constructing the dataset is provided, and an innovative visual frontend, 3D‐visual attention net, is introduced. This frontend combines the advantages of convolution and self‐attention mechanisms to extract fine‐grained lip region features. These features are subsequently input into the conformer backend for temporal sequence modelling, achieving comparable performance on the Chinese Mandarin lip reading dataset, lip reading sentences 2, lip reading sentences 3, and Cantonese lip‐reading sentences datasets. Benchmark tests on Cantonese lip‐reading sentences demonstrate the challenges it poses, providing a novel research foundation for dialect lip‐reading and fostering the advancement of Cantonese lip‐reading tasks.
- Published
- 2024
49. Hybrid brain tumor classification of histopathology hyperspectral images by linear unmixing and an ensemble of deep neural networks
- Author
-
Inés A. Cruz‐Guerrero, Daniel Ulises Campos‐Delgado, Aldo R. Mejía‐Rodríguez, Raquel Leon, Samuel Ortega, Himar Fabelo, Rafael Camacho, Maria de la Luz Plaza, and Gustavo Callico
- Subjects
biomedical optical imaging, image classification, learning (artificial intelligence), medical image processing, neural nets, Medical technology, R855-855.5
- Abstract
Hyperspectral imaging (HSI) has demonstrated its potential to provide correlated spatial and spectral information of a sample by a non‐contact and non‐invasive technology. In the medical field, especially in histopathology, HSI has been applied for the classification and identification of diseased tissue and for the characterization of its morphological properties. In this work, we propose a hybrid scheme to classify non‐tumor and tumor histological brain samples by hyperspectral imaging. The proposed approach is based on the identification of characteristic components in a hyperspectral image by linear unmixing, as a feature engineering step, and the subsequent classification by a deep learning approach. For this last step, an ensemble of deep neural networks is evaluated by a cross‐validation scheme on an augmented dataset and a transfer learning scheme. The proposed method can classify histological brain samples with an average accuracy of 88%, with reduced variability, computational cost, and inference times, which presents an advantage over state‐of‐the‐art methods. Hence, the work demonstrates the potential of hybrid classification methodologies to achieve robust and reliable results by combining linear unmixing for feature extraction and deep learning for classification.
- Published
- 2024
50. Physics‐informed surrogates for electromagnetic dynamics using Transformers and graph neural networks
- Author
-
O. Noakoasteen, C. Christodoulou, Z. Peng, and S. K. Goudos
- Subjects
electromagnetic wave propagation, finite difference time‐domain analysis, neural nets, Telecommunication, TK5101-6720, Electricity and magnetism, QC501-766
- Abstract
A novel use case for two data‐driven models, namely, a Transformer and a convolutional graph neural network (CGNN), is proposed. The authors propose to use these models for emulating the dynamics of electromagnetic (EM) propagation and scattering. The Transformer translates a past sequence into a future sequence by constructing representations from the past and using them to predict the future, taking all of its own previous predictions as input at each step of prediction. The CGNN updates the current state of the attribute vector of each node by passing it information (messages) from all of its neighbouring nodes. The authors train these models with FDTD simulations of plane waves propagating and scattering from PEC objects. The authors demonstrate that, within the bounds of computational resources, the Transformer can be utilised as a surrogate for EM dynamics, providing 14× speed‐up, while the CGNN can be utilised as a next‐frame predictor, providing 9× speed‐up. When comparing the accuracy of these two models with the authors’ previously developed Encoder‐Recurrent‐Decoder (ERD) model, it is observed that the error for both the Transformer and the CGNN remains within the same bound as for the ERD model. To the best of the authors’ knowledge, this work is the first to utilise the Transformer as a surrogate for EM dynamics.
- Published
- 2024